在现在的mysql cluster 7.2测试版本中,对join的功能进行了重大改进,可以说是一个里程碑的版本,由于前段时间做测试发现mysql cluster 7.1.10对join的性能很不好,另外如果jion的表比较大的时候会导致节点之间的数据传输非常大。
我们知道在7.2之前的版本都是将jion在mysqld节点执行,这样就需要从数据节点进行交互传输数据,造成join的性能非常差。现在7.2版本将join放到了数据节点执行,并行处理,较少了数据在sql节点的交互,因此性能大幅提升。因此很想测试7.2版本中到底对jion进行了多大的改进,于是弄了几个虚拟机,384M内存就上路了,呵呵!由于使用的是虚拟机,测试结果可能会有点偏差,但是还是能说明对join的改进的。
下面就开始mysql cluster 7.2之旅吧!
下载mysql cluster 7.2.0
http://dev.mysql.com/downloads/cluster/#downloads
到上面的连接下载测试版:mysql-cluster-gpl-7.2.0-devmilestone-linux-x86_64-glibc23.tar.gz
关于mysql cluster 7.2.0的安装和配置,和mysql cluster 7.0以及7.1.10是一样的,请参考:
http://blog.chinaunix.net/space.php?uid=20639775&do=blog&id=201960
http://blog.chinaunix.net/space.php?uid=20639775&do=blog&id=154598
搭建好环境之后就可以开始测试了,测试数据请从这里下载:
adaptive_query_example_data.zip 创建表测试
mysql> CREATE DATABASE clusterdb; USE clusterdb;
Query OK, 1 row affected (0.20 sec)
Database changed
mysql> CREATE TABLE residents (id INT NOT NULL PRIMARY KEY, name VARCHAR(20),
-> postcode VARCHAR(20)) ENGINE=ndb;
Query OK, 0 rows affected (0.28 sec)
mysql> CREATE TABLE postcodes (postcode VARCHAR(20) NOT NULL PRIMARY KEY,
-> town VARCHAR(20)) ENGINE=ndb;
Query OK, 0 rows affected (0.23 sec)
mysql> CREATE TABLE towns (town VARCHAR(20) NOT NULL PRIMARY KEY,
-> county VARCHAR(20)) ENGINE=ndb;
Query OK, 0 rows affected (0.19 sec)
导入数据:
mysql> LOAD DATA LOCAL INFILE "/root/zhang/ndb_test/residents.csv" REPLACE INTO TABLE residents FIELDS TERMINATED BY ',' ENCLOSED BY '"';
Query OK, 100000 rows affected, 65535 warnings (14.94 sec)
Records: 100000 Deleted: 0 Skipped: 0 Warnings: 100000
mysql> LOAD DATA LOCAL INFILE "/root/zhang/ndb_test/postcodes.csv" REPLACE INTO TABLE postcodes FIELDS TERMINATED BY ',' ENCLOSED BY '"';
Query OK, 100000 rows affected, 65535 warnings (10.66 sec)
Records: 100000 Deleted: 0 Skipped: 0 Warnings: 100000
mysql> LOAD DATA LOCAL INFILE "/root/zhang/ndb_test/towns.csv" REPLACE INTO TABLE towns FIELDS TERMINATED BY ',' ENCLOSED BY '"';
Query OK, 100000 rows affected, 65535 warnings (10.84 sec)
Records: 100000 Deleted: 0 Skipped: 0 Warnings: 99999
测试
首先将ndb_join_pushdown参数关掉进行测试
mysql> set ndb_join_pushdown=off;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT COUNT(*) FROM residents,postcodes WHERE
-> residents.postcode=postcodes.postcode AND postcodes.town="MAIDENHEAD";
+----------+
| COUNT(*) |
+----------+
| 20000 |
+----------+
1 row in set (28.23 sec)
mysql>
mysql> SELECT COUNT(*) FROM residents,postcodes,towns WHERE
-> residents.postcode=postcodes.postcode AND
-> postcodes.town=towns.town AND towns.county="Berkshire";
+----------+
| COUNT(*) |
+----------+
| 40001 |
+----------+
1 row in set (50.94 sec)
打开ndb_join_pushdown参数进行测试
mysql> set ndb_join_pushdown=on;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT COUNT(*) FROM residents,postcodes WHERE
-> residents.postcode=postcodes.postcode AND postcodes.town="MAIDENHEAD";
+----------+
| COUNT(*) |
+----------+
| 20000 |
+----------+
1 row in set (2.39 sec)
mysql> SELECT COUNT(*) FROM residents,postcodes,towns WHERE
-> residents.postcode=postcodes.postcode AND postcodes.town=towns.town
-> AND towns.county="Berkshire";
+----------+
| COUNT(*) |
+----------+
| 40001 |
+----------+
1 row in set (3.84 sec)
因为我忘记了在mysqld节点关闭缓存,在相同的mysqld节点在进行一次ndb_join_pushdown参数关闭的实验,结果差不多!
mysql> set ndb_join_pushdown=off;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT COUNT(*) FROM residents,postcodes WHERE
-> residents.postcode=postcodes.postcode AND postcodes.town="MAIDENHEAD";
+----------+
| COUNT(*) |
+----------+
| 20000 |
+----------+
1 row in set (28.43 sec)
从测试结果可以看出,7.2对join确实提升非常大,我这里用虚拟机测试的性能提升都能达到15倍左右,回头有服务器了做下测试,肯定比这个要高一点!因为我这里的硬件和网络都不能达到测试的要求,这回大大降低ndb的性能!
希望mysql cluster 7.2 能快点出正式稳定版本!
本文参考:http://www.clusterdb.com/mysql/dramatically-increased-mysql-cluster-join-performance-with-adaptive-query-localization/