  • Data Partitioning: Creating MySQL Partitioned Tables and Analyzing Their Performance

            For how to install MySQL, see:

            http://blog.csdn.net/jhq0113/article/details/43812895

            

            For an introduction to MySQL partitioned tables, see:

            http://blog.csdn.net/jhq0113/article/details/44592865

           

           1. Check whether your MySQL supports partitioning

            mysql> SHOW VARIABLES LIKE '%partition%';


           If the result looks like the following, your MySQL supports table partitioning:

           +-----------------------+-------+
           | Variable_name         | Value |
           +-----------------------+-------+
           | have_partition_engine | YES   |
           +-----------------------+-------+
           1 row in set (0.00 sec)
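           The variable name shown above comes from an early MySQL 5.1 build. To the best of my knowledge, later 5.1 and 5.5 releases report have_partitioning instead, and the variable was dropped in MySQL 5.6, so an empty result does not necessarily mean partitioning is unavailable. A minimal sketch of an alternative check, assuming a MySQL 5.6 or 5.7 server:

```sql
-- Assumption: MySQL 5.6/5.7, where partitioning is provided by a plugin named 'partition'.
-- A row with PLUGIN_STATUS = 'ACTIVE' indicates that partitioning is available.
SELECT PLUGIN_NAME, PLUGIN_VERSION, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME = 'partition';
```

           On MySQL 8.0 this query is expected to return no rows, because partitioning support moved into the storage engines and is always present for InnoDB tables.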
           
    
                   Creating a RANGE-partitioned table:
    DROP TABLE IF EXISTS `my_orders`;
    CREATE TABLE `my_orders` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
      `pid` int(10) unsigned NOT NULL COMMENT 'Product ID',
      `price` decimal(15,2) NOT NULL COMMENT 'Unit price',
      `num` int(11) NOT NULL COMMENT 'Quantity purchased',
      `uid` int(10) unsigned NOT NULL COMMENT 'Customer ID',
      `atime` datetime NOT NULL COMMENT 'Order time',
      `utime` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'Last modified time',
      `isdel` tinyint(4) NOT NULL DEFAULT 0 COMMENT 'Soft-delete flag',
      PRIMARY KEY (`id`,`atime`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    
    /*********Partition definition**************/
    PARTITION BY RANGE (YEAR(atime))
    (
       PARTITION p0 VALUES LESS THAN (2016),
       PARTITION p1 VALUES LESS THAN (2017),
       PARTITION p2 VALUES LESS THAN MAXVALUE
    );
            The above is a simple orders table. It is RANGE-partitioned on the atime column, so when you insert a row, MySQL stores it in the partition selected by the value of YEAR(atime). Note that atime is part of the primary key: every unique key on a partitioned table must include all columns used in the partitioning expression.
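           To make the routing rule concrete, the following sketch computes by hand which partition a few sample dates would land in. It is a standalone illustration: the sample dates and the column aliases are my own, and the CASE expression only mirrors the three VALUES LESS THAN bounds rather than querying MySQL's partition metadata.

```sql
-- Standalone illustration: does not read or modify my_orders.
-- Each VALUES LESS THAN bound is an exclusive upper limit on YEAR(atime).
SELECT atime,
       YEAR(atime) AS partition_key,
       CASE
           WHEN YEAR(atime) < 2016 THEN 'p0'
           WHEN YEAR(atime) < 2017 THEN 'p1'
           ELSE 'p2'
       END AS expected_partition
FROM (
    SELECT CAST('2015-12-31 23:59:59' AS DATETIME) AS atime
    UNION ALL SELECT CAST('2016-01-01 00:00:00' AS DATETIME)
    UNION ALL SELECT CAST('2018-05-01 00:00:00' AS DATETIME)
) AS samples;
```

           The three sample rows map to p0, p1 and p2 respectively, which shows that a row dated exactly 2016-01-01 already belongs to p1.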


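            To check that the partitions were created, run:

             EXPLAIN PARTITIONS SELECT * FROM `my_orders`

             If they were, the partitions column of the result lists all three partitions (p0, p1, p2).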
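           The partition layout can also be read from the server's metadata. The sketch below assumes the table lives in the currently selected database; note that TABLE_ROWS is only an estimate for InnoDB tables.

```sql
-- One row per partition of my_orders, in definition order.
SELECT PARTITION_NAME,
       PARTITION_METHOD,
       PARTITION_EXPRESSION,
       PARTITION_DESCRIPTION,   -- the VALUES LESS THAN bound
       TABLE_ROWS               -- approximate row count for InnoDB
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_orders'
ORDER BY PARTITION_ORDINAL_POSITION;
```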

            

             

            Performance analysis:

            1). Create a table with the same structure but without partitioning

    DROP TABLE IF EXISTS `my_order`;
    CREATE TABLE `my_order` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
      `pid` int(10) unsigned NOT NULL COMMENT 'Product ID',
      `price` decimal(15,2) NOT NULL COMMENT 'Unit price',
      `num` int(11) NOT NULL COMMENT 'Quantity purchased',
      `uid` int(10) unsigned NOT NULL COMMENT 'Customer ID',
      `atime` datetime NOT NULL COMMENT 'Order time',
      `utime` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'Last modified time',
      `isdel` tinyint(4) NOT NULL DEFAULT 0 COMMENT 'Soft-delete flag',
      PRIMARY KEY (`id`,`atime`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

            2). Insert the same data into both tables

          

    /**************************Insert into the partitioned table****************************/
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,CURRENT_TIMESTAMP());
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2016-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2017-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2018-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2015-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2016-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2017-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2018-05-01 00:00:00');
    
    /**************************Insert into the non-partitioned table****************************/
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,CURRENT_TIMESTAMP());
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2016-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2017-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2018-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2015-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2016-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2017-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2018-05-01 00:00:00');

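            3). Grow the data by copying each table into itself until it holds roughly 200,000 rows. Each execution doubles the row count, so run the statements repeatedly. (Self-copied data differs from a real workload, but it is enough to show how partitioning affects query performance.)

            

    /**********************************Self-copy to generate a large volume of data******************************/
    INSERT INTO `my_orders`(`pid`,`price`,`num`,`uid`,`atime`) SELECT `pid`,`price`,`num`,`uid`,`atime` FROM `my_orders`;
    INSERT INTO `my_order`(`pid`,`price`,`num`,`uid`,`atime`) SELECT `pid`,`price`,`num`,`uid`,`atime` FROM `my_order`;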
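           Before timing anything, it may be worth confirming that both tables really hold the same amount of data and seeing how the rows spread across years. A small sketch; the column aliases are my own:

```sql
-- Both totals should match if the copy statements were run the same number of times.
SELECT 'my_orders' AS source_table, COUNT(*) AS total_rows FROM my_orders
UNION ALL
SELECT 'my_order', COUNT(*) FROM my_order;

-- Row distribution by year, which is what decides the size of each partition.
SELECT YEAR(atime) AS order_year, COUNT(*) AS total_rows
FROM my_orders
GROUP BY YEAR(atime)
ORDER BY order_year;
```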
         

           4). Query test

    /***************************Query performance analysis**************************************/
    SELECT * FROM `my_orders` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();
    /****took 0.084 s****/
    
    SELECT * FROM `my_order` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();
    /****took 0.284 s****/

          The queries above show that the partitioned table is clearly faster: 0.084 s versus 0.284 s.

          Analyzing how each query is executed:

          EXPLAIN PARTITIONS SELECT * FROM `my_orders` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();

                


          EXPLAIN PARTITIONS SELECT * FROM `my_order` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();

                  

           

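           The results show that the query on my_orders was pruned to partition p0 and examined only 49,386 rows, while the query on my_order, which is not partitioned, examined 196,983 rows. This pruning is the key to the performance gain.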
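           Pruning depends on how the WHERE clause is written. The sketch below, with date bounds of my own choosing, contrasts a predicate that compares atime directly against constants with one that wraps the column in a function. I would expect the first to list fewer partitions than the second, but the exact partitions column depends on the server version, so check it rather than assume.

```sql
-- On MySQL 8.0, drop the PARTITIONS keyword: plain EXPLAIN already shows the partitions column.

-- Direct comparison on the partitioning column: eligible for partition pruning.
EXPLAIN PARTITIONS
SELECT * FROM my_orders
WHERE uid = 89757
  AND atime >= '2016-01-01 00:00:00'
  AND atime <= '2016-12-31 23:59:59';

-- Column wrapped in a function: the optimizer is not expected to prune here.
EXPLAIN PARTITIONS
SELECT * FROM my_orders
WHERE uid = 89757
  AND DATE_FORMAT(atime, '%Y') = '2016';
```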


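           Of course, more partitions is not always better: with too many partitions, locating the right partition becomes a performance bottleneck of its own. Keeping the count under 200 is recommended.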
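           One way to keep the partition count bounded on a yearly RANGE table is to split the catch-all partition when a new year needs its own partition and to drop the oldest one when its data is no longer needed. A hedged sketch against my_orders; the partition name p3 and the 2018 bound are illustrative, and DROP PARTITION permanently deletes the rows it contains.

```sql
-- Split the catch-all partition so that 2017 gets a partition of its own.
ALTER TABLE my_orders
REORGANIZE PARTITION p2 INTO (
    PARTITION p2 VALUES LESS THAN (2018),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

-- Retire the oldest partition. WARNING: this deletes every row stored in p0.
ALTER TABLE my_orders DROP PARTITION p0;

-- Check how many partitions the table has now.
SELECT COUNT(*) AS partition_count
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_orders';
```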


          Creating a LIST-partitioned table:

          

    /*****************Create the partitioned table*********************/
    CREATE TABLE `products` (
    `id`  bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Primary key' ,
    `name`  varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT 'Product name' ,
    `metrial`  tinyint UNSIGNED NOT NULL COMMENT 'Material' ,
    `weight`  double UNSIGNED NOT NULL DEFAULT 0 COMMENT 'Weight' ,
    `vol`  double UNSIGNED NOT NULL DEFAULT 0 COMMENT 'Volume' ,
    `c_id`  tinyint UNSIGNED NOT NULL COMMENT 'Supplier company ID' ,
    PRIMARY KEY (`id`,`c_id`)
    )ENGINE=InnoDB DEFAULT CHARSET=utf8
    
    /*********Partition definition**************/
    PARTITION BY LIST(c_id)
    (
        PARTITION pA VALUES IN (1,3,11,13),
        PARTITION pB VALUES IN (2,4,12,14),
        PARTITION pC VALUES IN (5,7,15,17),
        PARTITION pD VALUES IN (6,8,16,18),
        PARTITION pE VALUES IN (9,10,19,20)
    );

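           As you can see, LIST partitioning closely resembles RANGE partitioning, except that each partition is defined by an explicit set of values rather than a range. The performance characteristics are similar, so the analysis is not repeated here.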
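           One practical difference from RANGE with MAXVALUE is that a LIST table has no catch-all: a row whose c_id appears in no VALUES IN list is rejected. The sketch below shows one way to handle a new supplier; the partition name pF and the IDs 21 and 22 are made up for illustration.

```sql
-- c_id = 21 appears in no VALUES IN list, so inserting it into the table as
-- defined above would fail with an error like "Table has no partition for value 21".
-- Add a partition that covers the new IDs first:
ALTER TABLE products ADD PARTITION (PARTITION pF VALUES IN (21, 22));

-- Now the row has a partition to go to.
INSERT INTO products (`name`, `metrial`, `c_id`)
VALUES ('sample product', 1, 21);
```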


           Creating a HASH-partitioned table:

          

    /*****************Partitioned table*****************/
    CREATE TABLE `msgs` (
      `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
      `sender` int(10) unsigned NOT NULL COMMENT 'Sender ID',
      `reciver` int(10) unsigned NOT NULL COMMENT 'Receiver ID',
      `msg_type` tinyint(3) unsigned NOT NULL COMMENT 'Message type',
      `msg` varchar(225) NOT NULL COMMENT 'Message body',
      `atime` int(10) unsigned NOT NULL COMMENT 'Send time',
      `sub_id` tinyint(3) unsigned NOT NULL COMMENT 'Department ID',
      PRIMARY KEY (`id`,`sub_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    /*********Partition definition**************/
    PARTITION BY HASH(sub_id)
    PARTITIONS 10;

              The statement above partitions the msgs table by HASH on sub_id into ten partitions.
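           With plain (non-LINEAR) HASH partitioning, the target partition is the partitioning expression modulo the number of partitions, and partitions that are not named explicitly are called p0 through p9. The following standalone sketch reproduces that calculation for a few sample sub_id values of my own choosing:

```sql
-- Standalone illustration: does not read or modify msgs.
-- For PARTITION BY HASH(sub_id) PARTITIONS 10, the partition number is MOD(sub_id, 10).
SELECT sub_id,
       CONCAT('p', MOD(sub_id, 10)) AS expected_partition
FROM (
    SELECT 7 AS sub_id
    UNION ALL SELECT 10
    UNION ALL SELECT 23
) AS samples;
```

           The sample values 7, 10 and 23 map to p7, p0 and p3, so consecutive department IDs spread evenly across the ten partitions.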


           KEY partitioning closely resembles HASH partitioning and is not covered here; see the official MySQL documentation for details.
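           For orientation only, here is what a KEY-partitioned variant could look like. The table name msgs_by_key and its reduced column set are hypothetical. The main difference from HASH is that KEY takes a list of columns rather than an expression and uses a hashing function supplied by the server.

```sql
-- Hypothetical table, modeled loosely on msgs above.
CREATE TABLE msgs_by_key (
  id     BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
  sub_id TINYINT UNSIGNED NOT NULL,
  msg    VARCHAR(225)     NOT NULL,
  PRIMARY KEY (id, sub_id)           -- must include the partitioning column
) ENGINE=InnoDB DEFAULT CHARSET=utf8
PARTITION BY KEY (sub_id)
PARTITIONS 10;
```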


           Creating subpartitions:

          

    CREATE TABLE `msgss` (
      `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
      `sender` int(10) unsigned NOT NULL COMMENT 'Sender ID',
      `reciver` int(10) unsigned NOT NULL COMMENT 'Receiver ID',
      `msg_type` tinyint(3) unsigned NOT NULL COMMENT 'Message type',
      `msg` varchar(225) NOT NULL COMMENT 'Message body',
      `atime` int(10) unsigned NOT NULL COMMENT 'Send time',
      `sub_id` tinyint(3) unsigned NOT NULL COMMENT 'Department ID',
      PRIMARY KEY (`id`,`atime`,`sub_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    /*********Partition definition**************/
    PARTITION BY RANGE (atime) SUBPARTITION BY HASH (sub_id) 
    (
    		PARTITION t0 VALUES LESS THAN(1451577600)
    		(
    			SUBPARTITION s0,
    			SUBPARTITION s1,
    			SUBPARTITION s2,
    			SUBPARTITION s3,
    			SUBPARTITION s4,
    			SUBPARTITION s5
    		),
    		PARTITION t1 VALUES LESS THAN(1483200000)
    		(
    			SUBPARTITION s6,
    			SUBPARTITION s7,
    			SUBPARTITION s8,
    			SUBPARTITION s9,
    			SUBPARTITION s10,
    			SUBPARTITION s11
    		),
    		PARTITION t2 VALUES LESS THAN MAXVALUE
    		(
    			SUBPARTITION s12,
    			SUBPARTITION s13,
    			SUBPARTITION s14,
    			SUBPARTITION s15,
    			SUBPARTITION s16,
    			SUBPARTITION s17
    		)
    );
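           The RANGE bounds in this table are Unix timestamps. Assuming they were computed for the UTC+8 time zone (my reading, not stated in the original), they correspond to the start of 2016 and the start of 2017. A quick way to decode them:

```sql
-- Assumption: the bounds were chosen as midnight in UTC+8.
-- Note that SET time_zone changes the time zone for the current session.
SET time_zone = '+08:00';

SELECT FROM_UNIXTIME(1451577600) AS t0_upper_bound,   -- 2016-01-01 00:00:00
       FROM_UNIXTIME(1483200000) AS t1_upper_bound;   -- 2017-01-01 00:00:00
```

           So t0 holds messages sent before 2016, t1 holds messages from 2016, and t2 holds everything later, each split six ways by HASH(sub_id).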

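             To check that the subpartitions were created:

            EXPLAIN PARTITIONS SELECT * FROM msgss;

            If they were, the partitions column of the result lists all 18 subpartitions.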
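           The same information is available from the server's metadata, which returns one row per subpartition. A sketch, assuming msgss lives in the currently selected database:

```sql
-- Expect 18 rows: 3 partitions (t0, t1, t2) with 6 subpartitions each.
SELECT PARTITION_NAME,
       SUBPARTITION_NAME,
       SUBPARTITION_METHOD,
       PARTITION_DESCRIPTION,   -- the RANGE bound of the parent partition
       TABLE_ROWS               -- approximate row count for InnoDB
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'msgss'
ORDER BY PARTITION_ORDINAL_POSITION, SUBPARTITION_ORDINAL_POSITION;
```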

              

            


  • Original article: https://www.cnblogs.com/duyinqiang/p/5696362.html