Data Partitioning: Creating MySQL Partitioned Tables and Analyzing Their Performance

For how to install MySQL, see:

http://blog.csdn.net/jhq0113/article/details/43812895

For an introduction to MySQL partitioned tables, see:

http://blog.csdn.net/jhq0113/article/details/44592865


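1. Check whether your MySQL supports partitioning

    mysql> SHOW VARIABLES LIKE '%partition%';

If the result looks like the following, your MySQL supports table partitioning:

    +-----------------------+-------+
    | Variable_name         | Value |
    +-----------------------+-------+
    | have_partition_engine | YES   |
    +-----------------------+-------+
    1 row in set (0.00 sec)

This variable exists only on older servers. It is named have_partitioning in later 5.1 releases and in 5.5, and it was removed in MySQL 5.6; on 5.6 and 5.7, check that SHOW PLUGINS lists the partition plugin as ACTIVE instead. In MySQL 8.0, partitioning support is provided by the storage engine itself (InnoDB or NDB).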
           
    
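Creating a RANGE-partitioned table:

    DROP TABLE IF EXISTS `my_orders`;
    CREATE TABLE `my_orders` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
      `pid` int(10) unsigned NOT NULL COMMENT 'product ID',
      `price` decimal(15,2) NOT NULL COMMENT 'unit price',
      `num` int(11) NOT NULL COMMENT 'quantity purchased',
      `uid` int(10) unsigned NOT NULL COMMENT 'customer ID',
      `atime` datetime NOT NULL COMMENT 'order time',
      `utime` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'modification time',
      `isdel` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'soft-delete flag',
      PRIMARY KEY (`id`,`atime`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8

    /********* partition definition **************/
    PARTITION BY RANGE (YEAR(atime))
    (
       PARTITION p0 VALUES LESS THAN (2016),
       PARTITION p1 VALUES LESS THAN (2017),
       PARTITION p2 VALUES LESS THAN MAXVALUE
    );

The above is a simple orders table. The partitioning column is atime and the partitioning type is RANGE, so when you insert a row MySQL stores it in the partition selected by the value of YEAR(atime). The primary key is (id, atime) rather than id alone because MySQL requires every unique key of a partitioned table to include all columns used in the partitioning expression.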
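
The routing rule described above can be modeled outside the database. The following is a minimal Python sketch, not MySQL's actual implementation: the function name `range_partition` and the use of `None` to stand for MAXVALUE are assumptions made for illustration. It picks the first partition whose upper bound is greater than YEAR(atime), which is how VALUES LESS THAN behaves.

```python
from datetime import datetime

# Upper bounds copied from the CREATE TABLE statement above.
# None is a stand-in for MAXVALUE (an assumption of this sketch).
RANGE_PARTITIONS = [("p0", 2016), ("p1", 2017), ("p2", None)]


def range_partition(atime: datetime) -> str:
    """Return the first partition whose upper bound exceeds YEAR(atime)."""
    year = atime.year
    for name, upper in RANGE_PARTITIONS:
        if upper is None or year < upper:
            return name
    # Unreachable while a MAXVALUE partition exists; kept for safety.
    raise ValueError(f"no partition for year {year}")


if __name__ == "__main__":
    for text in ("2015-05-01 00:00:00", "2016-05-01 00:00:00", "2018-05-01 00:00:00"):
        row_time = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        print(text, "->", range_partition(row_time))
```

Because p2 is bounded by MAXVALUE, every year from 2017 onward lands in p2; without such a catch-all partition, MySQL rejects rows whose value is above the last bound.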


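To check that the partitions were created, run:

    EXPLAIN PARTITIONS SELECT * FROM `my_orders`

If it succeeded, the partitions column of the output lists p0, p1 and p2. (On MySQL 5.7 and later, plain EXPLAIN shows the same column; the PARTITIONS keyword was removed in 8.0.)
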

Performance analysis:

1). Create a table with the same structure but without partitioning

    DROP TABLE IF EXISTS `my_order`;
    CREATE TABLE `my_order` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
      `pid` int(10) unsigned NOT NULL COMMENT 'product ID',
      `price` decimal(15,2) NOT NULL COMMENT 'unit price',
      `num` int(11) NOT NULL COMMENT 'quantity purchased',
      `uid` int(10) unsigned NOT NULL COMMENT 'customer ID',
      `atime` datetime NOT NULL COMMENT 'order time',
      `utime` int(10) unsigned NOT NULL DEFAULT 0 COMMENT 'modification time',
      `isdel` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'soft-delete flag',
      PRIMARY KEY (`id`,`atime`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

2). Insert the same data into both tables

    /************************** insert into the partitioned table ****************************/
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,CURRENT_TIMESTAMP());
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2016-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2017-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2018-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2015-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2016-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2017-05-01 00:00:00');
    INSERT INTO my_orders(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2018-05-01 00:00:00');
    
    /************************** insert into the non-partitioned table ****************************/
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,CURRENT_TIMESTAMP());
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2016-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2017-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89757,'2018-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2015-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2016-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2017-05-01 00:00:00');
    INSERT INTO my_order(`pid`,`price`,`num`,`uid`,`atime`) VALUES(1,12.23,1,89756,'2018-05-01 00:00:00');

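3). Grow both tables by copying each table into itself (INSERT ... SELECT) until each holds roughly 200,000 rows. Each execution doubles the row count, so the statements have to be run repeatedly. Data duplicated this way differs from a real workload, but it is enough to show how partitioning affects query performance.

    /***** copy each table's rows back into itself; rerun to keep doubling the row count *****/
    INSERT INTO `my_orders`(`pid`,`price`,`num`,`uid`,`atime`) SELECT `pid`,`price`,`num`,`uid`,`atime` FROM `my_orders`;
    INSERT INTO `my_order`(`pid`,`price`,`num`,`uid`,`atime`) SELECT `pid`,`price`,`num`,`uid`,`atime` FROM `my_order`;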
         

4). Query test

    /*************************** query performance comparison **************************************/
    SELECT * FROM `my_orders` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();
    /**** took 0.084s ****/
    
    SELECT * FROM `my_order` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();
    /**** took 0.284s ****/

These two queries show clearly that the partitioned table performs better: the same query takes 0.084 s on my_orders versus 0.284 s on my_order.

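Analyzing the query execution:

    EXPLAIN PARTITIONS SELECT * FROM `my_orders` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();

    EXPLAIN PARTITIONS SELECT * FROM `my_order` WHERE `uid`=89757 AND `atime`< CURRENT_TIMESTAMP();

The results show that the query on my_orders goes straight to partition p0 and scans only 49386 rows, whereas my_order, which is not partitioned, scans 196983 rows. Skipping the partitions that cannot contain matching rows (partition pruning) is the key to the performance gain.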
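
Which partitions a query such as `atime < cutoff` has to read depends on where the cutoff falls. The sketch below is a simplified Python model of that pruning decision, not the optimizer's real algorithm; the function name `partitions_to_scan` is hypothetical, and the model compares only the year of the cutoff, so it is conservative at year boundaries. It assumes, consistent with the figures quoted above (49386 is about a quarter of 196983), that the measurement was taken while CURRENT_TIMESTAMP was still before 2016.

```python
from datetime import datetime
from typing import List

# Upper bounds from the my_orders definition; None stands in for MAXVALUE.
RANGE_PARTITIONS = [("p0", 2016), ("p1", 2017), ("p2", None)]


def partitions_to_scan(cutoff: datetime) -> List[str]:
    """Partitions that may hold rows with atime < cutoff (year-level model)."""
    needed = []
    lower = None  # lowest year the current partition can contain
    for name, upper in RANGE_PARTITIONS:
        # A partition is relevant if its lowest year is not after the cutoff year.
        if lower is None or lower <= cutoff.year:
            needed.append(name)
        if upper is None:
            break
        lower = upper
    return needed


if __name__ == "__main__":
    print(partitions_to_scan(datetime(2015, 6, 1)))  # cutoff before 2016
    print(partitions_to_scan(datetime(2016, 6, 1)))  # cutoff during 2016
    print(partitions_to_scan(datetime(2020, 6, 1)))  # cutoff after 2017
```

Under this model, the same query run with a later cutoff touches more partitions, so the benefit of pruning shrinks as the predicate covers more of the partitioned range.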


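Of course, more partitions is not always better: when a table has too many partitions, locating the right partition becomes a performance bottleneck in its own right. Keeping the number of partitions under about 200 is recommended.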


Creating a LIST-partitioned table:

    /***************** create the partitioned table *********************/
    CREATE TABLE `products` (
    `id`  bigint UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key' ,
    `name`  varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT 'product name' ,
    `metrial`  tinyint UNSIGNED NOT NULL COMMENT 'material' ,
    `weight`  double UNSIGNED NOT NULL DEFAULT 0 COMMENT 'weight' ,
    `vol`  double UNSIGNED NOT NULL DEFAULT 0 COMMENT 'volume' ,
    `c_id`  tinyint UNSIGNED NOT NULL COMMENT 'supplier company ID' ,
    PRIMARY KEY (`id`,`c_id`)
    )ENGINE=InnoDB DEFAULT CHARSET=utf8
    
    /********* partition definition **************/
    PARTITION BY LIST(c_id)
    (
        PARTITION pA VALUES IN (1,3,11,13),
        PARTITION pB VALUES IN (2,4,12,14),
        PARTITION pC VALUES IN (5,7,15,17),
        PARTITION pD VALUES IN (6,8,16,18),
        PARTITION pE VALUES IN (9,10,19,20)
    );

As you can see, LIST partitioning is very similar to RANGE partitioning: instead of a range of values, each partition is given an explicit list of values. Because the behavior is so close to RANGE, no separate performance analysis is given here.
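
To make the difference from RANGE concrete, here is a small Python sketch of LIST routing under the value lists defined above. The names `LIST_PARTITIONS` and `list_partition` are hypothetical. One behavior worth modeling is that a value appearing in no list has nowhere to go: MySQL rejects such an insert with an error, which the sketch mirrors by raising `ValueError`.

```python
# Value lists copied from the products table definition above.
LIST_PARTITIONS = {
    "pA": (1, 3, 11, 13),
    "pB": (2, 4, 12, 14),
    "pC": (5, 7, 15, 17),
    "pD": (6, 8, 16, 18),
    "pE": (9, 10, 19, 20),
}

# Invert the mapping once so each lookup is a single dictionary access.
PARTITION_OF = {
    c_id: name for name, c_ids in LIST_PARTITIONS.items() for c_id in c_ids
}


def list_partition(c_id: int) -> str:
    """Return the partition whose VALUES IN list contains c_id."""
    try:
        return PARTITION_OF[c_id]
    except KeyError:
        # Mirrors MySQL refusing a row whose value is in no partition's list.
        raise ValueError(f"table has no partition for value {c_id}") from None


if __name__ == "__main__":
    for supplier in (1, 10, 16):
        print(supplier, "->", list_partition(supplier))
```

In practice this means the lists must cover every c_id the application can produce, or new suppliers must be added to a partition before their products are inserted.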


Creating a HASH-partitioned table:

    /***************** partitioned table *****************/
    CREATE TABLE `msgs` (
      `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
      `sender` int(10) unsigned NOT NULL COMMENT 'sender ID',
      `reciver` int(10) unsigned NOT NULL COMMENT 'receiver ID',
      `msg_type` tinyint(3) unsigned NOT NULL COMMENT 'message type',
      `msg` varchar(225) NOT NULL COMMENT 'message body',
      `atime` int(10) unsigned NOT NULL COMMENT 'send time',
      `sub_id` tinyint(3) unsigned NOT NULL COMMENT 'department ID',
      PRIMARY KEY (`id`,`sub_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    /********* partition definition **************/
    PARTITION BY HASH(sub_id)
    PARTITIONS 10;

The statement above partitions the msgs table by HASH on sub_id into ten partitions.
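
For an integer expression, MySQL's HASH partitioning places a row in partition number MOD(expr, number_of_partitions). The short Python sketch below illustrates this for the msgs table; the function name `hash_partition` is hypothetical, and the `p0` to `p9` names assume MySQL's default naming for partitions that are not named explicitly.

```python
NUM_PARTITIONS = 10  # from "PARTITIONS 10" in the msgs definition


def hash_partition(sub_id: int) -> str:
    """Return the partition for a row: partition number is sub_id mod 10."""
    if sub_id < 0:
        # sub_id is an unsigned column, so negative values cannot occur.
        raise ValueError("sub_id must be non-negative")
    return f"p{sub_id % NUM_PARTITIONS}"


if __name__ == "__main__":
    for department in (7, 10, 23):
        print(department, "->", hash_partition(department))
```

Since the assignment depends only on the remainder, rows spread evenly only if the sub_id values themselves are evenly spread; a table where most messages belong to one department would still end up with one oversized partition.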


KEY partitioning is very similar to HASH partitioning and is not covered here; see the official MySQL documentation for details.


Creating subpartitions:

    CREATE TABLE `msgss` (
      `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
      `sender` int(10) unsigned NOT NULL COMMENT 'sender ID',
      `reciver` int(10) unsigned NOT NULL COMMENT 'receiver ID',
      `msg_type` tinyint(3) unsigned NOT NULL COMMENT 'message type',
      `msg` varchar(225) NOT NULL COMMENT 'message body',
      `atime` int(10) unsigned NOT NULL COMMENT 'send time',
      `sub_id` tinyint(3) unsigned NOT NULL COMMENT 'department ID',
      PRIMARY KEY (`id`,`atime`,`sub_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    /********* partition definition **************/
    PARTITION BY RANGE (atime) SUBPARTITION BY HASH (sub_id)
    (
    		PARTITION t0 VALUES LESS THAN(1451577600)
    		(
    			SUBPARTITION s0,
    			SUBPARTITION s1,
    			SUBPARTITION s2,
    			SUBPARTITION s3,
    			SUBPARTITION s4,
    			SUBPARTITION s5
    		),
    		PARTITION t1 VALUES LESS THAN(1483200000)
    		(
    			SUBPARTITION s6,
    			SUBPARTITION s7,
    			SUBPARTITION s8,
    			SUBPARTITION s9,
    			SUBPARTITION s10,
    			SUBPARTITION s11
    		),
    		PARTITION t2 VALUES LESS THAN MAXVALUE
    		(
    			SUBPARTITION s12,
    			SUBPARTITION s13,
    			SUBPARTITION s14,
    			SUBPARTITION s15,
    			SUBPARTITION s16,
    			SUBPARTITION s17
    		)
    );

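To check that the subpartitions were created:

    EXPLAIN PARTITIONS SELECT * FROM msgss;

If it succeeded, the partitions column of the output lists the subpartitions of t0, t1 and t2.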
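
With subpartitioning, a row is routed in two steps: the RANGE bound on atime selects the partition, then the HASH of sub_id selects one of that partition's six subpartitions. The following Python sketch models this for the msgss table. The function name `subpartition` and the use of `None` for MAXVALUE are assumptions of the sketch. The two bounds, 1451577600 and 1483200000, are Unix timestamps corresponding to 2016-01-01 00:00:00 and 2017-01-01 00:00:00 in UTC+8.

```python
from typing import Tuple

SUBPARTITIONS_PER_PARTITION = 6

# Range bounds from the msgss definition; None stands in for MAXVALUE.
RANGE_BOUNDS = [("t0", 1451577600), ("t1", 1483200000), ("t2", None)]


def subpartition(atime: int, sub_id: int) -> Tuple[str, str]:
    """Return (partition, subpartition) for a row of msgss."""
    for index, (name, upper) in enumerate(RANGE_BOUNDS):
        if upper is None or atime < upper:
            # Step 2: HASH on sub_id picks one of the partition's subpartitions.
            slot = sub_id % SUBPARTITIONS_PER_PARTITION
            # Subpartitions are named s0..s17 in definition order, six per partition.
            return name, f"s{index * SUBPARTITIONS_PER_PARTITION + slot}"
    raise ValueError(f"no partition for atime {atime}")


if __name__ == "__main__":
    print(subpartition(1451577599, 7))  # one second before the first bound
    print(subpartition(1451577600, 7))  # exactly on the first bound
    print(subpartition(1500000000, 5))  # after the second bound
```

A value equal to a bound belongs to the next partition, because VALUES LESS THAN is a strict comparison.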

              

            


Original article: https://www.cnblogs.com/duyinqiang/p/5696362.html