  • A First Look at MySQL Partitioning

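    When a table grows large, the usual answer is horizontal splitting, that is, sharding. MySQL also has built-in partitioning, which provides a degree of horizontal splitting without leaving a single server.
    MySQL also has the MERGE engine, which presents a set of identically structured MyISAM tables as one table. I find MERGE less practical than partitioning, because a query on a MERGE table goes to every underlying table, while a query on a partitioned table touches only the relevant partitions when the WHERE clause lets the optimizer prune the rest.
    I created two tables, one partitioned and one not. The partitioned table is partitioned by year.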
    SQL code
    -- Partitioned table: one RANGE partition per year of create_time
    CREATE TABLE `20130117date_par` (
      `content` varchar(20) NOT NULL,
      `create_time` datetime NOT NULL,
      KEY `20130117date_idx_date` (`create_time`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    PARTITION BY RANGE (YEAR(create_time))
    (PARTITION p2009 VALUES LESS THAN (2010),
     PARTITION p2010 VALUES LESS THAN (2011),
     PARTITION p2011 VALUES LESS THAN (2012),
     PARTITION p2012 VALUES LESS THAN (2013),
     PARTITION p2013 VALUES LESS THAN (2014));

    -- Unpartitioned table with the same columns and index
    CREATE TABLE `20130117date` (
      `content` varchar(20) NOT NULL,
      `create_time` datetime NOT NULL,
      KEY `20130117date_idx_date` (`create_time`)
    ) ENGINE=InnoDB;
     
    I used a stored procedure to insert 900,000 rows of random data into each of the two tables.
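
    The stored procedure itself is not shown in the original post. A minimal sketch of what such a loader could look like follows. The procedure name `fill_random_rows`, the variable names, and the choice of a uniform random timestamp between 2009-01-01 and 2013-12-31 are all assumptions made for illustration, not the author's code.

```sql
DELIMITER //

-- Hypothetical loader: inserts n identical random rows into both test tables.
CREATE PROCEDURE fill_random_rows(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  DECLARE v_content VARCHAR(20);
  DECLARE v_time DATETIME;

  -- One transaction for the whole load avoids a commit per row.
  START TRANSACTION;
  WHILE i < n DO
    -- 20 hex characters taken from an MD5 of a random number.
    SET v_content = LEFT(MD5(RAND()), 20);
    -- 157766400 = 1826 days * 86400 s, i.e. 2009-01-01 through 2013-12-31.
    SET v_time = DATE_ADD('2009-01-01 00:00:00',
                          INTERVAL FLOOR(RAND() * 157766400) SECOND);
    INSERT INTO `20130117date_par` (content, create_time)
      VALUES (v_content, v_time);
    INSERT INTO `20130117date` (content, create_time)
      VALUES (v_content, v_time);
    SET i = i + 1;
  END WHILE;
  COMMIT;
END //

DELIMITER ;

CALL fill_random_rows(900000);
```

    Writing the same row to both tables keeps the comparison fair, since both tables then hold identical data. The date range is kept inside the years covered by p2009 through p2013, because a row whose year has no matching partition would be rejected.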
    I then benchmarked both tables with mysqlslap.
     
    Without partitioning
    SQL code
    select SQL_NO_CACHE * from `20130117date`
    where create_time BETWEEN '2013-01-01' and '2013-01-02';
    select SQL_NO_CACHE * from `20130117date`
    where create_time BETWEEN '2012-12-25' and '2013-01-05';
     
    Output
     
    Benchmark 
            Average number of seconds to run all queries: 0.881 seconds 
            Minimum number of seconds to run all queries: 0.062 seconds 
            Maximum number of seconds to run all queries: 3.844 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
    Benchmark 
            Average number of seconds to run all queries: 0.703 seconds 
            Minimum number of seconds to run all queries: 0.062 seconds 
            Maximum number of seconds to run all queries: 1.922 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
    Benchmark 
            Average number of seconds to run all queries: 1.250 seconds 
            Minimum number of seconds to run all queries: 0.109 seconds 
            Maximum number of seconds to run all queries: 4.032 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
     
     
    With partitioning
    SQL code
    select SQL_NO_CACHE * from `20130117date_par`
    where create_time BETWEEN '2013-01-01' and '2013-01-02';
    select SQL_NO_CACHE * from `20130117date_par`
    where create_time BETWEEN '2012-12-25' and '2013-01-05';
     
    Output
     
    Benchmark 
            Average number of seconds to run all queries: 0.068 seconds 
            Minimum number of seconds to run all queries: 0.047 seconds 
            Maximum number of seconds to run all queries: 0.110 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
    Benchmark 
            Average number of seconds to run all queries: 0.250 seconds 
            Minimum number of seconds to run all queries: 0.031 seconds 
            Maximum number of seconds to run all queries: 1.078 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
    Benchmark 
            Average number of seconds to run all queries: 0.046 seconds 
            Minimum number of seconds to run all queries: 0.046 seconds 
            Maximum number of seconds to run all queries: 0.047 seconds 
            Number of clients running queries: 1 
            Average number of queries per client: 2 
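    The partitioned table is clearly faster in this test: the average time to run both queries fell from 0.70-1.25 seconds on the plain table to 0.05-0.25 seconds on the partitioned one.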
           
    Running
    SQL code
    explain PARTITIONS select * from `20130117date_par`
    where create_time BETWEEN '2012-01-01' and '2012-01-02';

    shows that this query scans only the p2012 partition.
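
    Two further checks can make the pruning easier to see. They are additions, not part of the original test. The first lists the partitions with their estimated row counts; `TABLE_ROWS` is only an approximation for InnoDB. The second repeats the EXPLAIN for the range that crosses a year boundary, where the `partitions` column would be expected to list both p2012 and p2013. Note that `EXPLAIN PARTITIONS` is deprecated in MySQL 5.7 and was removed in 8.0, where plain `EXPLAIN` shows the `partitions` column by default.

```sql
-- Estimated rows per partition (approximate for InnoDB).
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = '20130117date_par'
ORDER BY PARTITION_ORDINAL_POSITION;

-- A range spanning two years should touch two partitions.
-- On MySQL 8.0, drop the PARTITIONS keyword.
EXPLAIN PARTITIONS
SELECT * FROM `20130117date_par`
WHERE create_time BETWEEN '2012-12-25' AND '2013-01-05';
```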
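    Another advantage of a partitioned table is easier maintenance. For example, once the 2009 data is no longer needed, the partitioned table only requires
    SQL code
    alter table `20130117date_par` drop PARTITION p2009;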
     
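    This finished in under 1 second.
    On the plain table the equivalent is
    SQL code
    -- Half-open range: BETWEEN would also delete rows at exactly 2010-01-01 00:00:00
    delete from `20130117date`
    where create_time >= '2009-01-01' and create_time < '2010-01-01';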
     
    which took about 10.25 seconds.
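
    Maintenance also runs in the other direction. The table above has no partition for years after 2013, so inserting a row dated 2014 would fail with a "Table has no partition for value 2014" error. A minimal sketch of extending the range, assuming the partition name `p2014` to match the existing naming pattern:

```sql
-- Add next year's partition before any 2014 rows arrive.
ALTER TABLE `20130117date_par`
  ADD PARTITION (PARTITION p2014 VALUES LESS THAN (2015));
```

    With RANGE partitioning, new partitions can only be added at the upper end of the list, so this works best as a routine step ahead of each new year.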
  • Original article: https://www.cnblogs.com/DjangoBlog/p/3992349.html