zoukankan      html  css  js  c++  java
  • Mycat实战之连续分片

    1 按照日期(天)分片:

     从开始日期算起,按照天数来分片 例如,从2017-11-01,每10天一个分片且可以指定结束日期
     注意事项:需要提前将分片规划好,建好,否则有可能日期超出实际配置分片数
    

    1.1 修改配置文件

     #修改rule.xml 添加按日期分片的的分配规则 
     vi rule.xml
    <function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate"> 
    		<property name="dateFormat">yyyy-MM-dd</property> <!--日期格式--> 
    		<property name="sBeginDate">2017-11-01</property> <!--开始日期--> 
    	   <property name="sPartionDay">10</property> <!--每分片天数--> 
    </function>
    
     <tableRule name="sharding-by-date"> 
    	 <rule> <columns>create_date</columns> 
    	 <algorithm>sharding-by-date</algorithm> 
    	</rule>
      </tableRule>
    
    
    #修改schema.xml 添加逻辑表   
    vi schema.xml 
    <table name="mycatbydate" primaryKey="ID" dataNode="dn1,dn2,dn3" rule="sharding-by-date"/>
    
    
    #重新加载配置文件
    [root@localhost conf]# mysql -h 192.168.2.130 -P9066 -utest -ptest
    Warning: Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor.  Commands end with ; or g.
    Your MySQL connection id is 20
    Server version: 5.5.8-mycat-1.5.1-RELEASE-20161130213509 MyCat Server (monitor)
    
    Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
    
    mysql> 
    mysql> 
    mysql> reload @@config;
    Query OK, 1 row affected (0.08 sec)
    Reload config success 
    

    1.2 创建表后插入数据并分析日志

    mysql> create table mycatbydate(id int not null auto_increment primary key,
           create_date datetime,datanode varchar(10));
       
    
    mysql> insert into mycatbydate(create_date,datanode) values('2017-06-01',database());
    

    分析mycat.log日志

    1.3 查询语句1:分片字段是等值运算,分析mycat.log

    mysql> select * from mycatbydate where create_date='2017-11-01';
    +----+---------------------+----------+
    | id | create_date         | datanode |
    +----+---------------------+----------+
    |  1 | 2017-11-01 00:00:00 | db1      |
    +----+---------------------+----------+
    1 row in set (0.03 sec)
    

    分析mycat.log日志

    1.4 查询语句2:分片字段范围查询,分析explain和mycat.log

    这里分别用 where 跟 betwwen and 的方式来测试一下范围查询

    where

    mysql> explain select * from mycatbydate 
    		   where create_date >='2017-11-01' and create_date <'2017-11-10';
    +-----------+-------------------------------------------------------------+
    | DATA_NODE | SQL                                                         |                                            
    +-----------+-------------------------------------------------------------+
    | dn1       | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                          
    |           |               AND create_date < '2017-11-10' LIMIT 100      |                                          
    | dn2       | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                          
    |           |               AND create_date < '2017-11-10' LIMIT 100      |                                          
    |           | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                                                                                     |
    | dn3       |               AND create_date < '2017-11-10' LIMIT 100      |                                           
    +-----------+-------------------------------------------------------------+
    3 rows in set (0.00 sec)
    

    分析mycat.log日志

    between and

    mysql> explain select * from mycatbydate where create_date 
                       BETWEEN '2017-11-01' AND '2017-11-10';
    +-----------+---------------------------------------------------+
    | DATA_NODE | SQL                                               |
    +-----------+---------------------------------------------------+
    | dn1       | SELECT * FROM mycatbydate WHERE create_date       |
    |           |  BETWEEN '2017-11-01' AND '2017-11-10' LIMIT 100  | 
    +-----------+---------------------------------------------------+
    1 row in set (0.00 sec)
    

    分析mycat.log日志

    总结:对于范围查询,where语句是没法优化,从而路由到相应的节点上,但是between and 可以优化,这个有咨询过 lead us,答复是目前代码做不到where优化。。。

    1.5 插入限制以及解决方案

    插入数据,如果超过30天数据提示ERROR 1064 (HY000): Index: 6, Size: 3

    由于配置3个dn,每10天一个分片,从7月1日数据就无法插入,这种配置就存在数据超过实际分配个数,底层dn和实际数据要规划好
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
    ERROR 1064 (HY000): Index: 6, Size: 3
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-11-10',database());
    Query OK, 1 row affected (0.01 sec)
    
    mysql> insert into mycatbydate(create_date,datanode) values('2017-11-20',database());
    Query OK, 1 row affected (0.00 sec)
    
    mysql> insert into mycatbydate(create_date,datanode) values('2017-11-30',database());
    Query OK, 1 row affected (0.00 sec)
    
    #查询数据分布:
    mysql> select * from mycatbydate order by datanode;
    +----+---------------------+----------+
    | id | create_date         | datanode |
    +----+---------------------+----------+
    |  1 | 2017-11-01 00:00:00 | db1      |
    |  2 | 2017-11-10 00:00:00 | db1      |
    |  2 | 2017-11-20 00:00:00 | db2      |
    |  2 | 2017-11-30 00:00:00 | db3      |
    +----+---------------------+----------+
    4 rows in set (0.08 sec)
    

    解决方案:指定分区结束时间,这样分区可以重复使用

    vi rule.xml
    <function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate"> 
            <property name="dateFormat">yyyy-MM-dd</property> <!--日期格式--> 
            <property name="sBeginDate">2017-06-01</property> <!--开始日期--> 
            <property name="sEndDate">2017-06-30</property> <!--结束日期-->
            <property name="sPartionDay">10</property> <!--每分片天数--> 
    </function>
    
    @@reload_config
    
    
    
    #重新加载配置文件后可以正常插入数据:
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
    ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
    mysql> insert into mycatbydate(id,create_date,datanode) values(5,'2017-12-01',database());
    Query OK, 1 row affected (0.18 sec)
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(6,'2018-01-01',database());
    Query OK, 1 row affected (0.03 sec)
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(7,'2018-02-01',database());
    Query OK, 1 row affected (0.03 sec)
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(8,'2018-03-01',database());
    Query OK, 1 row affected (0.04 sec)
    
    mysql> insert into mycatbydate(id,create_date,datanode) values(9,'2018-04-01',database());
    Query OK, 1 row affected (0.04 sec)
    
    mysql> 
    mysql> 
    mysql> select * from mycatbydate order by datanode;
    +----+---------------------+----------+
    | id | create_date         | datanode |
    +----+---------------------+----------+
    |  5 | 2017-12-01 00:00:00 | db1      |
    |  2 | 2017-11-10 00:00:00 | db1      |
    |  1 | 2017-11-01 00:00:00 | db1      |
    |  6 | 2018-01-01 00:00:00 | db1      |
    |  7 | 2018-02-01 00:00:00 | db1      |
    |  8 | 2018-03-01 00:00:00 | db1      |
    |  9 | 2018-04-01 00:00:00 | db1      |
    |  2 | 2017-11-20 00:00:00 | db2      |
    |  2 | 2017-11-30 00:00:00 | db3      |
    +----+---------------------+----------+
    9 rows in set (0.14 sec)
    

    2 按照自然月分片

    每个自然月一个分片,需要提前将分片数规划好,建好,否则有可能日期超出实际配置分片数
    

    2.1修改配置文件

    #新增分配规则
    vi  rule.xml   
    <function name="partbymonth"  class="org.opencloudb.route.function.PartitionByMonth">
    			<property name="dateFormat">yyyy-MM-dd</property>
    			<property name="sBeginDate">2017-11-01</property>
    </function>
    
    <tableRule name="sharding-by-month">
    		<rule>
    				<columns>create_date</columns>
    				<algorithm>partbymonth</algorithm>
    		</rule>
    </tableRule>
    
    #schema.xml增加逻辑表
    
    <table name="mycatbymonth" primaryKey="ID" dataNode="dn$1-12" rule="sharding-by-month" />
        <dataNode name="dn1" dataHost="192.168.124.55" database="db1" />
        <dataNode name="dn2" dataHost="192.168.124.55" database="db2" />
        <dataNode name="dn3" dataHost="192.168.124.55" database="db3" />
        <dataNode name="dn4" dataHost="192.168.124.55" database="db4" />
        <dataNode name="dn5" dataHost="192.168.124.55" database="db5" />
        <dataNode name="dn6" dataHost="192.168.124.55" database="db6" />
        <dataNode name="dn7" dataHost="192.168.124.55" database="db7" />
        <dataNode name="dn8" dataHost="192.168.124.55" database="db8" />
        <dataNode name="dn9" dataHost="192.168.124.55" database="db9" />
        <dataNode name="dn10" dataHost="192.168.124.55" database="db10" />
        <dataNode name="dn11" dataHost="192.168.124.55" database="db11" />
        <dataNode name="dn12" dataHost="192.168.124.55" database="db12" />
    

    2.2 mycat重新加载配置文件

    mysql> reload @@config;
    Query OK, 1 row affected (0.10 sec)
    Reload config success
    
    
    #查看逻辑表
    mysql> show tables;
    +------------------+
    | Tables in TESTDB |
    +------------------+
    | customer         |
    | customer_addr    |
    | mycatbydate      |
    | mycatbymonth     |
    | orders           |
    | order_items      |
    | travelrecord     |
    | t_vote           |
    +------------------+
    8 rows in set (0.00 sec)
    

    2.3 创建表后插入数据

    mysql>create table mycatbymonth(id int not null auto_increment primary key, create_date datetime,datanode varchar(10));
    Query OK, 0 rows affected (0.20 sec)
    	-> ;
    mysql> explain select * from mycatbymonth;
    +-----------+--------------------------------------+
    | DATA_NODE | SQL                                  |
    +-----------+--------------------------------------+
    | dn1       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn10      | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn11      | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn12      | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn2       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn3       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn4       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn5       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn6       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn7       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn8       | SELECT * FROM mycatbymonth LIMIT 100 |
    | dn9       | SELECT * FROM mycatbymonth LIMIT 100 |
    +-----------+--------------------------------------+
    12 rows in set (0.01 sec)
    
    
    mysql> insert into mycatbymonth(id,create_date,datanode) values(1,'2017-11-24',database());
    Query OK, 1 row affected (0.01 sec)
    

    查看mycat日志

    2.4 批量插入,验证插入分区,不能超过一年的数据。

    #批量插入数据:
    insert into mycatbymonth(id,create_date,datanode) values(1,'2017-01-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(2,'2017-01-24',database()); 
    insert into mycatbymonth(id,create_date,datanode) values(3,'2017-02-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(4,'2017-03-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(5,'2017-04-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(6,'2017-05-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(7,'2017-06-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(8,'2017-07-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(9,'2017-08-01',database());
    insert into mycatbymonth(id,create_date,datanode) values(12,'2017-12-01',database());
    
    
    
    #报错的插入        
    mysql> insert into mycatbymonth(id,create_date,datanode) values(13,'2019-01-01','db1');
    ERROR 1064 (HY000): Can't find a valid data node for specified 
    node index :MYCATBYMONTH -> CREATE_DATE -> 2019-01-01 -> Index : 24
    

    2.5 查询验证,查询分区字段等值 以及范围查询同 按照日期(天)分片的方式 不赘述!

    连续分片总结:

    1、针对时间范围查询,采用between可以直接路由到对应分片,从而避免查询所有dn,只查询特定分片(具体原因还没有分
      析),这个问题问过leaderUS,回复时,目前技术上还没法做到普通范围查询的优化。
    2、插入数据存在超过分片情况,针对月分区,插入数据不能超过一年,针对天的,也存在日期超过分区问题
  • 相关阅读:
    deeplearning.ai 卷积神经网络 Week 1 卷积神经网络
    deeplearning.ai 构建机器学习项目 Week 2 机器学习策略 II
    deeplearning.ai 构建机器学习项目 Week 1 机器学习策略 I
    deeplearning.ai 改善深层神经网络 week3 超参数调试、Batch Normalization和程序框架
    deeplearning.ai 改善深层神经网络 week2 优化算法
    deeplearning.ai 改善深层神经网络 week1 深度学习的实用层面
    cs231n spring 2017 lecture8 Deep Learning Networks
    cs231n spring 2017 lecture7 Training Neural Networks II
    cs231n spring 2017 lecture6 Training Neural Networks I
    cs231n spring 2017 Python/Numpy基础
  • 原文地址:https://www.cnblogs.com/chinesern/p/7800954.html
Copyright © 2011-2022 走看看