zoukankan      html  css  js  c++  java
  • 使用 gravity 做大表的分表操作

      gravity 是摩拜单车出票的一个 异构/同构 数据复制通道软件,提供主流软件的支持,并支持k8s云原生。比较看好它的发展。

      项目地址:郑州同济医院:http://news.39.net/ylzx/zztjyy/

      官方文档:郑州男妇科医院哪家好:http://news.39.net/ylzx/zztjyy/

      gravity的编译和部署不是这里的重点,我们直接跳过。

      gravity的部署:

      cd /root/

      git clone https://github.com/moiot/gravity.git

      cd gravity && make

      mkdir /usr/local/gravity/

      cd /usr/local/gravity/

      cp /root/gravity/bin/gravity /usr/local/gravity/

      配置文件这里先忽略,

      下面是我的架构图:

      

    image.png

      业务场景:

      一个老表,随着业务量增大,考虑到分表,按照 user_id 做hash取模拆分,然后业务层面去做数据CRUD操作。

      数据表如下:

      create database testdb;

      use testdb;

      CREATE TABLE `gravity_t1` (

      `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id',

      `user_id` int(10) unsigned NOT NULL DEFAULT '0' COMMENT '用户id',

      `s_status` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT '状态',

      PRIMARY KEY (`id`),

      KEY `idx_uid` (`user_id`) USING BTREE

      ) COMMENT = '测试表' ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;

      准备拆分后的4个分表:

      use testdb;

      create table t1_shard1 LIKE gravity_t1 ;

      create table t1_shard2 LIKE gravity_t1 ;

      create table t1_shard3 LIKE gravity_t1 ;

      create table t1_shard4 LIKE gravity_t1 ;

      测试数据库连接方式:

      数据库地址:192.168.2.4

      超级账号: dts

      密码: dts

      假设业务用的普通账号叫rd ,密码无所谓。

      造些测试用的数据:

      for i in {1..10000} ; do

      mysql -hdts -pdts -h 192.168.2.4 -e "insert into testdb.gravity_t1 (user_id,s_status) values ("$RANDOM",'0');"

      done

      结果大致这样:

      [test] > select count(*) from gravity_t1 ;

      +----------+

      | count(*) |

      +----------+

      | 10000 |

      +----------+

      1 row in set (0.007 sec)

      [testdb] > select (user_id%4) as hash_id,count(*) FROM gravity_t1 group by (user_id%4);

      +---------+----------+

      | hash_id | count(*) |

      +---------+----------+

      | 0| 2537 |

      | 1 | 2419 |

      | 2 | 2509 |

      | 3| 2535 |

      +---------+----------+

      4 rows in set (0.009 sec)

      shard1的配置文件,内容如下:

      cat config_shard1.toml

      # name 必填,这里保持每个配置文件的唯一性

      name = "shard1"

      # 内部用于保存位点、心跳等事项的库名,默认为 _gravity , 实测发现这里改了名字也没用,保持默认即可

      internal-db-name = "_gravity"

      #

      # Input 插件的定义,此处定义使用 mysql

      #

      [input]

      type = "mysql"

      mode = "replication"

      [input.config.source]

      host = "192.168.2.4"

      username = "dts"

      password = "dts"

      port = 3306

      #

      # Output 插件的定义,此处使用 mysql

      #

      [output]

      type = "mysql"

      [output.config.target]

      host = "192.168.2.4"

      username = "dts"

      password = "dts"

      port = 3306

      # 路由规则的定义

      [[output.config.routes]]

      match-schema = "testdb"

      match-table = "gravity_t1"

      target-schema = "testdb"

      target-table = "t1_shard1"

      # 这个target-table 代表的是需要写入到的分片名称,每个gravity实例的配置中都需要修改

      开4个窗口演示:

      cd /usr/local/gravity/

      ./bin/gravity -config config_shard1.toml -http-addr ":8083"

      ./bin/gravity -config config_shard2.toml -http-addr ":8184"

      ./bin/gravity -config config_shard3.toml -http-addr ":8185"

      ./bin/gravity -config config_shard4.toml -http-addr ":8186"

      TIPS:

      如果我们此时开了数据库的general_log的话, 能看到gravity到dest端是使用replace into方式插入全量数据的。然后再根据启动时候监听的binlog 实现增量数据的追平操作。

      然后,看下 gravity 自动生成的库,存放都是和数据复制相关的信息:

      [testdb] > show tables from _gravity ;

      +----------------------+

      | Tables_in__gravity |

      +----------------------+

      | gravity_heartbeat_v2 |

      | gravity_positions |

      +----------------------+

      2 rows in set (0.000 sec)

      [testdb] > select * from _gravity.gravity_heartbeat_v2;

      +--------+--------+----------------------------+----------------------------+

      | name | offset | update_time_at_gravity | update_time_at_source |

      +--------+--------+----------------------------+----------------------------+

      | shard1 | 57 | 2020-03-26 16:19:08.070483 | 2020-03-26 16:19:08.070589 |

      | shard2 | 51 | 2020-03-26 16:19:07.469721 | 2020-03-26 16:19:07.469811 |

      | shard3 | 50 | 2020-03-26 16:19:09.135751 | 2020-03-26 16:19:09.135843 |

      | shard4 | 48 | 2020-03-26 16:19:08.448371 | 2020-03-26 16:19:08.448450 |

      +--------+--------+----------------------------+----------------------------+

      4 rows in set (0.001 sec)

      [testdb] > select * from _gravity.gravity_positionsG

      *************************** 1. row ***************************

      name: shard1

      stage: stream

      position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28148767,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600359"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":12866955,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2559919"}}

      created_at: 2020-03-26 16:16:14

      updated_at: 2020-03-26 16:19:26

      *************************** 2. row ***************************

      name: shard2

      stage: stream

      position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28155813,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600366"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":16601348,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2569941"}}

      created_at: 2020-03-26 16:16:31

      updated_at: 2020-03-26 16:19:29

      *************************** 3. row ***************************

      name: shard3

      stage: stream

      position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28151964,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600363"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":20333055,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2579960"}}

      created_at: 2020-03-26 16:16:35

      updated_at: 2020-03-26 16:19:29

      *************************** 4. row ***************************

      name: shard4

      stage: stream

      position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28152473,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600364"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":24076960,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2589987"}}

      created_at: 2020-03-26 16:16:40

      updated_at: 2020-03-26 16:19:29

      4 rows in set (0.000 sec)

      TIPS:

      到这一步,我们的4个分表的数据同步都配好了,我们可以再插入几条数据测试下。

      -- insert into testdb.gravity_t1(user_id,s_status) values ('11111','0');

      -- insert into testdb.gravity_t1(user_id,s_status) values ('11112','0');

      -- 我这里演示就不插了

      原始和拆分表的数据条数对比:

      [testdb] > select (user_id%4) as hash_id,count(*) FROM gravity_t1 group by (user_id%4);

      +---------+----------+

      | hash_id | count(*) |

      +---------+----------+

      | 0 | 2537 |

      | 1 | 2419 |

      | 2 | 2509 |

      | 3 | 2535 |

      +---------+----------+

      4 rows in set (0.009 sec

      select count(*) FROM t1_shard1 where user_id%4=0;

      select count(*) FROM t1_shard2 where user_id%4=1;

      select count(*) FROM t1_shard3 where user_id%4=2;

      select count(*) FROM t1_shard4 where user_id%4=3;

      先做一次对分表中不需要的数据的删除操作,防止后期切换后删除数据量过大:

      delete from t1_shard1 where user_id %4!=0;

      delete from t1_shard2 where user_id %4!=1;

      delete from t1_shard3 where user_id %4!=2;

      delete from t1_shard4 where user_id %4!=3;

      ## 注意:生产环境大表的删除操作,建议使用pt-archiver进行

      然后,再到原始表和分表中查询对比下数据是否一致:

      select (user_id%4),count(*) as hash_id FROM gravity_t1 group by (user_id%4);

      select count(*) FROM t1_shard1 where user_id%4=0;

      select count(*) FROM t1_shard2 where user_id%4=1;

      select count(*) FROM t1_shard3 where user_id%4=2;

      select count(*) FROM t1_shard4 where user_id%4=3;

      然后,等低峰期进行操作。

      1、dba对涉及到的业务账号 对这个大表写权限回收掉

      revoke insert,update,delete on testdb.gravity_t1 from rd@'%';

      flush hosts;

      flush tables;

      2、通知业务方发版,切换数据库连接到4个新表

      3、切换完成后,dba再执行一次删除各个分表脏数据的操作,

      delete from t1_shard1 where user_id %4!=0;

      delete from t1_shard2 where user_id %4!=1;

      delete from t1_shard3 where user_id %4!=2;

      delete from t1_shard4 where user_id %4!=3;

      4、打开4个新表的写权限

      GRANT select,insert,update,delete on testdb.t1_shard1 TO rd@'%';

      GRANT select,insert,update,delete on testdb.t1_shard2 TO rd@'%';

      GRANT select,insert,update,delete on testdb.t1_shard3 TO rd@'%';

      GRANT select,insert,update,delete on testdb.t1_shard4 TO rd@'%';

      5、然后,通知业务方测试。

      6、业务方验证无问题后收工。至此,单表 拆分为分表的操作全部完成。

      7、回退方案,待补充 (打开gravity的双向复制??)

  • 相关阅读:
    poco之查看布局树
    airtest连接真机
    airtest之ADB调用
    airtest连接无线连接设备
    airtest之模拟器连接
    airtest之api、图片处理
    airtest介绍
    fidder保存请求和保存响应结果
    7 js之 条件语句
    IE6/IE7之浮动元素最后一个字母重复Bug
  • 原文地址:https://www.cnblogs.com/sushine1/p/12587559.html
Copyright © 2011-2022 走看看