一、业务场景分析
只有大表才需要分表,而且这个大表还会有经常需要读的需要,即使经过sql服务器优化和sql调优,查询也会非常慢。例如共享汽车的定位数据表等。
二、实现步骤
1.准备pom依赖
<dependency> <groupId>mysql</groupId> <artifactId>mysql-connector-java</artifactId> <version>5.1.30</version> </dependency> <dependency> <groupId>com.alibaba</groupId> <artifactId>druid</artifactId> <version>1.1.3</version> </dependency> <dependency> <groupId>org.springframework</groupId> <artifactId>spring-jdbc</artifactId> <version>4.3.11.RELEASE</version> </dependency> <dependency> <groupId>org.mybatis</groupId> <artifactId>mybatis-spring</artifactId> <version>1.2.1</version> </dependency> <dependency> <groupId>org.mybatis</groupId> <artifactId>mybatis</artifactId> <version>3.2.3</version> </dependency>
这里关键是要额外引入 插件shardbatis 相关的依赖,主要有两个:
<dependency> <groupId>org.shardbatis</groupId> <artifactId>shardbatis</artifactId> <version>2.0.0B</version> </dependency> <dependency> <groupId>net.sf.jsqlparser</groupId> <artifactId>jsqlparser</artifactId> <version>0.7.0</version> </dependency>
2.准备表
把原来的t_location单表拆分成t_location_01、t_location_02、t_location_03、t_location_04、t_location_05、t_location_06
3.准备好mybatis的mapper interface
public interface UserMapper { int deleteByPrimaryKey(Integer id); int insert(User record); int insertSelective(User record); User selectByPrimaryKey(Integer id); int updateByPrimaryKeySelective(User record); int updateByPrimaryKey(User record); }
对应的sql这里就省略了,shardbatis这个插件使用时也不需要去调整实际的sql,插件达到的效果就是替换掉实际sql中的表名。
4.新增一个shard_config.xml文件
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE shardingConfig PUBLIC "-//shardbatis.googlecode.com//DTD Shardbatis 2.0//EN" "http://shardbatis.googlecode.com/dtd/shardbatis-config.dtd"> <shardingConfig> <!-- parseList可选配置 如果配置了parseList,只有在parseList范围内并且不再ignoreList内的sql才会被解析和修改 --> <ignoreList> <value>xxx.xxx</value> </ignoreList> <parseList> <value>xxx.dao.UserMapper.insertSelective</value> <value>xxx.dao.UserMapper.selectByPrimaryKey</value> <value>xxx.UserMapper.updateByPrimaryKeySelective</value> </parseList> <!-- 配置分表策略 tableName指的是实际的表名,strategyClass对应的分表策略实现类 --> <strategy tableName="location" strategyClass="xxx.DeviceShardStrategyImpl"/> </shardingConfig>
并在项目的mybatis-config.xml里声明使用这个插件,比如
<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd"> <configuration> <plugins> <plugin interceptor="com.google.code.shardbatis.plugin.ShardPlugin"> <property name="shardingConfig" value="shard-config.xml"/> </plugin> </plugins> </configuration>
5.实现分表策略
就是完成上面strategyClass对应的分表策略实现类,其实只需要实现ShardStrategy接口并实现其中的getTargetTableName方法即可,比如:
public class DeviceShardStrategyImpl implements ShardStrategy { private final static int tableCount = 5; /** * 得到实际表名 * @param baseTableName 逻辑表名,一般是没有前缀或者是后缀的表名 * @param params mybatis执行某个statement时使用的参数 * @param mapperId mybatis配置的statement id * @return */ @Override public String getTargetTableName(String baseTableName, Object params, String mapperId) { // TODO: 需要根据实际的参数或其他(比如当前时间)计算出一个满足要求的值 int value = 2; try { int index = value % tableCount + 1; String strIndex = "0" + index; return baseTableName + "_" + strIndex; } catch (Exception e) { throw new RuntimeException(e.getMessage(), e); } } }
实际中实现需要根据实际的参数或其他(比如当前时间)计算出一个满足要求的值,最后拼接成实际的表名就可以了。当然了,这个【满足要求的值】有时要计算起来会特别麻烦。这里呢,说一说我自己在实际项目中计算value的一个设计和实现。
实际业务讲解
假设有1000辆被客户使用的共享汽车,假设每辆车每天跑4个小时,每3S一条定位数据,那样一天下来定位数据在60*60*4*1000/3=480w这个量级,实际存储的数据在350w~450w之间,这些数据都需要插入到数据库中。我们知道对于一般的数据库而言,单表达到百万甚至千万级别时,任何操作即使是select count(1)也会变得很慢,这时分表是必须的。
具体说一下分表的策略:假设我们要把原来的大表拆分成512张小表,以设备为维度进行水平拆分,每次对表执行插入时,找到对应的设备(设备表t_device,设备和车辆是一对一) 的 id,使用设备id%512作为表后缀。
举个例子,车辆的定位数据存储的表为t_location_000 ~ t_location_511(注意不是1~512), 设备A在t_device表里的id为513,那么A对应的定位数据存储表为:513%512=1 -> t_location_001, 设备B在t_device表里的id为1128,那么B对应的分时数据存储表为:1128%512=104 -> t_location_104。相信这个不难理解,接下来的问题就是如何从 public String getTargetTableName(String baseTableName, Object params, String mapperId) 这个方法里取出我们说的t_device A和B了,根据代码上的解释,我们可以知道A和B要从 Object params 里解析出来。
注意,接下来是重点!!!
为了尽可能通用,我们自定义一个注解,@DeviceShard
@Retention(RetentionPolicy.RUNTIME) @Target({ElementType.PARAMETER}) public @interface DeviceShard { String value() default ""; }
分表策略使用的参数要求必须使用这个,比如:
int insertSelective(@DeviceShard String instrumentId, @Param("timeTrend")TimeTrend record);
接下来增加一个根据Object params解析标识有@DeviceShard注解的参数的实际值,直接给出代码:
public class DeviceShardValue { private DeviceShardValue() {} public static Object[] getShardValue(Object params, String mapperId) throws Exception { if(params != null && StringUtils.isNotBlank(mapperId)) { int lastPoint = mapperId.lastIndexOf("."); String clazzName = mapperId.substring(0,lastPoint); String methodName = mapperId.substring(lastPoint + 1); Class clazz = Class.forName(clazzName); Method[] methods = clazz.getMethods(); if (methods.length <= 0){ throw new Exception("class has no method!"); } List<Integer> shardFieldIndexes = new ArrayList<>(); List<Class> shardFieldTypes = new ArrayList<>(); List<SymbolShard> fieldAnnotations = new ArrayList<>(); for (Method method : methods) { if (methodName.equals(method.getName())) { Annotation[][] annotations = method.getParameterAnnotations(); if(annotations == null || annotations.length <= 0){ throw new Exception("method has no shard field"); } for (int i = 0; i < annotations.length; i++){ Annotation[] fieldAnno = annotations[i]; if (fieldAnno != null && fieldAnno.length > 0) { for (Annotation annotation:fieldAnno) { if (annotation.annotationType() == SymbolShard.class){ shardFieldIndexes.add(i); shardFieldTypes.add(method.getParameterTypes()[i]); fieldAnnotations.add((SymbolShard)annotation); } } } } } } if (shardFieldIndexes.size() <= 0){ throw new Exception("method has no shard field"); } Object[] values = new Object[shardFieldIndexes.size()]; for (int i = 0; i < shardFieldIndexes.size(); i++){ int shardFieldIndex = shardFieldIndexes.get(i); Class shardFieldType = shardFieldTypes.get(i); DeviceShard fieldAnnotation = fieldAnnotations.get(i); if (params.getClass() == shardFieldType) { values[i] = getFieldValue(fieldAnnotation,params); } else { String key = "param" + (shardFieldIndex+1); HashMap<String,Object> map = (HashMap<String,Object>) params; Object tmp = map.get(key); values[i] = getFieldValue(fieldAnnotation, tmp); } } return values; } return null; } private static Object getFieldValue(JmxShard fieldAnnotation,Object params) throws Exception { if(isBasicType(params)) { return params; } else { String shardFieldName = fieldAnnotation.value(); if(StringUtils.isBlank(shardFieldName)) { throw new Exception("the shardFieldName was not annotated"); } Field field = null; try { field = params.getClass().getDeclaredField(shardFieldName); } catch (NoSuchFieldException e){ field = params.getClass().getSuperclass().getDeclaredField(shardFieldName); } field.setAccessible(true); return field.get(params); } } private static boolean isBasicType(Object param){ if (param == null){ return false; } if (param instanceof String){ return true; } if (param instanceof BigDecimal){ return true; } if (param instanceof Integer){ return true; } if (param instanceof Long){ return true; } if (param instanceof Double){ return true; } if (param instanceof Float){ return true; } if (param instanceof Character){ return true; } if (param instanceof Byte){ return true; } if (param instanceof Short){ return true; } if (param instanceof Boolean){ return true; } return false; } }
接下来去实现ShardStrategy就很容易了(个别细节忽略):
public class TimeTrendShardStrategyImpl implements ShardStrategy{ private final static Integer tableCount = 512; @Override public String getTargetTableName(String baseTableName, Object params,String mapperId) { Object value; try { // 调用封装的工具类获取传入标识有@SymbolShard注解的参数的值 value = DeviceShardValue.getShardValue(params, mapperId)[0]; // 连接数据库,去symbols表查询,注意这里不是使用自动注入@Autowired和@Resource,这种方式在权限课程里介绍过 String device = value.toString(); Device deviceInstance = SpringContextHolder.getBean(DeviceMapper.class).selectByDevice(device); if(symbolInstance == null) { // 如果查不到对应的symbol实例, 则返回一个普通表,否则这里会抛一个上层不知道的异常 return baseTableName + "_000"; } // 根据id拼装实际的分表后的表名 Integer index = deviceInstance.getId() % tableCount; String strIndex = ""; if(index < 10) { strIndex = "00" + index; } else if(index < 100) { strIndex = "0" + index; } else { strIndex = "" + index; } return baseTableName + "_" + strIndex; } catch (Exception e) { throw new RuntimeException(e.getMessage(), e); } } }
写到这里,再提醒一下,别忘了把分表的方法和分表策略写入shard_config.xml
6.适用范围
还要注意这个插件的适用范围,我自己在上面踩过坑,就是在做数据库批量操作时使用这个插件会没有效果。
具体支持哪些sql呢,网上有人给了总结,我直接引用一下:
select * from test_table1 select * from test_table1 where col_1='123' select * from test_table1 where col_1='123' and col_2=8 select * from test_table1 where col_1=? select col_1,max(col_2) from test_table1 where col_4='t1' group by col_1 select col_1,col_2,col_3 from test_table1 where col_4='t1' order by col_1 select col_1,col_2,col_3 from test_table1 where id in (?,?,?,?,?,?,?,?,?) limit ?,? select a.* from test_table1 a,test_table2 b where a.id=b.id and a.type='xxxx' select a.col_1,a.col_2,a.col_3 from test_table1 a where a.id in (select aid from test_table2 where col_1=1 and col_2=?) order by id desc select col_1,col_2 from test_table1 where type is not null and col_3 is null order by id select count(*),col_1 from test_table2 group by col_1 having count(*)>1 select a.col_1,a.col_2,b.col_1 from test_table1 a,t_table b where a.id=b.id insert into test_table1 (col_1,col_2,col_3,col_4) values (?,?,?,?) SELECT EMPLOYEEIDNO FROM test_table1 WHERE POSITION = 'Manager' AND SALARY > 60000 OR BENEFITS > 12000 SELECT EMPLOYEEIDNO FROM test_table1 WHERE POSITION = 'Manager' AND (SALARY > 50000 OR BENEFIT > 10000) SELECT EMPLOYEEIDNO FROM test_table1 WHERE LASTNAME LIKE 'L%' SELECT DISTINCT SELLERID, OWNERLASTNAME, OWNERFIRSTNAME FROM test_table1, test_table2 WHERE SELLERID = OWNERID ORDER BY OWNERLASTNAME, OWNERFIRSTNAME, OWNERID SELECT OWNERFIRSTNAME, OWNERLASTNAME FROM test_table1 WHERE EXISTS (SELECT * FROM test_table2 WHERE ITEM = ?) SELECT BUYERID, ITEM FROM test_table1 WHERE PRICE >= ALL (SELECT PRICE FROM test_table2) SELECT BUYERID FROM test_table1 UNION SELECT BUYERID FROM test_table2 SELECT OWNERID, 'is in both Orders & Antiques' FROM test_table1 a, test_table2 b WHERE a.OWNERID = b.BUYERID and a.type in (?,?,?) SELECT DISTINCT SELLERID, OWNERLASTNAME, OWNERFIRSTNAME FROM test_table1, noconvert_table WHERE SELLERID = OWNERID ORDER BY OWNERLASTNAME, OWNERFIRSTNAME, OWNERID SELECT a.* FROM test_table1 a, noconvert_table b WHERE a.SELLERID = b.OWNERID update test_table1 set col_1=123 ,col_2=?,col_3=? where col_4=? update test_table1 set col_1=?,col_2=col_2+1 where id in (?,?,?,?) delete from test_table2 where id in (?,?,?,?,?,?) and col_1 is not null INSERT INTO test_table1 VALUES (21, 01, 'Ottoman', ?,?) INSERT INTO test_table1 (BUYERID, SELLERID, ITEM) VALUES (01, 21, ?)
可能有些sql语句没有出现在测试用例里,但是相信基本上常用的查询sql shardbatis解析都没有问题,因为shardbatis对sql的解析是基于jsqlparser的
另外需要注意的是:
- 2.0版本中insert update delete 语句中的子查询语句中的表不支持sharding
- select语句中如果进行多表关联,请务必为每个表名加上别名 例如原始sql语句:SELECT a. FROM ANTIQUES a,ANTIQUEOWNERS b, mytable c where a.id=b.id and b.id=c.id 经过转换后的结果可能为:SELECT a. FROM ANTIQUES_0 AS a, ANTIQUEOWNERS_1 AS b, mytable_1 AS c WHERE a.id = b.id AND b.id = c.id