  • [MySQL] Batch-updating data

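    Overview

    A search for how to batch-update rows in a MySQL table usually turns up four or five approaches. The one I use is:
    INSERT ... ON DUPLICATE KEY UPDATE Syntax
    See the official MySQL documentation: insert-on-duplicate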

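    What it does:

    • On insert, if the new row would duplicate an existing PRIMARY KEY or UNIQUE index value, MySQL does not insert it and updates the existing row instead.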

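    Notes:

    • This form of batch update is not standard SQL; it is a MySQL extension.
    • The update is triggered only by a collision on a UNIQUE index or on the PRIMARY KEY.
    • With an AUTO_INCREMENT primary key whose value is not supplied in the INSERT, each row gets a fresh id, so the primary key never collides and cannot trigger the update; the duplicate has to be detected through a UNIQUE index.
    • For batch updates, the VALUES() function is very useful.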

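    Performance:

    • On small tables it is fast.
    • On large tables performance is relatively poor; consider another approach.

    With the InnoDB engine, the following is an option (InnoDB supports transactions):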

    START TRANSACTION;
    UPDATE ...
    UPDATE ...
    UPDATE ...
    UPDATE ...
    COMMIT;
    

    https://dba.stackexchange.com/questions/28282/whats-the-most-efficient-way-to-batch-update-queries-in-mysql
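
    The transaction pattern above can also be sketched from application code. The example below is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, so that it runs without a MySQL server. The function name batch_update_host_used and the reduced three-column table are illustrative assumptions, not part of the original setup. With a MySQL driver the shape would be the same: many UPDATE statements, one commit.

```python
import sqlite3


def batch_update_host_used(conn, updates):
    """Run one UPDATE per tuple, all inside a single transaction.

    updates -- iterable of (host_used, pool_id, templete_id) tuples
    """
    # "with conn" commits if the block succeeds and rolls back on an exception.
    with conn:
        conn.executemany(
            "UPDATE capacity_pm SET host_used = ? "
            "WHERE pool_id = ? AND templete_id = ?",
            updates,
        )


# Reduced stand-in for the capacity_pm table (assumption: three columns only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE capacity_pm ("
    "pool_id TEXT, templete_id TEXT, host_used INTEGER, "
    "UNIQUE (pool_id, templete_id))"
)
conn.executemany(
    "INSERT INTO capacity_pm VALUES (?, ?, ?)",
    [("p1", "t001", 10), ("p1", "t002", 20)],
)
conn.commit()

batch_update_host_used(conn, [(15, "p1", "t001"), (25, "p1", "t002")])
result = conn.execute(
    "SELECT templete_id, host_used FROM capacity_pm ORDER BY templete_id"
).fetchall()
```

    Either all of the updates become visible at the commit or none of them do, which is the main thing the transaction adds over issuing the UPDATE statements one by one.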

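    About VALUES(col_name)

    VALUES(col_name) returns the value that the given column would receive from the incoming row, that is, the value that would be inserted had no duplicate-key conflict occurred.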
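
    To make the "would be inserted" wording concrete, here is a small plain-Python model of the semantics. It is only an illustration and does not talk to MySQL: the dict `table`, the function `upsert` and the callbacks are all hypothetical names. The `values` argument of each callback plays the role of VALUES(col_name), while `existing` is the row already stored.

```python
def upsert(table, unique_cols, new_row, on_duplicate):
    """Insert new_row, or update the stored row that shares its unique key.

    table        -- dict mapping a unique-key tuple to a row dict
    unique_cols  -- names of the columns that form the unique index
    new_row      -- the row that would be inserted
    on_duplicate -- function(existing, values) returning the columns to overwrite
    """
    key = tuple(new_row[c] for c in unique_cols)
    if key in table:
        existing = table[key]
        existing.update(on_duplicate(existing, new_row))
        return "updated"
    table[key] = dict(new_row)
    return "inserted"


def take_new(existing, values):
    # Models: ON DUPLICATE KEY UPDATE host_used = VALUES(host_used)
    return {"host_used": values["host_used"]}


def accumulate(existing, values):
    # Models: ON DUPLICATE KEY UPDATE host_used = host_used + VALUES(host_used)
    return {"host_used": existing["host_used"] + values["host_used"]}


table = {}
key_cols = ("pool_id", "templete_id")

first = upsert(table, key_cols,
               {"pool_id": "p1", "templete_id": "t001", "host_used": 10}, take_new)
second = upsert(table, key_cols,
                {"pool_id": "p1", "templete_id": "t001", "host_used": 15}, take_new)
third = upsert(table, key_cols,
               {"pool_id": "p1", "templete_id": "t001", "host_used": 5}, accumulate)
```

    The first call inserts; the second and third hit the same unique key and update the stored row, first by overwriting host_used with the incoming value and then by adding the incoming value to it.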


    Table structure and initial data

    CREATE TABLE `capacity_pm` (
      `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
      `pool_id` char(36) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Resource pool ID',
      `cluster_lv1` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Cluster category',
      `cluster_lv2` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Cluster second-level category',
      `update_at` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Time of update or creation',
      `templete_id` varchar(255) CHARACTER SET utf8 NOT NULL COMMENT 'Template ID',
      `templete_name` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Template name',
      `templete_cpu_core` int(10) unsigned zerofill NOT NULL COMMENT 'Template CPU core count',
      `templete_mem_size` double NOT NULL COMMENT 'Template memory size',
      `templete_disk_size` double NOT NULL COMMENT 'Template disk size',
      `host_total` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Total number of hosts',
      `host_used` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Number of hosts allocated',
      `cpu_core_total` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Total CPU cores',
      `cpu_core_free` int(11) DEFAULT NULL,
      `cpu_core_used` int(11) DEFAULT NULL COMMENT 'CPU cores allocated',
      `cpu_core_util` double DEFAULT NULL COMMENT 'CPU core utilization ratio',
      `mem_total` double DEFAULT NULL COMMENT 'Total memory',
      `mem_free` double DEFAULT NULL,
      `mem_used` double DEFAULT NULL,
      `mem_util` double DEFAULT NULL COMMENT 'Memory utilization ratio',
      `disk_total` double DEFAULT NULL,
      `disk_free` double DEFAULT NULL,
      `disk_used` double DEFAULT NULL,
      `disk_util` double DEFAULT NULL COMMENT 'Disk utilization ratio',
      PRIMARY KEY (`id`),
      UNIQUE KEY `idx_templete_all` (`pool_id`,`templete_id`) USING BTREE COMMENT 'Full index on the template ID'
    ) ENGINE=InnoDB AUTO_INCREMENT=70 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
    
    
    INSERT INTO `capacity_pm` VALUES ('1', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't001', '数据库服务器', '0000000000', '0', '0', '00000000100', '00000000010', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('2', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't002', '性能性服务器', '0000000000', '0', '0', '00000000200', '00000000020', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('3', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't003', '计算型服务器', '0000000000', '0', '0', '00000000300', '00000000030', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('4', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't004', '存储型服务器', '0000000000', '0', '0', '00000000400', '00000000040', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('5', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't005', '网络型服务器', '0000000000', '0', '0', '00000000500', '00000000050', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('6', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't001', '数据库服务器', '0000000000', '0', '0', '00000001000', '00000000100', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('7', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't002', '性能性服务器', '0000000000', '0', '0', '00000002000', '00000000200', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('8', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't003', '计算型服务器', '0000000000', '0', '0', '00000003000', '00000000300', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('9', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't004', '存储型服务器', '0000000000', '0', '0', '00000004000', '00000000400', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('10', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't005', '网络型服务器', '0000000000', '0', '0', '00000005000', '00000000500', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('12', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:00', 't006', '自定义服务器', '0000000000', '0', '0', '00000006000', '00000000600', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('13', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:36', 't007', 'xxx服务器', '0000000000', '0', '0', '00000007000', '00000000700', null, null, null, null, null, null, null, null, null, null, null, null);
    INSERT INTO `capacity_pm` VALUES ('14', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:36', 't00x', '服务器xxx', '0000000000', '0', '0', '00000008000', '00000000800', null, null, null, null, null, null, null, null, null, null, null, null);
    
    

    A query over part of the data set (these are the columns the test focuses on):

    mysql> SELECT pool_id, templete_id, host_total, host_used from capacity_pm ;
    +----------------------------------+-------------+------------+-----------+
    | pool_id                          | templete_id | host_total | host_used |
    +----------------------------------+-------------+------------+-----------+
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t001        |        100 |        10 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t002        |        200 |        20 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t003        |        300 |        30 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t004        |        400 |        40 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t005        |        500 |        50 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t001        |       1000 |       100 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t002        |       2000 |       200 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t003        |       3000 |       300 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t004        |       4000 |       400 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t005        |       5000 |       500 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t006        |       6000 |       600 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t007        |       7000 |       700 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t00x        |       8000 |       800 |
    +----------------------------------+-------------+------------+-----------+
    

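    Test: inserting new data

    • Auto-increment primary key: the primary key of the table above is AUTO_INCREMENT and the INSERT below does not supply id, so the primary key never collides and cannot trigger the update.
    • Unique index: UNIQUE KEY idx_templete_all (pool_id, templete_id)

    The rows to be inserted duplicate existing rows on the unique index, so MySQL performs an UPDATE rather than an INSERT.

    The insert statement (every host_used value is changed):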

    INSERT INTO `capacity_pm`(pool_id, templete_id, host_total, host_used) values
    ('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't001','100', '15'),
    ('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't002','200', '25'),
    ('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't003','300', '35'),
    ('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't004','400', '45'),
    ('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't005','500', '55') 
    ON DUPLICATE KEY UPDATE host_total=VALUES(host_total), host_used=VALUES(host_used);
    

    ON DUPLICATE KEY UPDATE host_total=VALUES(host_total), host_used=VALUES(host_used)

    • host_total=VALUES(host_total): VALUES(col_name) is the value from the row being inserted.
    • host_used=VALUES(host_used): to update several columns, separate the assignments with ",".
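
    When the rows come from application code, the statement is usually assembled rather than typed by hand. The following is a minimal sketch of such a builder. The function name build_batch_upsert is hypothetical, and the %s placeholders assume a MySQL driver that uses the "format" parameter style; adjust them if your driver differs.

```python
def build_batch_upsert(table, columns, update_columns, row_count):
    """Return a multi-row INSERT ... ON DUPLICATE KEY UPDATE statement.

    Identifiers are interpolated directly, so they must come from trusted
    code, never from user input. Row values stay as %s placeholders.
    """
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    unknown = [c for c in update_columns if c not in columns]
    if unknown:
        raise ValueError(f"columns not in the insert list: {unknown}")

    column_list = ", ".join(f"`{c}`" for c in columns)
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    all_rows = ", ".join([one_row] * row_count)
    # One "col = VALUES(col)" assignment per column, separated by commas.
    assignments = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in update_columns)
    return (
        f"INSERT INTO `{table}` ({column_list}) VALUES {all_rows} "
        f"ON DUPLICATE KEY UPDATE {assignments}"
    )


rows = [
    ("7b8f0f5e2fbb4d9aa2d5fd55466d638f", "t001", 100, 15),
    ("7b8f0f5e2fbb4d9aa2d5fd55466d638f", "t002", 200, 25),
]
sql = build_batch_upsert(
    "capacity_pm",
    ["pool_id", "templete_id", "host_total", "host_used"],
    ["host_total", "host_used"],
    len(rows),
)
# Flatten the rows into one parameter list, in the same order as the placeholders.
params = [value for row in rows for value in row]
```

    The pair (sql, params) would then be passed to the driver's execute call. As far as I know, MySQL 8.0.20 and later deprecate VALUES() in this position in favor of a row alias, so on newer servers the assignment part may need to be adapted.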

    Result: host_used has changed in all five matched rows.

    mysql> SELECT pool_id, templete_id, host_total, host_used from capacity_pm ;
    +----------------------------------+-------------+------------+-----------+
    | pool_id                          | templete_id | host_total | host_used |
    +----------------------------------+-------------+------------+-----------+
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t001        |        100 |        15 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t002        |        200 |        25 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t003        |        300 |        35 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t004        |        400 |        45 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t005        |        500 |        55 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t001        |       1000 |       100 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t002        |       2000 |       200 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t003        |       3000 |       300 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t004        |       4000 |       400 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t005        |       5000 |       500 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t006        |       6000 |       600 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t007        |       7000 |       700 |
    | 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t00x        |       8000 |       800 |
    +----------------------------------+-------------+------------+-----------+
    13 rows in set
    

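    Other notes

    Unique index: every column of the unique index must be non-NULL. A UNIQUE index allows multiple NULL values, so a row with NULL in any indexed column is never seen as a duplicate; it is inserted as a new row and the update never happens.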
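
    The NULL behavior is easy to reproduce. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs without a MySQL server; SQLite, like MySQL, treats NULL values in a UNIQUE index as distinct from one another. The reduced table definition is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE capacity_pm ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "pool_id TEXT, "
    "templete_id TEXT NOT NULL, "
    "host_used INTEGER, "
    "UNIQUE (pool_id, templete_id))"
)

insert = "INSERT INTO capacity_pm (pool_id, templete_id, host_used) VALUES (?, ?, ?)"

# Same templete_id, pool_id is NULL both times: no uniqueness conflict,
# so both rows are stored and nothing would be "updated".
conn.execute(insert, (None, "t001", 10))
conn.execute(insert, (None, "t001", 15))
null_rows = conn.execute(
    "SELECT COUNT(*) FROM capacity_pm WHERE pool_id IS NULL"
).fetchone()[0]

# Same experiment with a non-NULL pool_id: the second insert collides.
conn.execute(insert, ("p1", "t001", 10))
try:
    conn.execute(insert, ("p1", "t001", 15))
    duplicate_detected = False
except sqlite3.IntegrityError:
    duplicate_detected = True
```

    Only the second experiment produces the key collision that ON DUPLICATE KEY UPDATE relies on, which is why pool_id should not be left NULL for rows that are meant to be updated this way.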


    Performance comparison

    Batch-updating 50,000 rows

    INSERT ... ON DUPLICATE KEY UPDATE Syntax

    On my own machine this took about 30 s.

    Batch update in a transaction

    START TRANSACTION;
    UPDATE ...
    UPDATE ...
    UPDATE ...
    UPDATE ...
    COMMIT;
    

    Result: this took a very long time; I do not know the specific cause.

  • Original article: https://www.cnblogs.com/ssslinppp/p/8805471.html