  • MySQL in Practice: Changing the Character Set of a Database That Already Contains Data

    Original link: https://www.modb.pro/db/22722?cyn

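    We already know that the character set of a database or table can be changed after it is created. Changing it, however, does not re-encode the data that is already stored. So how should the character set be adjusted for a database that already holds data? You can convert the data in place with a command, or, as was common in earlier days, export the data, adjust the character set, and import the data again.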

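    Suppose a business was started in an English-speaking country, and its development team created a database with the latin1 character set to support it.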

    root@database-one 13:25:  [(none)]> create database DiscountStore default charset latin1;
    Query OK, 1 row affected (0.01 sec)
    
    root@database-one 13:33:  [(none)]> use discountstore;
    Database changed
    root@database-one 13:33:  [discountstore]> create table orders (no int,Buyer varchar(30),Seller varchar(30),InstallationDate datetime) engine=innodb charset latin1;
    Query OK, 0 rows affected (0.02 sec)
    
    root@database-one 13:41:  [discountstore]> insert into orders values(666,'Steve','Tom',now()+2);
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 13:41:  [discountstore]> insert into orders values(777,'Jeff','Bill',now()+3);
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 13:42:  [discountstore]> select * from orders;
    +------+-------+--------+---------------------+
    | no   | Buyer | Seller | InstallationDate    |
    +------+-------+--------+---------------------+
    |  666 | Steve | Tom    | 2020-03-18 13:41:26 |
    |  777 | Jeff  | Bill   | 2020-03-18 13:42:28 |
    +------+-------+--------+---------------------+
    2 rows in set (0.00 sec)

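    The business is doing well and is now expanding to China, so the system must support Chinese, and further expansion to other countries has to be considered too. The development team therefore chooses utf8 as the new character set for the database. Because utf8 can represent every character in latin1, it is enough to change the default character set of the database and the table and then convert the character set of the columns; no other processing is needed. (In MySQL, utf8 is an alias for utf8mb3, which stores at most three bytes per character; characters outside the Basic Multilingual Plane, such as emoji, need utf8mb4.)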

    root@database-one 14:15:  [discountstore]> alter database discountstore character set utf8;
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 14:16:  [discountstore]> show create database discountstore \G
    *************************** 1. row ***************************
           Database: discountstore
    Create Database: CREATE DATABASE `discountstore` /*!40100 DEFAULT CHARACTER SET utf8 */
    1 row in set (0.00 sec)
    
    root@database-one 14:16:  [discountstore]> alter table orders character set utf8;
    Query OK, 0 rows affected (0.00 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    root@database-one 14:20:  [discountstore]> show create table orders \G
    *************************** 1. row ***************************
           Table: orders
    Create Table: CREATE TABLE `orders` (
      `no` int(11) DEFAULT NULL,
      `Buyer` varchar(30) CHARACTER SET latin1 DEFAULT NULL,
      `Seller` varchar(30) CHARACTER SET latin1 DEFAULT NULL,
      `InstallationDate` datetime DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    1 row in set (0.00 sec)

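    As you can see, the default character set of the database and the table has changed, but the columns are still latin1, so convert them as well.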

    root@database-one 14:25:  [discountstore]> alter table orders convert to character set utf8;
    Query OK, 2 rows affected (0.04 sec)
    Records: 2  Duplicates: 0  Warnings: 0
    
    root@database-one 14:28:  [discountstore]> show create table orders \G
    *************************** 1. row ***************************
           Table: orders
    Create Table: CREATE TABLE `orders` (
      `no` int(11) DEFAULT NULL,
      `Buyer` varchar(30) DEFAULT NULL,
      `Seller` varchar(30) DEFAULT NULL,
      `InstallationDate` datetime DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    1 row in set (0.00 sec)
    
    root@database-one 14:32:  [discountstore]> select * from orders;
    +------+-------+--------+---------------------+
    | no   | Buyer | Seller | InstallationDate    |
    +------+-------+--------+---------------------+
    |  666 | Steve | Tom    | 2020-03-18 13:41:26 |
    |  777 | Jeff  | Bill   | 2020-03-18 13:42:28 |
    +------+-------+--------+---------------------+
    2 rows in set (0.00 sec)
    
    root@database-one 14:32:  [discountstore]> insert into orders values(888,'肖杰','郭伟',now()+4);
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 14:32:  [discountstore]> select * from orders;
    +------+--------+--------+---------------------+
    | no   | Buyer  | Seller | InstallationDate    |
    +------+--------+--------+---------------------+
    |  666 | Steve  | Tom    | 2020-03-18 13:41:26 |
    |  777 | Jeff   | Bill   | 2020-03-18 13:42:28 |
    |  888 | 肖杰   | 郭伟   | 2020-03-18 14:32:40 |
    +------+--------+--------+---------------------+
    3 rows in set (0.00 sec)

    As you can see, after the conversion the original data is intact and new Chinese data can be stored.
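
Why this direction is safe can be illustrated outside MySQL. The sketch below is in Python rather than SQL and uses Python's `cp1252` codec as a stand-in for MySQL's latin1 (which is based on Windows-1252); that substitution and the helper name `transcode` are assumptions made for illustration. It decodes bytes stored under one character set and re-encodes them under another, which is roughly what happens to each column value during `convert to character set`.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Decode bytes stored under `source` and re-encode them under `target`."""
    return raw.decode(source).encode(target)


# ASCII-only values such as 'Steve' have identical bytes in both character sets.
steve_latin1 = "Steve".encode("cp1252")
steve_utf8 = transcode(steve_latin1, "cp1252", "utf-8")

# A non-ASCII latin1 character takes one byte in latin1 and two bytes in utf8,
# but it is still representable, so nothing is lost.
cafe_latin1 = "café".encode("cp1252")
cafe_utf8 = transcode(cafe_latin1, "cp1252", "utf-8")

print(steve_latin1 == steve_utf8)         # prints True
print(len(cafe_latin1), len(cafe_utf8))   # prints 4 5
```

Every value that could be stored in the source character set has a valid encoding in the target, so the conversion cannot hit an unrepresentable character.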

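    In the example above the character set was converted from latin1 to utf8. The target character set is a superset of the source and fully compatible with it, which is why the conversion command works directly. In the opposite case, where the business shrinks and the database has to drop back from utf8 to latin1, the data must be cleaned first, and only then can the character set be changed and converted; otherwise the existing data causes the conversion to fail.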

    root@database-one 14:43:  [discountstore]> alter database discountstore character set latin1;
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 14:43:  [discountstore]> show create database discountstore \G
    *************************** 1. row ***************************
           Database: discountstore
    Create Database: CREATE DATABASE `discountstore` /*!40100 DEFAULT CHARACTER SET latin1 */
    1 row in set (0.00 sec)
    
    root@database-one 14:44:  [discountstore]> alter table orders character set latin1;
    Query OK, 0 rows affected (0.01 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    root@database-one 14:44:  [discountstore]> show create table orders \G
    *************************** 1. row ***************************
           Table: orders
    Create Table: CREATE TABLE `orders` (
      `no` int(11) DEFAULT NULL,
      `Buyer` varchar(30) CHARACTER SET utf8 DEFAULT NULL,
      `Seller` varchar(30) CHARACTER SET utf8 DEFAULT NULL,
      `InstallationDate` datetime DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)
    
    root@database-one 14:44:  [discountstore]> select * from orders;
    +------+--------+--------+---------------------+
    | no   | Buyer  | Seller | InstallationDate    |
    +------+--------+--------+---------------------+
    |  666 | Steve  | Tom    | 2020-03-18 13:41:26 |
    |  777 | Jeff   | Bill   | 2020-03-18 13:42:28 |
    |  888 | 肖杰   | 郭伟   | 2020-03-18 14:32:40 |
    +------+--------+--------+---------------------+
    3 rows in set (0.00 sec)
    
    root@database-one 14:46:  [discountstore]> alter table orders convert to character set latin1;
    ERROR 1366 (HY000): Incorrect string value: '\xE8\x82\x96\xE6\x9D\xB0' for column 'Buyer' at row 3
    root@database-one 14:46:  [discountstore]> delete from orders where no=888;
    Query OK, 1 row affected (0.00 sec)
    
    root@database-one 14:47:  [discountstore]> alter table orders convert to character set latin1;
    Query OK, 2 rows affected (0.24 sec)
    Records: 2  Duplicates: 0  Warnings: 0
    
    root@database-one 14:47:  [discountstore]> select * from orders;
    +------+-------+--------+---------------------+
    | no   | Buyer | Seller | InstallationDate    |
    +------+-------+--------+---------------------+
    |  666 | Steve | Tom    | 2020-03-18 13:41:26 |
    |  777 | Jeff  | Bill   | 2020-03-18 13:42:28 |
    +------+-------+--------+---------------------+
    2 rows in set (0.00 sec)
    
    root@database-one 14:47:  [discountstore]> show create table orders \G
    *************************** 1. row ***************************
           Table: orders
    Create Table: CREATE TABLE `orders` (
      `no` int(11) DEFAULT NULL,
      `Buyer` varchar(30) DEFAULT NULL,
      `Seller` varchar(30) DEFAULT NULL,
      `InstallationDate` datetime DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)
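
The cleaning step is easier if the offending rows are located before the conversion is attempted. A minimal sketch in Python, assuming the rows have already been fetched into memory as tuples and again using Python's `cp1252` codec as an approximation of MySQL's latin1; the function name `rows_blocking_conversion` and the in-memory `orders` list are hypothetical and not part of the session above.

```python
from typing import Iterable, List, Tuple

# (no, Buyer, Seller), mirroring the text columns of the orders table
Row = Tuple[int, str, str]


def rows_blocking_conversion(rows: Iterable[Row], target: str = "cp1252") -> List[int]:
    """Return the `no` of every row holding a value the target charset cannot encode."""
    blocked = []
    for no, buyer, seller in rows:
        try:
            buyer.encode(target)
            seller.encode(target)
        except UnicodeEncodeError:
            blocked.append(no)
    return blocked


orders = [
    (666, "Steve", "Tom"),
    (777, "Jeff", "Bill"),
    (888, "肖杰", "郭伟"),
]

print(rows_blocking_conversion(orders))  # prints [888]
```

Row 888 is the one reported by ERROR 1366: the bytes in that message are the utf8 encoding of its Buyer value, which has no latin1 equivalent.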

    Let us also test the method of exporting the data, adjusting the character set, and importing the data again.

    (See https://www.modb.pro/db/22722?cyn for that part.)
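
The linked walkthrough is not reproduced here. As a rough illustration of the "adjust the character set" step only, the Python sketch below rewrites the character set declarations in exported table definitions. It is not the original author's procedure: the helper name `retarget_schema` is hypothetical, and the sketch assumes a schema-only export (no row data) and does not touch `COLLATE` clauses.

```python
import re


def retarget_schema(ddl: str, old: str = "latin1", new: str = "utf8") -> str:
    """Rewrite `CHARSET=old` and `CHARACTER SET old` in exported DDL text to `new`."""
    pattern = re.compile(
        r"\b(CHARSET\s*=\s*|CHARACTER\s+SET\s+)" + re.escape(old) + r"\b",
        flags=re.IGNORECASE,
    )
    # Keep the matched keyword as written and swap only the charset name.
    return pattern.sub(lambda m: m.group(1) + new, ddl)


exported = (
    "CREATE TABLE `orders` (\n"
    "  `no` int(11) DEFAULT NULL,\n"
    "  `Buyer` varchar(30) CHARACTER SET latin1 DEFAULT NULL\n"
    ") ENGINE=InnoDB DEFAULT CHARSET=latin1"
)

print(retarget_schema(exported))
```

Run this only against the schema file; applying it to a file that also contains row data could alter stored strings that happen to contain the same text.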

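    Important:
    When choosing the target character set, it should preferably be a superset of the source character set. Otherwise the conversion fails, as shown above, or characters that the target character set does not support end up garbled.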

  • Source: https://www.cnblogs.com/hzcya1995/p/13311774.html