zoukankan      html  css  js  c++  java
  • MYSQL中删除重复记录的方法

    在实际应用中,很可能会碰到一些需要删除某些字段的重复记录,我现在把我能想到的写下来,望高手们补充。

    1、
    具体实现如下:

    Table         Create Table                                           
    ------------  --------------------------------------------------------
    users_groups  CREATE TABLE `users_groups` (                          
                    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,       
                    `uid` int(11) NOT NULL,                              
                    `gid` int(11) NOT NULL,                              
                    PRIMARY KEY (`id`)                                   
                  ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8 

    users_groups.txt内容:
    1,11,502
    2,107,502
    3,100,503
    4,110,501
    5,112,501
    6,104,502
    7,100,502
    8,100,501
    9,102,501
    10,104,502
    11,100,502
    12,100,501
    13,102,501
    14,110,501

    mysql> load data infile 'c://users_groups.txt' into table users_groups fields
    rminated by ',' lines terminated by '/n';
    Query OK, 14 rows affected (0.05 sec)
    Records: 14  Deleted: 0  Skipped: 0  Warnings: 0

    mysql> select * from users_groups;

    query result(14 records)

    id uid gid
    1 11 502
    2 107 502
    3 100 503
    4 110 501
    5 112 501
    6 104 502
    7 100 502
    8 100 501
    9 102 501
    10 104 502
    11 100 502
    12 100 501
    13 102 501
    14 110 501
    14 rows in set (0.00 sec)

    mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) > 1 union all
    select * from users_groups group by uid having count(1) = 1;
    Query OK, 7 rows affected (0.11 sec)
    Records: 7  Duplicates: 0  Warnings: 0

    mysql> truncate table users_groups;
    Query OK, 14 rows affected (0.03 sec)

    mysql> insert into users_groups select * from tmp_wrap;
    Query OK, 7 rows affected (0.03 sec)
    Records: 7  Duplicates: 0  Warnings: 0

    mysql> select * from users_groups;

    query result(7 records)

    id uid gid
    1 11 502
    2 107 502
    3 100 503
    4 110 501
    5 112 501
    6 104 502
    9 102 501


    mysql> drop table tmp_wrap;
    Query OK, 0 rows affected (0.05 sec)

    2、还有一个很精简的办法。

    查找重复的,并且除掉最小的那个。

    delete users_groups as a from users_groups as a,
    (
    select *,min(id) from users_groups group by uid having count(1) > 1
    ) as b
     where a.uid = b.uid and a.id > b.id;

    (7 row(s)affected)
    (0 ms taken)

    query result(7 records)

    id uid gid
    1 11 502
    2 107 502
    3 100 503
    4 110 501
    5 112 501
    6 104 502
    9 102 501

    3、现在来看一下这两个办法的效率。

    运行一下以下SQL 语句

    create index f_uid on users_groups(uid);
    explain select * from users_groups group by uid having count(1) > 1 union all
    select * from users_groups group by uid having count(1) = 1;

    explain select * from  users_groups as a,
    (
    select *,min(id) from users_groups group by uid having count(1) > 1
    ) as b
     where a.uid = b.uid and a.id > b.id;

    query result(3 records)

    id select_type table type possible_keys key key_len ref rows Extra
    1 PRIMARY users_groups index (NULL) f_uid 4 (NULL) 14  
    2 UNION users_groups index (NULL) f_uid 4 (NULL) 14  
    (NULL) UNION RESULT <union1,2> ALL (NULL) (NULL) (NULL) (NULL) (NULL)  

    query result(3 records)

    id select_type table type possible_keys key key_len ref rows Extra
    1 PRIMARY <derived2> ALL (NULL) (NULL) (NULL) (NULL) 4  
    1 PRIMARY a ref PRIMARY,f_uid f_uid 4 b.uid 1 Using where
    2 DERIVED users_groups index (NULL) f_uid 4 (NULL) 14  

    很明显的第二个比第一个扫描的函数要少。

     
  • 相关阅读:
    Assembly Manifest 通俗简易手册
    CruiseControl服务器安装配置
    关于URL编码
    从A到Z来说说Web开发
    通过注册表查看 .NET Framework的版本信息
    云数据存在哪里?
    C#中你可能不知道的8件事(zz)
    用PBKDF2 或BCrypt 来存储密码
    C++编译器什么时候为我们自动生成拷贝构造函数?
    C#中你可能不知道的8件事(zz)
  • 原文地址:https://www.cnblogs.com/secbook/p/2655305.html
Copyright © 2011-2022 走看看