zoukankan      html  css  js  c++  java
  • MYSQL数据去重与外表填充

    经常要对数据库中的数据进行去重,有时还需要使用外部表填冲数据,本文档记录数据去重与外表填充数据。
    date:2016/8/17
    author:wangxl

    1 需求

    对user_info1表去重,并添加age项。

    2 表数据

    user_info1:
    +----+----------+------+------+
    | id | name     | sex  | age  |
    +----+----------+------+------+
    |  1 | xiaolong | 1    | NULL |
    |  2 | xiaoyun  | 1    | NULL |
    |  3 | xiaoqin  | 2    | NULL |
    |  4 | xiaolong | 1    | NULL |
    |  5 | xiaodong | 1    | NULL |
    |  6 | xiaokai  | 1    | NULL |
    |  7 | xiaohong | 2    | NULL |
    |  8 | xiaolong | 1    | NULL |
    |  9 | xiaohong | 2    | NULL |
    | 10 | xiaofen  | 2    | NULL |
    +----+----------+------+------+
    
    user_info2:
    +----------+------+
    | name     | age  |
    +----------+------+
    | xiaolong |   26 |
    | xiaoyun  |   28 |
    | xiaoqin  |   27 |
    | xiaodong |   27 |
    | xiaokai  |   27 |
    | xiaohong |   24 |
    | xiaofen  |   22 |
    +----------+------+
    

    3 实战

    3.1 去重

    (1) 找出有重复字段
    	select * from user_info1 where name in (select name from user_info1 group by name having count(name) > 1);
    (2) 找出要删除的记录,重复记录是根据单个字段(name)来判断,只留有id最小的记录
    	select * from user_info1 where name in (select name from user_info1 group by name having count(name) > 1) and id not in (select min(id) from user_info1 group by name having count(name) > 1);
    (3) 删除表中多余的重复记录
    	delete from user_info1 where name in (select name from user_info1 group by name having count(name) > 1) and id not in (select min(id) from user_info1 group by name having count(name) > 1);
    
    	报错:ERROR 1093 (HY000): You can't specify target table 'user_info1' for update in FROM clause
    

    更换思路:找出每组中非最小id并删除,如下:

    (4) 找出每组最小id
    	select min(id) from user_info1 group by name
    (5) 找出每组非最小id
    	select * from user_info1 where id not in (select min(id) from user_info1 group by name);
    (6) 删除每组中非最小id所在行
    	delete from user_info1 where id not in (select id from select min(id) from user_info1 group by name);
    	ERROR 1093 (HY000): You can't specify target table 'user_info1' for update in FROM clause
    	更正:
    	delete from user_info1 where id not in (select minid from (select min(id) as minid from user_info1 group by name) a);、
    
    结果展示:
    +----+----------+------+------+
    | id | name     | sex  | age  |
    +----+----------+------+------+
    | 1  | xiaolong | 1    | NULL |
    | 2  | xiaoyun  | 1    | NULL |
    | 3  | xiaoqin  | 2    | NULL |
    | 5  | xiaodong | 1    | NULL |
    | 6  | xiaokai  | 1    | NULL |
    | 7  | xiaohong | 2    | NULL |
    | 10 | xiaofen  | 2    | NULL |
    +----+----------+------+------+
    

    对于没有primary key的话,怎么去重呢?

    (7) 创建表test
    (8) insert into test select distinct(name),sex,age from user_info1 group by name; 
    
    暂时没想出一句话解决方案.
    

    3.2 外表插入

    update user_info1 t set age=(select age from user_info2 where name=t.name);
    结果如下:
    +----+----------+------+------+
    | id | name     | sex  | age  |
    +----+----------+------+------+
    | 1  | xiaolong | 1    |   26 |
    | 2  | xiaoyun  | 1    |   28 |
    | 3  | xiaoqin  | 2    |   27 |
    | 5  | xiaodong | 1    |   27 |
    | 6  | xiaokai  | 1    |   27 |
    | 7  | xiaohong | 2    |   24 |
    | 10 | xiaofen  | 2    |   22 |
    +----+----------+------+------+
  • 相关阅读:
    vue+sass 下sass不能运行问题
    Servlet3.0的新特性
    Servlet监听器
    Servlet过滤器
    会话跟踪
    Servlet的应用
    Servlet概述
    《南怀瑾讲述99个人生道理》——刘清海编著
    Web应用程序简介
    JDBC学习总结(五)
  • 原文地址:https://www.cnblogs.com/riordon/p/5782286.html
Copyright © 2011-2022 走看看