[原创]MYSQL中删除重复记录的方法-yueliangdao0608-ChinaUnix博客

四爷yueliangdao0608.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

yueliangdao0608

博客访问： 4205409
博文数量： 240
博客积分： 11504
博客等级：上将
技术积分： 4277
用户组：普通用户
注册时间： 2006-12-28 14:24

文章分类

全部博文（240）

redis（1）
MySQL5.7（1）
python（2）
TokuDB（0）
PostgreSQL（14）
ORACLE（2）

与其他数据库的对（0）

架构相关（0）

日常操作（0）
Infobright（4）
其他数据库（17）
English（4）
WEB开发（31）
其他语言（1）
MySQL（118）

MySQL 5.6（3）

Document transla（0）

数组专题（0）

常用工具（10）

安全性（3）

SQL语句与特殊技（54）

性能优化（7）

高可用性（7）

集群（16）

同步（5）

备份与恢复（1）

安装与配置（7）
Windows（0）
Linux（5）
生活随笔（0）
C/C++（1）
未分配的博文（39）

相关博文

[原创]MYSQL中删除重复记录的方法

分类： Mysql/postgreSQL

2007-09-06 10:24:15

在实际应用中，很可能会碰到一些需要删除某些字段的重复记录，我现在把我能想到的写下来，望高手们补充。

1、
具体实现如下：
Table         Create Table
------------ --------------------------------------------------------
users_groups CREATE TABLE `users_groups` (
                `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
                `uid` int(11) NOT NULL,
                `gid` int(11) NOT NULL,
                PRIMARY KEY (`id`)
              ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8

users_groups.txt内容：
1,11,502
2,107,502
3,100,503
4,110,501
5,112,501
6,104,502
7,100,502
8,100,501
9,102,501
10,104,502
11,100,502
12,100,501
13,102,501
14,110,501
mysql> load data infile 'c:\\users_groups.txt' into table users_groups fields
terminated by ',' lines terminated by '\n';
Query OK, 14 rows affected (0.05 sec)
Records: 14 Deleted: 0 Skipped: 0 Warnings: 0

mysql> select * from users_groups;

query result(14 records)

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
7	100	502
8	100	501
9	102	501
10	104	502
11	100	502
12	100	501
13	102	501
14	110	501

14 rows in set (0.00 sec)
根据一位兄弟的建议修改。
mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) >= 1;
Query OK, 7 rows affected (0.11 sec)
Records: 7 Duplicates: 0 Warnings: 0

mysql> truncate table users_groups;
Query OK, 14 rows affected (0.03 sec)

mysql> insert into users_groups select * from tmp_wrap;
Query OK, 7 rows affected (0.03 sec)
Records: 7 Duplicates: 0 Warnings: 0

mysql> select * from users_groups;

query result(7 records)

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
9	102	501

mysql> drop table tmp_wrap;
Query OK, 0 rows affected (0.05 sec)

2、还有一个很精简的办法。

查找重复的，并且除掉最小的那个。

delete users_groups as a from users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;

(7 row(s)affected)
(0 ms taken)

query result(7 records)

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
9	102	501

3、现在来看一下这两个办法的效率。

运行一下以下SQL 语句

create index f_uid on users_groups(uid);
explain select * from users_groups group by uid having count(1) > 1 union all
select * from users_groups group by uid having count(1) = 1;

explain select * from users_groups as a,
(
select *,min(id) from users_groups group by uid having count(1) > 1
) as b
where a.uid = b.uid and a.id > b.id;

query result(3 records)

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	users_groups	index	(NULL)	f_uid	4	(NULL)	14
2	UNION	users_groups	index	(NULL)	f_uid	4	(NULL)	14
(NULL)	UNION RESULT		ALL	(NULL)	(NULL)	(NULL)	(NULL)	(NULL)

query result(3 records)

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY		ALL	(NULL)	(NULL)	(NULL)	(NULL)	4
1	PRIMARY	a	ref	PRIMARY,f_uid	f_uid	4	b.uid	1	Using where
2	DERIVED	users_groups	index	(NULL)	f_uid	4	(NULL)	14

很明显的第二个比第一个扫描的函数要少。

阅读(3626) | 评论(2) | 转发(0) |

上一篇：[原创]MYSQL ROOT权限丢失的解决方法

下一篇：[原创]MYSQL 不允许在子查询的同时删除原表数据的解决方法

给主人留下些什么吧！~~

chinaunix网友2008-09-13 13:55:38

新建一个临时表 create table tmp as select * from youtable group by name 删除原来的表 drop table youtable 重命名表 alter table tmp rename youtable

回复 | 举报

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
7	100	502
8	100	501
9	102	501
10	104	502
11	100	502
12	100	501
13	102	501
14	110	501

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
7	100	502
8	100	501
9	102	501
10	104	502
11	100	502
12	100	501
13	102	501
14	110	501

id	uid	gid
1	11	502
2	107	502
3	100	503
4	110	501
5	112	501
6	104	502
7	100	502
8	100	501
9	102	501
10	104	502
11	100	502
12	100	501
13	102	501
14	110	501