Removing Duplicate Lines - Mozer - ChinaUnix Blog


Removing Duplicate Lines

Category: LINUX

2008-07-23 14:21:44

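I need to run statistics over all kinds of data files: check whether certain records appear in large log files or in JCL, count how many times a given record occurs, connect to the shell, reduce data pulled from the database to raw text, drive the database, call sqlplus from the shell to execute SQL, connect to Oracle with Perl DBI, create directories automatically, remove duplicate lines, sort, and so on. It all ends up as a jumble of awk, shell, sed, grep, and Perl.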

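I found that Perl alone can handle essentially all of the above, as long as you do not mind somewhat tedious code. Perl is really quite fun, especially the relatively complex data structures you can build in it, and its OO features.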

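For the statistics I needed to delete duplicate records from a file. A quick look at the answers online shows that they mostly come down to sorting first and then using uniq or awk: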

awk 'NR == 1 || $0 != line { print } { line = $0 }' file
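The "sort, then uniq or awk" approaches mentioned above can be sketched roughly as follows. This is only an illustration: the sample data and the function names (sample, dedup_sort_uniq, dedup_sort_u, dedup_keep_order) are made up for this example and do not come from the original post. The third variant, awk '!seen[$0]++', is a different, commonly used idiom that keeps the first occurrence of every line without sorting, at the cost of holding every distinct line in memory.

```shell
#!/bin/sh
# Fix the collation order so that sort behaves the same everywhere.
LC_ALL=C
export LC_ALL

# Invented sample data, one record per line.
sample() {
    printf '%s\n' b a b c a c
}

# 1. Sort so that equal lines become adjacent, then let uniq drop the repeats.
dedup_sort_uniq() {
    sort | uniq
}

# 2. The same result in a single step.
dedup_sort_u() {
    sort -u
}

# 3. No sorting: print a line only the first time it is seen,
#    which preserves the original order of the input.
dedup_keep_order() {
    awk '!seen[$0]++'
}

sample | dedup_sort_uniq     # prints a, b, c (one per line)
sample | dedup_sort_u        # prints a, b, c (one per line)
sample | dedup_keep_order    # prints b, a, c (one per line)
```

Which variant fits depends on whether the original record order matters and on how large the file is: sort can spill to disk for big inputs, while the awk array grows with the number of distinct lines.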

An expert wrote a version in sed, as follows:

sed -f rmdup.sed yourfile

Here is the rmdup.sed script:

#n rmdup.sed - ReMove DUPlicate consecutive lines

# read next line into pattern space (if not the last line)
$!N

# check if pattern space consists of two identical lines
s/^\(.*\)\n\1$/&/

# if yes, goto label RmLn, which will remove the first line in pattern space
t RmLn

# if not, print the first line (and remove it)
P

# garbage handling which simply deletes the first line in the pattern space
: RmLn
D

Use `sort' first: the script only removes consecutive duplicate lines, and there is no EFFICIENT way of sorting in sed/awk.
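To see why sorting first matters, here is a small sketch. It uses the widely circulated one-line form of the same N / P / D technique rather than the script file itself, on the assumption that the two behave the same; the function names rmdup and sample and the sample data are invented for this illustration.

```shell
#!/bin/sh
# Fix the collation order so that sort behaves the same everywhere.
LC_ALL=C
export LC_ALL

# One-line form of the N / P / D idea: join the next line to the current one,
# print the first line unless both are identical, then delete the first line
# and restart the cycle with what is left.
rmdup() {
    sed '$!N; /^\(.*\)\n\1$/!P; D'
}

# Invented sample data with both adjacent and non-adjacent duplicates.
sample() {
    printf '%s\n' b b a b a a
}

# Without sorting, only adjacent repeats collapse: prints b, a, b, a.
sample | rmdup

# After sorting, every repeat is adjacent: prints a, b.
sample | sort | rmdup
```

On unsorted input the result is the same as what uniq would give, which is why the advice to sort first applies to both tools.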
