Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1405829
  • 博文数量: 247
  • 博客积分: 10147
  • 博客等级: 上将
  • 技术积分: 2776
  • 用 户 组: 普通用户
  • 注册时间: 2008-01-24 15:18
文章分类

全部博文(247)

文章存档

2013年(11)

2012年(3)

2011年(20)

2010年(35)

2009年(91)

2008年(87)

我的朋友

分类: LINUX

2010-03-24 09:26:30

 
看了各个linux 论坛的帖子,感觉sed的介绍不少,但有点零乱,在这里整理一下,希望能对学习者有所帮助!
注:sed使用中除了多个命令时, -e选项一般可以省略!
基础:
正则表达式(Regular Expression)
在学习sed前,首先了解RE的基本知识,大体上最基本也需要知道下面这些,如果不了解正则表达式,那么您将很难进阶
引用:
- 錨點(anchor)
用以標識 RE 於句子中的位置所在. 常見有:
^: 表示句首. 如 ^abc 表示以 abc 開首的句子.
$: 表示句尾. 如 abc$ 表示以 abc 結尾的句子.
\<: 表示詞首. 如 \\>: 表示詞尾. 如 abc\> 表示以 abc 結尾的詞.

- 修飾字符(modifier)
獨立表示時本身不具意義, 專門用以修改前一個 char. set 的出現次數. 常見有:
*: 表示前一個 char. set 的出現次數為 0 或多次. 如 ab*c 表示 a 與 c 之間可有 0 或多個 b 存在.
?: 表示前一個 char. set 的出現次數為 0 或 1 次. 如 ab?c 表示 a 與 c 之間可有 0 或 1 個 b 存在.
+: 表示前一個 char. set 的出現次數為 1 或多次. 如 ab+c 表示 a 與 c 之間可有 1 或多個 b 存在.
{n}: 表示前一個 char. set 的出現次數必須為 n 次. 如 ab{3,}c 表示 a 與 c 之間必須有 3 個 b 存在.{n,}: 表示前一個 char. set 的出現次數至少為 n 次. 如 ab{3,}c 表示 a 與 c 之間至少有 3 個 b 存在.
{n,m}: 表示前一個 char. set 的出現次數為 n 到 m 次. 如 ab{3,5}c 表示 a 與 c 之間有 3 到 5 個 b 存在.
. : 匹配任意一个字符(1个)
.*:匹配任意多个字符(1或多个)

sed全功略

1.sed用法介绍
s e d是一个非交互性文本流编辑器。它编辑文件或标准输入导出的文本拷贝。
• 抽取域。
• 匹配正则表达式。
• 比较域。
• 增加、附加、替换。
• 基本的s e d命令和一行脚本。
[notice]无论命令是什么, s e d并不与初始化文件打交道,它操作的只是一个拷贝,然后所有的改动如果没有重定向到一个文件,将输出到屏幕。

基本格式:
代码:
sed [-n] [-e] 'command' file(s) sed [-n] -f scriptfile file(s)
(1)sed怎样读取数据

s e d从文件的一个文本行或从标准输入的几种格式中读取数据,将之拷贝到一个编辑缓冲区,然后读命令行或脚本的第一条命令,并使用这些命令查找模式或定位行号编辑它。重复此过程直到命令结束。
(2)调用sed

调用s e d有三种方式:
a.在命令行键入命令; sed [选项] s e d命令输入文件
b.将s e d命令插入脚本文件,然后调用s e d; sed [选项] -f sed脚本文件输入文件
c.将s e d命令插入脚本文件,并使s e d脚本可执行。 sed脚本文件 [选项] 输入文件

2.sed选项
s e d选项如下:
-n 不打印;s e d不写编辑行到标准输出,缺省为打印所有行(编辑和未编辑)。p命令可以用来打印编辑行。
-f 如果正在调用s e d脚本文件,使用此选项。此选项通知s e d一个脚本文件支持所有的s e d命令,例如:sed -f myscript.sed input_file,这里m y s c r i p t . s e d即为支持s e d命令的文件。
-c 下一命令是编辑命令。使用多项编辑时加入此选项。如果只用到一条s e d命令,此选项无用,但指定它也没有关系。
-i 编辑原文件(此选项慎用,如果使用则原文件就会被修改,无法恢复)。

3.保存sed输出
a.重定向(下面将sed命令的所有输出至文件 output-file 中)
代码:
zhyfly@zhyfly:~$ sed 'some-sed-commands' input-file>output-file
b.
w 写文本到一个文件
代码:
[address [,address]]w filename
代码:
zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '1,2w test.bak' test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ cat test.bak The honeysuckle band played all night long for only $90. It was an evening of splendid music and company.
另外,同样
r 从另一个文件中读文本
代码:
address r filename
4.sed基础用法
使用sed在文件中查询文本的方式:
sed浏览输入文件时,缺省从第一行开始,有两种方式定位文本:
a. 使用行号,可以是一个简单数字,或是一个行号范围。
b. 使用正则表达式。
例:
x #为一行号,如1
x,y #表示行号范围从x到y,如2,5表示从第2行到第5行
/pattern/ #查询包含模式的行。例如/disk/或/[a-z]/
/pattern/pattern/ #查询包含两个模式的行。例如/disk/disks/
/pattern/,x #在给定行号上查询包含模式的行。如/ribbon/,3
x,/pattern/ #通过行号和模式查询匹配行。3,/vdu/
x,y! #查询不包含指定行号x和y的行。1,2!

基本sed编辑命令:
sed编辑命令
p 打印匹配行
= 显示文件行号
a\ 在定位行号后附加新文本信息
i\ 在定位行号后插入新文本信息
d 删除定位行
c\ 用新文本替换定位文本
s 使用替换模式替换相应模式
r 从另一个文件中读文本
w 写文本到一个文件
q 第一个模式匹配完成后推出或立即推出
l 显示与八进制A S C I I代码等价的控制字符
{ } 在定位行执行的命令组
n 从另一个文件中读文本下一行,并附加在下一行
g 将模式2粘贴到/pattern n/
y 传送字符
n 延续到下一输入行;允许跨行的模式匹配语句

举例:
代码:
zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
代码:
zhyfly@zhyfly:~/bash$ sed -e '1p' test.txt The honeysuckle band played all night long for only $90. The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -n -e '1p' test.txt The honeysuckle band played all night long for only $90.
代码:
zhyfly@zhyfly:~/bash$ sed -n -e '2p' test.txt It was an evening of splendid music and company.
代码:
zhyfly@zhyfly:~/bash$ sed -n -e '2,3p' test.txt It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00.
代码:
zhyfly@zhyfly:~/bash$ sed -n -e '/company/p' test.txt It was an evening of splendid music and company.
代码:
zhyfly@zhyfly:~/bash$ sed -n -e '2,/23:00/p' test.txt It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00.
代码:
zhyfly@zhyfly:~/bash$ sed -n -e '2,3!p' test.txt The honeysuckle band played all night long for only $90. The local nurse Miss P.Neave was in attendance.
代码:
zhyfly@zhyfly:~/bash$ sed -e '=' test.txt 1 The honeysuckle band played all night long for only $90. 2 It was an evening of splendid music and company. 3 Too bad the disco floor fell through at 23:00. 4 The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -n -e '=' test.txt 1 2 3 4 zhyfly@zhyfly:~/bash$ sed -n -e '/music/p' test.txt It was an evening of splendid music and company. zhyfly@zhyfly:~/bash$ sed -n -e '/music/=' test.txt 2 zhyfly@zhyfly:~/bash$ sed -n -e '/music/p' -e '/music/=' test.txt It was an evening of splendid music and company. 2 zhyfly@zhyfly:~/bash$ sed -e '=;p' test.txt 1 The honeysuckle band played all night long for only $90. The honeysuckle band played all night long for only $90. 2 It was an evening of splendid music and company. It was an evening of splendid music and company. 3 Too bad the disco floor fell through at 23:00. Too bad the disco floor fell through at 23:00. 4 The local nurse Miss P.Neave was in attendance. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -n -e '=;p' test.txt 1 The honeysuckle band played all night long for only $90. 2 It was an evening of splendid music and company. 3 Too bad the disco floor fell through at 23:00. 4 The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -n -e '=' -e 'p' test.txt 1 The honeysuckle band played all night long for only $90. 2 It was an evening of splendid music and company. 3 Too bad the disco floor fell through at 23:00. 4 The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$
附加 [address]a\附加内容 #缺省放在每一行后面
代码:
zhyfly@zhyfly:~/bash$ sed -e 'a\this line will be added to the end of each line!oooooooooo' test.txt The honeysuckle band played all night long for only $90. this line will be added to the end of each line!oooooooooo It was an evening of splendid music and company. this line will be added to the end of each line!oooooooooo Too bad the disco floor fell through at 23:00. this line will be added to the end of each line!oooooooooo The local nurse Miss P.Neave was in attendance. this line will be added to the end of each line!oooooooooo zhyfly@zhyfly:~/bash$ sed -e '/music/a\this line will be added to the end of the matching line!oooooooooo' test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. this line will be added to the end of the matching line!oooooooooo Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
插入 [address]i\插入内容 #缺省放在每一行前面
代码:
zhyfly@zhyfly:~/bash$ sed -e 'i\this line will be inserted to the begin of each line!oooooooooo' test.txt this line will be inserted to the begin of each line!oooooooooo The honeysuckle band played all night long for only $90. this line will be inserted to the begin of each line!oooooooooo It was an evening of splendid music and company. this line will be inserted to the begin of each line!oooooooooo Too bad the disco floor fell through at 23:00. this line will be inserted to the begin of each line!oooooooooo The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '/music/i\this line will be inserted to the begin of the matching line!oooooooooo' test.txt The honeysuckle band played all night long for only $90. this line will be inserted to the begin of the matching line!oooooooooo It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
更改行 [address]c\更改后整行内容 #缺省修改每一行
代码:
zhyfly@zhyfly:~/bash$ sed -e 'c\this line will be modified to the each line!oooooooooo' test.txt this line will be modified to the each line!oooooooooo this line will be modified to the each line!oooooooooo this line will be modified to the each line!oooooooooo this line will be modified to the each line!oooooooooo zhyfly@zhyfly:~/bash$ sed -e '/music/c\this line will be modified to the matching line!oooooooooo' test.txt The honeysuckle band played all night long for only $90. this line will be modified to the matching line!oooooooooo Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
删除定位行
代码:
zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '/music/d' test.txt The honeysuckle band played all night long for only $90. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
5.sed高级用法---替换
替换 [code][address]s/old/new/g[code]

[address]s/pattern/replacement #the first occurrence on the address(缺省所有行)
[address]s/pattern/replacement/g #all occurrences on the address(缺省所有行)
代码:
zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e 's/h/ooooo/' test.txt Toooooe honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad toooooe disco floor fell through at 23:00. Toooooe local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e 's/h/ooooo/g' test.txt Toooooe ooooooneysuckle band played all nigooooot long for only $90. It was an evening of splendid music and company. Too bad toooooe disco floor fell tooooorougooooo at 23:00. Toooooe local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '1,2s/h/ooooo/' test.txt Toooooe honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '1,2s/h/ooooo/g' test.txt Toooooe ooooooneysuckle band played all nigooooot long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '/\$/s/h/ooooo/g' test.txt Toooooe ooooooneysuckle band played all nigooooot long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '/^It/,/23:00/s/l/ooooo/g' test.txt The honeysuckle band played all night long for only $90. It was an evening of spoooooendid music and company. Too bad the disco fooooooor feoooooooooo through at 23:00. The local nurse Miss P.Neave was in attendance.
使用替换修改字符串 &
代码:
zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e 's/$90/&230/g' test.txt The honeysuckle band played all night long for only $90230. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e 's/90/120,&/g' test.txt The honeysuckle band played all night long for only $120,90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance.
分隔符变换(避免产生歧义)
代码:
zhyfly@zhyfly:~/bash$ echo $PWD /home/zhyfly/bash zhyfly@zhyfly:~/bash$ echo $PWD|sed -e 's:/:@:g' @home@zhyfly@bash
规则表达式混乱
题目
zhyfly@zhyfly:~/bash$ cat test
This is what I meant.
想要得到的答案:
This is what I meant.
错误解法:
代码:
zhyfly@zhyfly:~/bash$ cat test This is what I meant. zhyfly@zhyfly:~/bash$ sed -e 's/<.*>//g' test This meant.
正确解法:
代码:
zhyfly@zhyfly:~/bash$ sed -e 's/<[^>]*>//g' test This is what I meant.
题目
要用sed把字符“\”转化成“'”该怎么写?
错误解法:
代码:
zhyfly@zhyfly:~/bash$ echo "hello\hh\haha" hello\hh\haha zhyfly@zhyfly:~/bash$ echo "hello\hh\haha"|sed -e 's/\\/'/g' > > bash: unexpected EOF while looking for matching `'' bash: syntax error: unexpected end of file zhyfly@zhyfly:~/bash$ echo "hello\hh\haha"|sed -e 's/\\/\'/g' > > bash: unexpected EOF while looking for matching `'' bash: syntax error: unexpected end of file
正确解法:
代码:
zhyfly@zhyfly:~/bash$ echo "hello\hh\haha"|sed -e 's/\\/'"'"'/g' hello'hh'haha zhyfly@zhyfly:~/bash$
你看出来了吗?

更多字符匹配:
'[a-x]*'
这将匹配零或多个全部为 'a'、'b'、'c'...'v'、'w'、'x' 的字符。另外,可以使用 '[:space:]' 字符类来匹配空格。
以下是可用字符类的相当完整的列表:
字符类 描述
代码:
[:alnum:] 字母数字 [a-z A-Z 0-9] [:alpha:] 字母 [a-z A-Z] [:blank:] 空格或制表键 [:cntrl:] 任何控制字符 [:digit:] 数字 [0-9] [:graph:] 任何可视字符(无空格) [:lower:] 小写 [a-z] [:print:] 非控制字符 [:punct:] 标点字符 [:space:] 空格 [:upper:] 大写 [A-Z] [:xdigit:] 十六进制数字 [0-9 a-f A-F]

关于域应用一个较复杂的例子讲解
题目:
[file1.txt]
1C2
1C3
1C31
1C32
1C4
2C3
2C4
1D1
1D10
1D12
1D2
1D3
1D31
1RC2
1RC20
1RC21
1RC3
1RC31
1WR1
1WR2
1WR20
1WR21
1WR23

排序后
[file2.txt]
1 C 2
1 C 3
2 C 3
1 C 4
2 C 4
1 C 31
1 C 32
1 D 1
1 D 2
1 D 3
1 D 10
1 D 12
1 D 31
1 RC 2
1 RC 3
1 RC 20
1 RC 21
1 RC 31
1 WR 1
1 WR 2
1 WR 20
1 WR 21
1 WR 23


规律:将每行分成三部分: “数字1” “字符串” “数字2”(即三个域)
第一、三字段按numberic顺序排序,中间部分按字母排序 ,优先级顺序2 3 1
解答思路:
首先需要将文件的每行分成三个域,这就利用到sed的分域功能,可以这样分(其中包含分域的格式):
代码:
/^\([0-9]*\)\([A-Z]*\)\([0-9]*\)/\1 \2 \3/
代码:
zhyfly@zhyfly:~/bash$ cat file1.txt 1C2 1C3 1C31 1C32 1C4 2C3 2C4 1D1 1D10 1D12 1D2 1D3 1D31 1RC2 1RC20 1RC21 1RC3 1RC31 1WR1 1WR2 1WR20 1WR21 1WR23
分域后
代码:
zhyfly@zhyfly:~/bash$ sed -e 's/^\([0-9]*\)\([A-Z]*\)\([0-9]*\)/\1 \2 \3 /g' file1.txt 1 C 2 1 C 3 1 C 31 1 C 32 1 C 4 2 C 3 2 C 4 1 D 1 1 D 10 1 D 12 1 D 2 1 D 3 1 D 31 1 RC 2 1 RC 20 1 RC 21 1 RC 3 1 RC 31 1 WR 1 1 WR 2 1 WR 20 1 WR 21 1 WR 23
排序后
代码:
zhyfly@zhyfly:~/bash$ sed -e 's/^\([0-9]*\)\([A-Z]*\)\([0-9]*\)/\1 \2 \3 /g' file1.txt|sort +1 -2 +2n +0 -1 1 C 2 1 C 3 2 C 3 1 C 4 2 C 4 1 C 31 1 C 32 1 D 1 1 D 2 1 D 3 1 D 10 1 D 12 1 D 31 1 RC 2 1 RC 3 1 RC 20 1 RC 21 1 RC 31 1 WR 1 1 WR 2 1 WR 20 1 WR 21 1 WR 23
下面的例子比较复杂:
题目
文件内容file.txt:
123456 345678 2005-05-06 123456
123456 234567 2003-5-6 234567
345555 987644 2003-4-23 543333
555555 999999 2004-11-5 999999

要将第四列数据变成正常的年月日,将2003-5-6 变成2003-05-0;
2003-4-23变成2003-04-23; 2004-11-5变成 2004-11-05

解答
首先将需要改变的部分分域
代码:
/-\([0-9]\)-/-0\1-/ #月 /-\([0-9]\) /-0\1 / #日
注意实现起来又有多种方法:
代码:
zhyfly@zhyfly:~/bash$ cat file.txt 123456 345678 2005-05-06 123456 123456 234567 2003-5-6 234567 345555 987644 2003-4-23 543333 555555 999999 2004-11-5 999999 zhyfly@zhyfly:~/bash$ sed -e 's/-\([0-9]\)-/-0\1-/g' -e 's/-\([0-9]\) /-0\1 /g' file.txt 123456 345678 2005-05-06 123456 123456 234567 2003-05-06 234567 345555 987644 2003-04-23 543333 555555 999999 2004-11-05 999999 zhyfly@zhyfly:~/bash$ sed -e 's/-\([0-9]\)-/-0\1-/g;s/-\([0-9]\) /-0\1 /g' file.txt 123456 345678 2005-05-06 123456 123456 234567 2003-05-06 234567 345555 987644 2003-04-23 543333 555555 999999 2004-11-05 999999 zhyfly@zhyfly:~/bash$ sed -e :a -e 's/-\([0-9]\)\([- ]\)/-0\1\2/;ta' file.txt 123456 345678 2005-05-06 123456 123456 234567 2003-05-06 234567 345555 987644 2003-04-23 543333 555555 999999 2004-11-05 999999
注:
:a - label
ta - goto to :a to rerun sed if the last sed is finished successfully.
简单说, 就是循环。

多个命令常用的方法
主要有
a.利用多个-e选项
b.利用 ;
c.使用脚本文件
上面几种方法,前面也有提到。
代码:
zhyfly@zhyfly:~/bash$ sed -e '2,3s/a/ooooo/g' -e '2,3s/d/ddddd/g' test.txt The honeysuckle band played all night long for only $90. It wooooos ooooon evening of splendddddiddddd music ooooonddddd compooooony. Too boooooddddd the dddddisco floor fell through ooooot 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -e '2,3s/a/ooooo/g;2,3s/d/ddddd/g' test.txt The honeysuckle band played all night long for only $90. It wooooos ooooon evening of splendddddiddddd music ooooonddddd compooooony. Too boooooddddd the dddddisco floor fell through ooooot 23:00. The local nurse Miss P.Neave was in attendance.
前两种方法比较简单,下面重点讲一下第三种方法
首先
创建sed脚本文件-->赋予执行权限-->执行文件
代码:
zhyfly@zhyfly:~/bash$ cat sub.sed #!/bin/sed -f 2,3s/a/ooooo/g 2,3s/d/ddddd/g zhyfly@zhyfly:~/bash$ sudo chmod +x sub.sed zhyfly@zhyfly:~/bash$ cat test.txt The honeysuckle band played all night long for only $90. It was an evening of splendid music and company. Too bad the disco floor fell through at 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ ./sub.sed test.txt The honeysuckle band played all night long for only $90. It wooooos ooooon evening of splendddddiddddd music ooooonddddd compooooony. Too boooooddddd the dddddisco floor fell through ooooot 23:00. The local nurse Miss P.Neave was in attendance. zhyfly@zhyfly:~/bash$ sed -f sub.sed test.txt The honeysuckle band played all night long for only $90. It wooooos ooooon evening of splendddddiddddd music ooooonddddd compooooony. Too boooooddddd the dddddisco floor fell through ooooot 23:00. The local nurse Miss P.Neave was in attendance.
阅读(1629) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~