Sed One-liner Explained-失败的阿明-ChinaUnix博客

失败的阿明cucm.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

失败的阿明

博客访问： 38162
博文数量： 22
博客积分： 285
博客等级：二等列兵
技术积分： 237
用户组：普通用户
注册时间： 2012-12-18 20:02

个人简介

人生最大的悲哀莫过于迷失自我而无法自拔！

文章分类

全部博文（22）

人生小札（2）
XML（1）
HTML（2）
python（5）
perl（2）
Bash（3）
Sed（4）
curl（1）
Awk（2）
未分配的博文（0）

文章存档

2013年（11）

2012年（11）

我的朋友

相关博文

Sed One-liner Explained

分类： LINUX

2012-12-21 19:04:37

Sed one-liner Explained

One：File spacing.

1. Double-space a file.

sed G filename：在每行后面添加一个空行

翻译：

G命令取出hold space中的内容然后追加到当前的pattern space内容之后，

由于没有使用h,H进行操作hold space，所以默认hold space中的内容是一个newline，

也就是在文件中每行的后面添加一个newline然后发送到output stream输出。

示例：

[root@web-db bash]# cat 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]# sed G 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

2. Double-space a file which already has blank lines in it.

Do it so that the output contains no more than one blank line between

two lines of text.

sed '/^$/d;G' filename：如果是空行删除，非空行的行后添加一个空行并输出。

如果是空行，那么会删除空行，然后进行下一行的处理，如果不是空行，那么会在当前行

后添加一个newline然后输出。

示例：

[root@web-db bash]# sed '/^$/d;G' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

3. Triple-space a file.

sed 'G;G' filename：每行后添加两个空行

示例：

[root@web-db bash]# sed 'G;G' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

4. Undo double-spacing.

sed 'n;d' filename：打印奇数行：

This one-liner assumes that even-numbered lines are always blank.

It uses two new commands - 'n' and 'd'.

The 'n' command prints out the current pattern space

(unless the '-n' flag has been specified),

empties the current pattern space and reads in the next line of input.

We assumed that even-numbered lines are always blank.

This means that 'n' prints the first, third, fifth, ..., etc. line and

reads in the following line.

The line following the printed line is always an empty line.

Now the 'd' command gets executed.

The 'd' command deletes the current pattern space,

reads in the next line,

puts the new line into the pattern space and aborts the current command,

and starts the execution from the first sed command.

Now the the 'n' commands gets executed again, then 'd', then 'n', etc.

示例：

[root@web-db bash]# sed 'n;d' 1.1

20081010 1123 xxx

20081012 0933 xyz

20081013 0717 def

[root@web-db bash]#

sed -n 'n;p' filename：打印偶数行

[root@web-db bash]# seq 12 | sed -n 'n;p'

5. Insert a blank line above every line that matches "regex".

sed '/regex/{x;p;x;}' filename：在匹配行之前插入一个空行

x：交换hold space和pattern space中的内容，相当于匹配行的内容换成了一个newline，

然后p命令输出了newline，之后又一次x交换，把之前的交换内容复原，由于sed没有使用

-n（取消默认输出），x之后也没有其他命令，所有会输出第二个x交换之后回到

pattern space的内容。

Also notice the { ... }. This is command grouping. It says, e

xecute all the commands in "..." on the line

that matches the restriction operation.

6. Insert a blank line below every line that matches "regex".

sed '/regex/G' filename：在匹配行之后插入一个空行

This one liner combines restriction operation with the 'G' command,

described in one-liner #1. For every line that matches /regex/,

sed appends a newline to pattern space. All the other lines that do not

match /regex/ just get printed out without modification.

示例：

[root@web-db bash]# sed '/xyz/G' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

7. Insert a blank line above and below every line that matches "regex".

sed '/regex/{x;p;x;G;}' fliename：匹配行前后添加一个空行

This one-liner combines one-liners #5, #6 and #1.

Lines matching /regex/ get a newline appended before them

and printed (x;p;x from #5). Then they are followed by another newline

from the 'G' command (one-liner #6 or #1).

示例：

[root@web-db bash]# sed '/xyz/{x;p;x;G;}' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

Two：Numbering.

1. Number each line of a file (named filename). Left align the number.

sed = filename | sed 'N;s/\n/\t/'：添加行号

首先使用=打印出每行的行号，然后使用N命令将行号与行内容合并到一行，但是行号与

行内容之间是以一个"\n"来分割的。

示例：

[root@web-db bash]# seq 3 4 | sed =

[root@web-db bash]#

[root@web-db bash]# seq 3 4 | sed = | sed 'N;s/\n/\t/'

1 3

2 4

[root@web-db bash]#

2. Number each line of a file (named filename). Right align the number.

sed = filename | sed 'N; s/^/ /; s/ *$.\{6,\}$\n/\1 /'

This one-liner is also actually two one-liners.

The first one liner numbers the lines, just like #1.

The second one-liner uses the 'N' command to join the line containing

the line number with the actual line.

Then it uses two substitute commands to right align the number.

The first 's' command 's/^/ /' appends 5 white-spaces to the beginning of line.

The second 's' command 's/ *$.\{6,\}$\n/\1 /' captures

at least six symbols up to a newline and replaces the capture and

newline with the back-reference '\1' and two more whitespace to separate

line number from the contents of line.

I think it's hard to understand the last part of this sed expression

by just reading. Let's look at an example.

For clearness I replaced the '\n' newline char with a '@' and

whitespace with '-'.

$ echo "-----12@contents" | sed 's/-*$.\{6,\}$@/\1--/'

----12--contents

The regular expression '-*$.\{6,\}$@' (or just '-*(.{6,})@')

tells sed to match some '-' characters followed by at least 6 other

characters, followed by a '@' symbol.

Sed captures them (remembers them) in \1.

In this example sed matches the first '-' (the '-*' part of regex),

then the following six characters "----12" and '@' (the '*$.\{6,\}$@'

part of regex). Now it replaces the matched part of the string "-----12@"

with the contents of captured group which is "----12"

plus two extra whitespace. The final result is that "-----12@"

gets replaced with "----12--".

3. Number each non-empty line of a file (called filename).

sed '/./=' filename | sed '/./N; s/\n/ /'

行号与行内容之间空一格空格，对于空行按原样输出。

示例：

[root@web-db bash]# cat 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]# sed '/./=' 1.1 | sed '/./N;s/\n/ /'

1 20081010 1123 xxx

2 20081011 1234 def

4 20081012 0933 xyz

5 20081013 0512 abc

6 20081013 0717 def

[root@web-db bash]#

4. Count the number of lines in a file (emulates "wc -l").

sed -n '$=' filename：计算文件行数

示例：

[root@web-db bash]# sed -n '$=' 1.1

[root@web-db bash]#

Three. Text Conversion and Substitution.文本转换和替换

5. Convert DOS/Windows newlines (CRLF) to Unix newlines (LF).

sed 's/.$//'

This one-one liner assumes that all lines end with

CR+LF (carriage return + line feed) and we are in a Unix environment.

Once the line gets read into pattern space,

the newline gets thrown away, so we are left with lines ending in CR.

The 's/.$//' command erases the last character by

matching the last character of the line (regex '.$') and

substituting it with nothing. Now when the pattern space gets output,

it gets appended the newline and we are left with lines ending with LF.

The assumption about being in a Unix environment is necessary

because the newline that gets appended when the pattern space gets

copied to output stream is the newline of that environment.

6. Another way to convert DOS/Windows newlines (CRLF) to Unix newlines (LF).

sed 's/^M$//'

This one-liner again assumes that we are in a Unix environment.

It erases the carriage return control character ^M.

You can usually enter the ^M control char literally by first

pressing Ctrl-V (it's control key + v key) and then Ctrl-M.

7. Yet another way to convert DOS/Windows newlines to Unix newlines.

sed 's/\x0D$//'

This one-liner assumes that we are on a Unix machine.

It also assumes that we use a version of sed that supports hex escape codes,

such as GNU sed. The hex value for CR is 0x0D (13 decimal).

This one-liner erases this character.

8. Convert Unix newlines (LF) to DOS/Windows newlines (CRLF).

sed "s/$/`echo -e \\\r`/"

This one-liner also assumes that we are in a Unix environment.

It calls shell for help. The 'echo -e \\\r' command

inserts a literal carriage return character in the sed expression.

The sed "s/$/char/" command appends a character

to the end of current pattern space.

9. Another way to convert Unix newlines (LF) to DOS/Windows newlines (CRLF).

sed 's/$/\r/'

This one-liner assumes that we use GNU sed.

GNU sed is smarter than other seds and can take escape characters

in the replace part of s/// command.

10. Convert Unix newlines (LF) to DOS/Windows newlines (CRLF) from DOS/Windows.

sed "s/$//"

This one-liner works from DOS/Windows.

It's basically a no-op one-liner.

It replaces nothing with nothing and then sends out the line to

output stream where it gets CRLF appended.

11. Another way to convert Unix newlines (LF) to DOS/Windows newlines (CRLF) from DOS/Windows.

sed -n p

This is also a no-op one-liner, just like #10.

The shortest one-liner which does the same is:

sed ''

12. Convert DOS/Windows newlines (LF) to Unix format (CRLF) from DOS/Windows.

sed "s/\r//"

Eric says that this one-liner works only with UnxUtils sed v4.0.7 or higher.

I don't know anything about this version of sed, so let's just trust him.

This one-liner strips carriage return (CR) chars from lines.

Then when they get output, CRLF gets appended by magic.

Eric mentions that the only way to convert LF to CRLF on a DOS machine

is to use tr:

tr -d \r outfile

13. Delete leading whitespace (tabs and spaces) from each line.

sed 's/^[ \t]*//' filename：删除开头的whitespaces

Pretty simple, it matches zero-or-more spaces and tabs

at the beginning of the line and replaces them with nothing,

i.e. erases them.

14. Delete trailing whitespace (tabs and spaces) from each line.

sed 's/[ \t]*$//' filename：删除行尾的whitespaces

This one-liner is very similar to #13.

It does the same substitution, just matching zero-or-more spaces and tabs

at the end of the line, and then erases them.

15. Delete both leading and trailing whitespace from each line.

sed 's/^[ \t]*//;s/[ \t]*$//' filename：删除开头和行尾的whitespaces

This one liner combines #13 and #14. First it does what #13 does,

erase the leading whitespace, and then it does the same as #14,

erase trailing whitespace.

16. Insert five blank spaces at the beginning of each line.

sed 's/^/ /' filename：每行的开头添加5个空格

It does it by matching the null-string at the beginning of line (^) and

replaces it with five spaces " ".

17. Align lines right on a 79-column width.

sed -e :a -e 's/^.\{1,78\}$/ &/;ta' fliename：行宽为79个字符右对齐

This one-liner uses a new command line option and two new commands.

The new command line option is '-e'.

It allows to write a sed program in several parts.

For example, a sed program with two substitution rules could

be written as "sed -e 's/one/two/' -e 's/three/four'" instead of

"sed 's/one/two/;s/three/four'". It makes it more readable.

In this one-liner the first "-e" creates a label called "a".

The ':' command followed by a name crates a named label.

The second "-e" uses a new command "t".

The "t" command branches to a named label

if the last substitute command modified pattern space.

This branching technique can be used to create loops in sed.

In this one-liner the substitute command

left-pads the string (right aligns it) a single whitespace at a time,

until the total length of the string exceeds 78 chars.

The "&" in substitution command means the matched string.

示例：

[root@web-db bash]# sed -e :a -e 's/^.\{1,78\}$/ &/;ta' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

Translating it in modern language, it would look like this:

while (str.length() <= 78) {

str = " " + str

}

18. Center all text in the middle of 79-column width. 行宽79个字符，居中

sed -e :a -e 's/^.\{1,77\}$/ & /;ta'

This one-liner is very similar to #17,

but instead of left padding the line one whitespace character at a time

it pads it on both sides until it has reached length of at least 77 chars.

Then another two whitespaces get added at the last iteration and

it has grown to 79 chars.

示例：

[root@web-db bash]# sed -e :a -e 's/^.\{1,77\}$/ & /;ta' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

Another way to do the same is

sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/$ *$\1/\1/'

This one-liner left pads the string one whitespace char

at a time until it has reached length of 78 characters.

Then the additional "s/$ *$\1/\1/" command gets executed

which divides the leading whitespace "in half".

This effectively centers the string.

Unlike the previous one-liner this one-liner does not add trailing whitespace.

It just adds enough leading whitespace to center the string.

示例：

[root@web-db bash]# sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/$ *$\1/\1/' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

19. Substitute (find and replace) the first occurrence of "foo" with "bar" on

each line.

sed 's/foo/bar/'

This is the simplest sed one-liner possible.

It uses the substitute command and applies it once on each line.

It substitutes string "foo" with "bar".

示例：

[root@web-db bash]# cat foo

foo hello world foo

hello foo world yes foo

hello world foo

[root@web-db bash]# sed 's/foo/bar/' foo

bar hello world foo

hello bar world yes foo

hello world bar

[root@web-db bash]#

20. Substitute (find and replace) the fourth occurrence of "foo" with "bar"

on each line.

sed 's/foo/bar/4'

This one-liner uses a flag for the substitute command.

With no flags the first occurrence of pattern is changed.

With a numeric flag like "/1", "/2", etc.

only that occurrence is substituted.

This one-liner uses numeric flag "/4" which makes it change fourth occurrence

on each line.

示例：

[root@web-db bash]# sed 's/foo/bar/2' foo

foo hello world bar

hello foo world yes bar

hello world foo

[root@web-db bash]#

21. Substitute (find and replace) all occurrence of "foo" with "bar" on each line.

sed 's/foo/bar/g'

This one-liner uses another flag. The "/g" flag which stands for global.

With global flag set,

substitute command does as many substitutions as possible, i.e., all.

示例：

[root@web-db bash]# sed 's/foo/bar/g' foo

bar hello world bar

hello bar world yes bar

hello world bar

[root@web-db bash]#

22. Substitute (find and replace) the first occurrence of a repeated occurrence of "foo" with "bar".

sed 's/$.*$foo$.*foo$/\1bar\2/'

Let's understand this one-liner with an example:

示例：

[root@web-db bash]# echo "this is foo and another foo quux" | sed 's/$.*$foo$.*foo$/\1bar\2/'

this is bar and another foo quux

[root@web-db bash]#

As you can see, this one liner replaced the first "foo" with "bar".

It did it by using two capturing groups.

The first capturing group caught everything before the first "foo".

In this example it was text "this is ".

The second group caught everything after the first "foo",

including the second "foo". In this example " and another foo".

The matched text was then replaced with contents of first group "this is "

followed by "bar" and contents of second group " and another foo".

Since " quux" was not part of the match it was left unchanged.

Joining these parts the resulting string is "this is bar and another foo quux",

which is exactly what we got from running the one-liner.

23. Substitute (find and replace) only the last occurrence of "foo" with "bar".

sed 's/$.*$foo/\1bar/'

This one-liner uses a capturing group that captures everything up to "foo".

It replaces the captured group and "foo" with captured group itself

(the \1 back-reference) and "bar".

It results in the last occurrence of "foo" getting replaced with "bar".

示例：

[root@web-db bash]# echo "this is foo" | sed 's/$.*$foo/\1bar/'

this is bar

[root@web-db bash]#

24. Substitute all occurrences of "foo" with "bar" on all lines that contain "baz".

sed '/baz/s/foo/bar/g'

This one-liner uses a regular expression to restrict the substitution to

lines matching "baz". The lines that do not match "baz" get simply

printed out, but those that do match "baz" get the substitution applied.

示例：

[root@web-db bash]# sed '/baz/s/foo/bar/g' foo

foo hello world foo

baz hello bar world yes bar

hello world foo

[root@web-db bash]#

25. Substitute all occurrences of "foo" with "bar" on all lines that DO NOT contain "baz".

sed '/baz/!s/foo/bar/g'

Sed commands can be inverted and applied on lines that DO NOT match a certain

pattern. The exclamation "!" before a sed commands does it. In this one-liner

the substitution command is applied to the lines that DO NOT match "baz".

示例：

[root@web-db bash]# sed '/baz/!s/foo/bar/g' foo

bar hello world bar

baz hello foo world yes foo

hello world bar

[root@web-db bash]#

26. Change text "scarlet", "ruby" or "puce" to "red".

sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'

This one-liner just uses three consecutive substitution commands.

The first replaces "scarlet" with "red",

the second replaced "ruby" with "red" and the last one replaces "puce"

with "red".

示例：

[root@web-db bash]# cat foo

foo hello world foo

baz hello foo world yes foo

hello world foo

[root@web-db bash]# sed 's/foo/Hello/g;s/world/Hello/g;s/hello/Hello/g' foo

Hello Hello Hello Hello

baz Hello Hello Hello yes Hello

Hello Hello Hello

[root@web-db bash]#

If you are using GNU sed, then you can do it simpler:

sed -r 's/scarlet|ruby|puce/red/g'使用sed的扩展正则

示例：

[root@web-db bash]# sed -r 's/foo|hello|world/Hello/g' foo

Hello Hello Hello Hello

baz Hello Hello Hello yes Hello

Hello Hello Hello

[root@web-db bash]#

GNU sed provides more advanced regular expressions which support alternation. This one-liner uses alternation and the substitute command reads "replace 'scarlet' OR 'ruby' OR 'puce' with 'red'".

27. Reverse order of lines (emulate "tac" Unix command).

sed '1!G;h;$!d'

This one-liner acts as the "tac" Unix utility.

It's tricky to explain. The easiest way to explain it is by using an example.

Let's use a file with just 3 lines:

[root@web-db bash]# echo -e "1\n2\n3" | sed '1!G;h;$!d'

[root@web-db bash]#

The first one-liner's command "1!G" gets applied to all the lines

which are not the first line.

The second command "h" gets applied to all lines.

The third command "$!d" gets applied to all lines except the last one.

Let's go through the execution line by line.

Line 1: Only the "h" command gets applied for the first line "1".

It copies this line to hold buffer. Hold buffer now contains "foo".

Nothing gets output as the "d" command gets applied.

Line 2: The "G" command gets applied. It appends the contents of hold buffer to pattern space.

The pattern space now contains. "2\n1". The "h" command gets applied, it copies "2\n1" to hold buffer.

It now contains "2\n1". Nothing gets output.

Line 3: The "G" command gets applied. It appends hold buffer to the third line.

The pattern space now contains "3\n2\n1". As this was the last line,

"d" does not get applied and the contents of pattern space gets printed.

It's "3\n2\n1". File got reversed.

If we had had more lines, they would have simply get appended to hold buffer

in reverse order.

Here is another way to do the same:

sed -n '1!G;h;$p'

It silences the output with "-n" switch and forces the output with "p"

command only at the last line.

示例：

[root@web-db bash]# echo -e "1\n2\n3" | sed -n '1!G;h;$p'

[root@web-db bash]#

These two one-liners actually use a lot of memory because they keep the whole

file in hold buffer in reverse order before printing it out.

Avoid these one-liners for large files.

28. Reverse a line (emulates "rev" Unix command).

sed '/\n/!G;s/$.$$.*\n$/&\2\1/;//D;s/.//'

示例：

[root@web-db bash]# sed '/\n/!G;s/$.$$.*\n$/&\2\1/;//D;s/.//' 1.1

xxx 3211 01018002

fed 4321 11018002

zyx 3390 21018002

cba 2150 31018002

fed 7170 31018002

[root@web-db bash]#

This is a very complicated one-liner.

I had trouble understanding it the first time I saw it and

ended up asking on comp.unix.shell for help.

Let's re-format this sed one-liner:

sed '

/\n/ !G

s/$.$$.*\n$/&\2\1/

//D

s/.//

The first line "/\n/ !G" appends a newline to the end of the pattern space if there was none.

The second line "s/$.$$.*\n$/&\2\1/" is a simple s/// expression

which groups the first character as \1 and all the others as \2.

Then it replaces the whole matched string with "&\2\1",

where "&" is the whole matched text ("\1\2"). For example,

if the input string is "1234" then after the s/// expression,

it becomes "1234\n234\n1".

The third line is "//D". This statement is the key in this one-liner.

An empty pattern // matches the last existing regex,

so it's exactly the same as: /$.$$.*\n$/D.

The "D" command deletes from the start of the input till the first newline

and then resumes editing with first command in script.

It creates a loop. As long as /$.$$.*\n$/ is satisfied,

sed will resume all previous operations. After several loops,

the text in the pattern space becomes "\n4321".

Then /$.$$.*\n$/ fails and sed goes to the next command.

The fourth line "s/.//" removes the first character in the pattern space

which is the newline char. The contents in pattern space becomes "4321" -- reverse of "1234".

There you have it, a line has been reversed.

29. Join pairs of lines side-by-side (emulates "paste" Unix command).

sed '$!N;s/\n/ /'

This one-liner joins two consecutive lines with the "N" command.

They get joined with a "\n" character between them.

The substitute command replaces this newline with a space,

thus joining every pair of lines with a whitespace.

示例：

[root@web-db bash]# echo -e "1\n2\n3" | sed '$!N;s/\n/ /'

1 2

[root@web-db bash]#

30. Append a line to the next if it ends with a backslash "\".

sed -e :a -e '/\\$/N; s/\\\n//; ta'

The first expression ':a' creates a named label "a".

The second expression looks to see if the current line ends with a backslash "\".

If it does, it joins it with the line following it using the "N" command.

Then the slash and the newline between joined lines get erased with "s/\\\n//" command.

If the substitution was successful we branch to the beginning of

expression and do the same again, in hope that we might have another backslash.

If the substitution was not successful, the line did not end with a backslash and we print it out.

Here is an example of running this one-liner:

[root@web-db bash]# cat 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz \

20081013 0512 abc

20081013 0717 def

[root@web-db bash]# sed -e :a -e '/\\$/N;s/\\\n//;ta' 1.1

20081010 1123 xxx

20081011 1234 def

20081012 0933 xyz 20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

Lines one and two got joined because the first line ended with backslash.

31. Append a line to the previous if it starts with an equal sign "=".

sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'

This one-liner also starts with creating a named label "a".

Then it tests to see if it is not the last line and appends

the next line to the current one with "N" command.

If the just appended line starts with a "=",

one-liner branches the label "a" to see if there are more lines starting with "=".

During this process a substitution gets executed which throws away

the newline character which came from joining with "N" and the "=".

If the substitution fails, one-liner prints out the pattern space up

to the newline character with the "P" command,

and deletes the contents of pattern space up to

the newline character with "D" command, and repeats the process.

Here is an example of running it:

[root@web-db bash]# cat 1.1

20081010 1123 xxx

20081011 1234 def

=20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]# sed -e :a -e 'N;s/\n=/ /;ta' -e 'P;D' 1.1

20081010 1123 xxx

20081011 1234 def 20081012 0933 xyz

20081013 0512 abc

20081013 0717 def

[root@web-db bash]#

Lines one, two and three got joined,

because lines two and three started with '='. Line four got printed as-is.

32. Digit group (commify) a numeric string.使用逗号隔开数字（千分位）

sed -e :a -e 's/$.*[0-9]$$[0-9]\{3\}$/\1,\2/;ta'

This one-liner turns a string of digits, such as "1234567" to "1,234,567".

This is called commifying or digit grouping.

First the one-liner creates a named label "a".

Then it captures two groups of digits.

The first group is all the digits up to last three digits.

The last three digits gets captures in the 2nd group.

Then the two matching groups get separated by a comma.

Then the same rules get applied to the line again and again

until all the numbers have been grouped in groups of three.

Substitution command "\1,\2" separates contents of group one with a comma from the contents of group two.

Here is an example to understand the grouping happening here better.

Suppose you have a numeric string "1234567".

The first group captures all the numbers until the last three "1234".

The second group captures last three numbers "567".

They get joined by a comma. Now the string is "1234,567".

The same stuff is applied to the string again.

Number "1" gets captured in the first group and the numbers "234" in the second.

The number string is "1,234,567".

Trying to apply the same rules again fail

because there is just one digit at the beginning of string,

so the string gets printed out and sed moves on to the next line.

示例：

[root@web-db bash]# echo 1234567890 | sed -e :a -e 's/$.*[0-9]$$[0-9]\{3\}$/\1,\2/;ta'

1,234,567,890

[root@web-db bash]#

If you have GNU sed, you can use a simpler one-liner:

gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta'

This one-liner starts with creating a named label "a" and

then loops over the string the same way as the previous one-liner did.

The only difference is how groups of three digits get matched.

GNU sed has some additional patterns.

There are two patterns that make this one-liner work.

The first is "\B", which matches anywhere except at a word boundary.

It's needed so we did not go beyond word boundary. Look at this example:

$ echo "12345 1234 123" | sed 's/[0-9]\{3\}\>/,&/g'

12,345 1,234 ,123

It's clearly wrong. The last 123 got a comma added.

Adding the "\B" makes sure we match the numbers only at word boundary:

$ echo "12345 1234 123" | sed 's/\B[0-9]\{3\}\>/,&/g'

12,345 1,234 123

The second is "\>". It matches the null string at the end of a word.

It's necessary because we need to to match the right-most three digits.

If we did not have it, the expression would match after the first digit.

33. Add commas to numbers with decimal points and minus signs.千分位（考虑到小数）

sed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'

示例：

[root@web-db bash]# echo 1234567890.123789| sed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'

1,234,567,890.123789

[root@web-db bash]#

This one-liner works in GNU sed only.

It turns on extended regular expression support with the "-r" switch.

Then it loops over a line matching three groups and

separates the first two from the third with a comma.

The first group makes sure we ignore a leading non-digit character,

such as + or -. If there is no leading non-digit character,

then it just anchors at the beginning of the string which always matches.

The second group matches a bunch of numbers.

The third group makes sure the second group does not match too many.

It matches 3 consecutive numbers at the end of the string.

Once the groups have been captured, the "\1\2,\3" substitution is done

and the expression is looped again,

until the whole string has been commified.

43. Add a blank line after every five lines.

sed 'n;n;n;n;G;'

The "n" command is called four times in this one-liner.

Each time it's called it prints out the current pattern space,

empties it and reads in the next line of input. After calling it four times,

the fifth line is read into the pattern space and then the "G" command

gets called. The "G" command appends a newline to the fifth line.

Then the next round of four "n" commands is done.

Next time the first "n" command is called it prints out the newlined fifth

line, thus inserting a blank line after every 5 lines.

示例：

[root@web-db bash]# seq 10 | sed 'n;n;n;n;G;'

[root@web-db bash]#

The same can be achieved with GNU sed's step extension:

gsed '0~5G'

GNU sed's step extensions can be generalized as "first~step".

It matches every "step"'th line starting with line "first".

In this one-liner it matches every 5th line starting with line 0.

示例：

[root@web-db bash]# seq 10 | sed '0~5G'

[root@web-db bash]#

另附：Awk的方法：

[root@web-db bash]# seq 10 | awk '1;!(NR%5){print ""}'

[root@web-db bash]#

阅读(506) | 评论(0) | 转发(0) |

上一篇：Sed Advanced Commands and Label 小笔头

下一篇：BASH RANDOM产生随机数

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6