Before the main text, here is a recommended post that introduces awk arrays:
grep 1083628889 XXYY.TamServer_updateVipAmount_20121227.log | tr -d '][' | gawk 'BEGIN{ FS="|" }{ match($10, / money=[0-9]+/, a); match($10, / vip=[0-9]+/, b); print $4,a[0],b[0] }'
match(s, r [, a])
Returns the position in s where the regular expression r occurs, or 0 if r is not present, and sets the values of RSTART and RLENGTH. Note that the argument order is the same as for the ~ operator: str ~ re. If array a is provided, a is cleared and then elements 1 through n are filled with the portions of s that match the corresponding parenthesized subexpression in r. The 0'th element of a contains the portion of s matched by the entire regular expression r. Subscripts a[n, "start"], and a[n, "length"] provide the starting index in the string and length respectively, of each matching substring.
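There are two ways to use it:
1. Basic usage
match(string, regex)
The built-in variable RSTART holds the position where the match starts, and RLENGTH holds the length of the match.
If a match is found, the function returns the starting position of the match; otherwise it returns 0 (and RLENGTH is set to -1).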
$ awk 'BEGIN{start=match("Abc Ef Kig",/ [A-Z][a-z]+ /);print RSTART,RLENGTH}'
4 4
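Since RSTART and RLENGTH describe where the match sits, a common follow-up is to cut the matched text out with substr(). The following is a minimal sketch that uses only POSIX awk features; the function name first_match is made up for illustration. It also shows the no-match case, where match() returns 0 and RLENGTH is -1.

```shell
first_match() {
    # $1: input string; prints position, length and the matched text
    printf '%s\n' "$1" | awk '{
        if (match($0, / [A-Z][a-z]+ /))
            print RSTART, RLENGTH, "[" substr($0, RSTART, RLENGTH) "]"
        else
            print RSTART, RLENGTH, "no match"
    }'
}

first_match "Abc Ef Kig"   # prints: 4 4 [ Ef ]
first_match "abc"          # prints: 0 -1 no match
```

The brackets in the output only make the leading and trailing spaces of the match visible.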
2. Filling an array (the three-argument form, a gawk extension): the array receives the whole match in element 0 and the parenthesized subexpressions in elements 1 through n, and a[n, "start"] and a[n, "length"] give the starting index and length of each matching substring.
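echo "foooobazbarrrrr" | gawk '{ match($0, /(fo+).+(bar*)/, arr) # matched parts are stored in arr automatically; groups start at subscript 1
print arr[1], arr[2]
print arr[1, "start"], arr[1, "length"] # arr[n, "start"]: start index of group n (arr[0, "start"] equals RSTART)
print arr[2, "start"], arr[2, "length"] # arr[n, "length"]: length of group n (arr[0, "length"] equals RLENGTH)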
}'
foooo barrrrr
1 5
9 7