II. Conditionals, Loops, and Arrays
1. Conditionals
if ( expression )
    action1
[else
    action2]
action1 runs when expression is non-empty or true (non-zero); otherwise action2 runs.
eg: if ( x ) print x
if ( x == y ) print x
if ( x ~ /[yY](es)?/ ) print x
if ( avg >= 65 )
    grade = "Pass"
else
    grade = "Fail"
if (avg >= 90) grade = "A"
else if (avg >= 80) grade = "B"
else if (avg >= 70) grade = "C"
else if (avg >= 60) grade = "D"
else grade = "F"
expr ? action1 : action2
eg: grade = (avg >= 65) ? "Pass" : "Fail"
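The if/else chain and the conditional operator above can be tried from the shell. This is only a sketch: the wrapper name grade and the use of -v avg=... to pass the value in are choices made for this illustration.

```shell
# grade: hypothetical helper that maps a numeric average to a letter grade
# (if / else if chain) and a Pass/Fail flag (?: operator).
grade() {
    awk -v avg="$1" 'BEGIN {
        if (avg >= 90) letter = "A"
        else if (avg >= 80) letter = "B"
        else if (avg >= 70) letter = "C"
        else if (avg >= 60) letter = "D"
        else letter = "F"
        result = (avg >= 65) ? "Pass" : "Fail"
        print letter, result
    }'
}

grade 85    # prints: B Pass
grade 62    # prints: D Fail
```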
2. Loops
1) while loop
i = 1
while ( i <= 4 ) {
    print $i
    ++i
}
2) do loop
do {
    ++x
    print x
} while ( x <= 4 )
3) for loop
total = 0
for (i = 2; i <= NF; ++i)
total += $i
avg = total / (NF - 1)
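As a rough, runnable version of the for loop above: the sketch below assumes input lines of the form "name score score ..." and skips lines that have no scores, so that the division by NF - 1 is always defined. The wrapper name average is made up for this example.

```shell
# average: hypothetical helper; for each input line, sums fields 2..NF
# and prints the name (field 1) followed by the average of the scores.
average() {
    awk 'NF > 1 {
        total = 0
        for (i = 2; i <= NF; ++i)
            total += $i
        avg = total / (NF - 1)
        print $1, avg
    }'
}

printf 'john 85 92 78\nandrea 89 90 91\n' | average
# prints:
# john 85
# andrea 90
```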
3. Other statements that affect flow control
1) break, continue
2) next
eg: NF != 4 {
    printf("line %d skipped: doesn't have 4 fields\n", FNR) > "/dev/stderr"
    next
}
3) exit
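The exit statement makes awk stop processing the current record and stop reading input; any remaining lines are ignored. If exit is given an argument, that value becomes awk's exit status.
eg: awk '{
    exit 5    ## exit status is 5
}'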
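A small sketch of exit in action. The wrapper name stop_at and the "STOP" marker are invented for the example. One detail worth seeing is that an END rule still runs after exit is called from a main rule, and the exit status is kept.

```shell
# stop_at: hypothetical helper; prints input lines until one equals "STOP",
# then exits with status 5. Lines after "STOP" are never read.
stop_at() {
    awk '$0 == "STOP" { exit 5 }
         { print }
         END { print "done" }'
}

# The right-hand side of || runs because awk returns a non-zero status.
printf 'a\nb\nSTOP\nc\n' | stop_at || echo "exit status: $?"
# prints:
# a
# b
# done
# exit status: 5
```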
4. Arrays
1) Assignment form: array[index] = value
eg: flavor[1] = "cherry"
eg: flavor_count = 5
for (x = 1; x <= flavor_count; ++x)
print flavor[x]
2) Associative arrays
eg: array[$1] = $2
Special loop syntax for arrays: for ( variable in array )
    do something with array[variable]
eg: for ( item in acro )
    print item, acro[item]
Syntax: item in array    ## returns 1 if array[item] exists, otherwise 0
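To make the associative-array idea and the in test concrete, here is a minimal word-count sketch. The wrapper name count_words and the variable want are assumptions made for this illustration.

```shell
# count_words: hypothetical helper; counts how often each word occurs on
# standard input, then reports the count for the word passed as $1.
count_words() {
    awk -v want="$1" '{
        for (i = 1; i <= NF; ++i)
            count[$i]++            # the word itself is the array index
    }
    END {
        if (want in count)         # membership test; does not create an element
            print want, count[want]
        else
            print want, "not found"
    }'
}

printf 'apple pear apple\nplum apple\n' | count_words apple   # prints: apple 3
printf 'apple pear apple\n' | count_words kiwi                # prints: kiwi not found
```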
5. A Glossary Lookup Script
1) Example: the program file lookup contains:
awk '# lookup -- reads local glossary file and prompts user for query
BEGIN { FS = "\t"; OFS = "\t"
    # prompt user
    printf("Enter a glossary term: ")
}
#1 read local file named glossary
FILENAME == "glossary" {
    # load each glossary entry into an array
    entry[$1] = $2
    next
}
#2 scan for command to exit program
$0 ~ /^(quit|[qQ]|exit|[Xx])$/ { exit }
#3 process any non-empty line
$0 != "" {
    if ( $0 in entry ) {
        # it is there, print definition
        print entry[$0]
    } else
        print $0 " not found"
}
#4 prompt user again for another term
{
    printf("Enter another glossary term (q to quit): ")
}' glossary -
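Explanation: Before reading any input, awk runs the statements in BEGIN and prints the prompt. It then reads the lines of the first file, glossary. While that file is being read, FILENAME is "glossary", and because of the next in rule 1, none of the later rules run for those lines. When the file is exhausted, awk starts reading the second file, "-" (that is, standard input), and FILENAME is now "-", so each line is tested starting from rule 2. Note that if you just press Enter at the prompt, rule 3 does not match and awk simply prompts again; this repeats until end-of-file or one of the exit commands is entered.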
$ cat glossary
BASIC Beginner's All-Purpose Symbolic Instruction Code
CICS Customer Information Control System
COBOL Common Business Oriented Language
DBMS Data Base Management System
GIGO Garbage In, Garbage Out
GIRL Generalized Information Retrieval Language
$ ./lookup
Enter a glossary term: GIGO
Garbage In, Garbage Out
Enter another glossary term (q to quit): BASIC
Beginner's All-Purpose Symbolic Instruction Code
Enter another glossary term (q to quit): q
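2) Creating an array with split()
Syntax: n = split(string, array, separator)
split() is a built-in awk function. string is the string to parse, array is the array that receives the pieces of string, separator is the delimiter, and n is the number of elements placed in array.
eg: z = split($1, fullname, " ")
Then fullname[1] holds the first name and fullname[z] holds the last name.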
eg: z = split($1, array, " ")
for (i = 1; i <= z; ++i)
print i, array[i]
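The same split() loop can be run on a fixed string. In this sketch the wrapper name split_list and the colon separator are arbitrary choices for the example.

```shell
# split_list: hypothetical helper; splits a colon-separated string and
# prints each piece together with its array index (indices start at 1).
split_list() {
    awk -v list="$1" 'BEGIN {
        n = split(list, part, ":")
        for (i = 1; i <= n; ++i)
            print i, part[i]
    }'
}

split_list "/bin:/usr/bin:/usr/local/bin"
# prints:
# 1 /bin
# 2 /usr/bin
# 3 /usr/local/bin
```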
3) Deleting an element from an array
delete array[index]
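A quick way to see the effect of delete is to test membership with in afterwards. The sketch below reuses the flavor array from earlier; the second element and the wrapper name delete_demo are made up for the example.

```shell
# delete_demo: hypothetical helper; removes one element and then checks
# which indices are still present in the array.
delete_demo() {
    awk 'BEGIN {
        flavor[1] = "cherry"
        flavor[2] = "lime"
        delete flavor[1]
        first  = (1 in flavor) ? "present" : "deleted"
        second = (2 in flavor) ? "present" : "deleted"
        print "flavor[1]", first
        print "flavor[2]", second
    }'
}

delete_demo
# prints:
# flavor[1] deleted
# flavor[2] present
```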
4) Making Conversions
Script file: $ cat romanum
echo $1 |
awk '# romanum -- convert number 1-10 to roman numeral
# define numerals as list of roman numerals 1-10
BEGIN {
    # create array named numerals from list of roman numerals
    split("I,II,III,IV,V,VI,VII,VIII,IX,X", numerals, ",")
}
# look for number between 1 and 10
$1 > 0 && $1 <= 10 {
    # print specified element
    print numerals[$1]
    exit
}
{ print "invalid number"
  exit
}'
Run: $ romanum 4
IV
awk '
# date-month -- convert mm/dd/yy or mm-dd-yy to month day, year
# build list of months and put in array.
BEGIN {
    # the 3-step assignment is done for printing in book
    listmonths = "January,February,March,April,May,June,"
    listmonths = listmonths "July,August,September,"
    listmonths = listmonths "October,November,December"
    split(listmonths, month, ",")
}
# check that there is input
$1 != "" {
    # split on "/" the first input field into elements of array
    sizeOfArray = split($1, date, "/")
    # check that only one field is returned
    if (sizeOfArray == 1)
        # try to split on "-"
        sizeOfArray = split($1, date, "-")
    # must be invalid
    if (sizeOfArray == 1)
        exit
    # add 0 to number of month to coerce numeric type
    date[1] += 0
    # print month day, year
    print month[date[1]], (date[2] ", 19" date[3])
}'
Run: $ echo "5/11/55" | date-month
May 11, 1955
III. Functions
1. Built-in functions
1) Arithmetic Functions
cos(x) Returns cosine of x (x is in radians).
exp(x) Returns e to the power x.
int(x) Returns truncated value of x.
log(x) Returns natural logarithm (base-e) of x.
sin(x) Returns sine of x (x is in radians).
sqrt(x) Returns square root of x.
atan2(y,x) Returns arctangent of y/x in the range -[pi] to [pi].
rand() Returns pseudo-random number r, where 0 <= r < 1.
srand(x) Establishes new seed for rand(). If no seed is specified, uses time of day. Returns the old seed.
2) Truncating to an integer: int()
eg: print (100/3) ##33.3333
print int(100/3) ##33
3) Random Number Generation
The rand() function generates a pseudo-random floating-point number between 0 and 1. The srand() function sets the seed or starting point for random number generation. If srand() is called without an argument, it uses the time of day to generate the seed. With an argument x, srand() uses x as the seed.
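A common use of rand() together with int() is to produce random integers in a range. The sketch below simulates a six-sided die; the wrapper name roll and its two arguments (seed and count) are assumptions for this example. Seeding explicitly means the same seed repeats the same sequence within one awk implementation, though the actual numbers differ between implementations.

```shell
# roll: hypothetical helper; prints n pseudo-random integers from 1 to 6.
#   $1 = seed passed to srand(), $2 = how many numbers to print
roll() {
    awk -v seed="$1" -v n="$2" 'BEGIN {
        srand(seed)
        for (i = 1; i <= n; ++i)
            print int(rand() * 6) + 1   # rand() is in [0,1), so result is 1..6
    }'
}

roll 42 3
```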
2. String functions
gsub(r,s,t) Globally substitutes s for each match of the regular expression r in the string t. Returns the number of substitutions. If t is not supplied, defaults to $0.
index(s,t) Returns position of substring t in string s or zero if not present.
length(s) Returns length of string s or length of $0 if no string is supplied.
match(s,r) Returns either the position in s where the regular expression r begins, or 0 if no occurrences are found. Sets the values of RSTART and RLENGTH.
split(s,a,sep) Parses string s into elements of array a using field separator sep; returns number of elements. If sep is not supplied, FS is used. Array splitting works the same way as field splitting.
sprintf("fmt",expr) Returns expr formatted as a string according to the printf format specification fmt.
sub(r,s,t) Substitutes s for first match of the regular expression r in the string t. Returns 1 if successful; 0 otherwise. If t is not supplied, defaults to $0.
substr(s,p,n) Returns substring of string s at beginning position p up to a maximum length of n. If n is not supplied, the rest of the string from p is used.
tolower(s) Translates all uppercase characters in string s to lowercase and returns the new string.
toupper(s) Translates all lowercase characters in string s to uppercase and returns the new string.
The split() function was introduced in the previous chapter in the discussion on arrays.
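Several of the string functions in the table can be exercised on one sample string. This is just an illustrative sketch; the sample text and the wrapper name string_demo are arbitrary.

```shell
# string_demo: hypothetical helper; applies length, index, substr, toupper,
# gsub and match to a fixed string and prints each result.
string_demo() {
    awk 'BEGIN {
        s = "Garbage In, Garbage Out"
        print length(s)                  # number of characters in s
        print index(s, "In")             # position of the substring "In"
        print substr(s, 1, 7)            # first seven characters
        print toupper(substr(s, 9, 2))   # two characters from position 9, uppercased
        n = gsub(/Garbage/, "Data", s)   # replace every match in s; n = count
        print n, s
        if (match(s, /O[a-z]+/))         # sets RSTART and RLENGTH on success
            print RSTART, RLENGTH
    }'
}

string_demo
# prints:
# 23
# 9
# Garbage
# IN
# 2 Data In, Data Out
# 15 3
```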
3. Writing your own functions
1) Form: function name (parameter-list) {
       statements
   }
eg: function insert(STRING, POS, INS) {
    before_tmp = substr(STRING, 1, POS)
    after_tmp = substr(STRING, POS + 1)
    return before_tmp INS after_tmp
}
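The insert function can be tried directly. The variant sketched here differs from the version above in one respect, which is an addition of this example: the temporaries are declared as extra parameters (before, after), the usual awk idiom for local variables, so they do not leak into the global namespace. The wrapper name insert_demo is also made up.

```shell
# insert_demo: hypothetical helper; inserts "XX" after the 4th character
# of the first field of each input line.
insert_demo() {
    awk 'function insert(str, pos, ins,    before, after) {
        # before and after are extra parameters acting as local variables
        before = substr(str, 1, pos)
        after  = substr(str, pos + 1)
        return before ins after
    }
    { print insert($1, 4, "XX") }'
}

echo "Hello" | insert_demo    # prints: HellXXo
```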
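2) Maintaining a Function Library
Useful functions can be kept in a dedicated directory and used as a function library. The -f option may be given more than once to load several program files, for example: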
$ awk -f grade.awk -f /usr/local/share/awk/sort.awk grades.test
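A self-contained sketch of the multiple -f idea follows. The file names lib.awk and main.awk, the function max(), and the wrapper name larger are all invented for the illustration; the files are written to a temporary directory so the example does not depend on any existing library.

```shell
# larger: hypothetical demo; writes a one-function "library" and a main
# program to a temporary directory, then loads both with two -f options.
larger() {
    dir=$(mktemp -d) || return 1

    # lib.awk: holds the reusable function
    cat > "$dir/lib.awk" <<'EOF'
function max(a, b) {
    return (a > b) ? a : b
}
EOF

    # main.awk: the program proper, which calls max() from lib.awk
    cat > "$dir/main.awk" <<'EOF'
{ print max($1, $2) }
EOF

    awk -f "$dir/lib.awk" -f "$dir/main.awk"
    rm -r "$dir"
}

echo "3 7" | larger    # prints: 7
```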
IV. Advanced Topics
1. The getline function
getline returns:
1 If it was able to read a line.
0 If it encounters the end-of-file.
-1 If it encounters an error.
# getline.awk -- test getline function
/^\.SH "?Name"?/ {
    getline     # get next line
    print $1    # print $1 of new line.
}
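Example: getline < "data" reads one line from the file data. It is usually written as a loop:
while ( (getline < "data") > 0 )
    print
BEGIN { printf "Enter your name: "
    getline < "-"    # "-" means standard input
    print
}
getline input    ## assigns the line read directly to the variable input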
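The three return values of getline can be checked in one loop. The following is a minimal sketch, assuming a wrapper named count_lines that receives the file name as its argument; it counts lines while getline returns 1, stops at 0 (end of file), and reports an error on -1.

```shell
# count_lines: hypothetical helper; counts the lines of the file named in $1
# using a getline loop, and fails if the file cannot be read.
count_lines() {
    awk -v file="$1" 'BEGIN {
        n = 0
        while ((status = (getline line < file)) > 0)
            ++n
        if (status < 0) {                     # -1: file missing or unreadable
            print "cannot read " file > "/dev/stderr"
            exit 1
        }
        close(file)
        print n
    }'
}

tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
count_lines "$tmp"    # prints: 3
rm -f "$tmp"
```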
Example 1: awk '{"who am i" | getline me; print me}' -
Example 2:
awk '# getname - print users fullname from /etc/passwd
BEGIN { "who am i" | getline
    name = $1
    FS = ":"
}
$1 == name { print $5 }
' /etc/passwd
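Because the output of "who am i" depends on the system, the same command | getline pattern is easier to try with a predictable command. In this sketch the wrapper name first_word and the variable names are assumptions; the command string is passed in so that any command can be substituted.

```shell
# first_word: hypothetical helper; runs the command given in $1, reads its
# first line of output with getline, and prints the first word of that line.
first_word() {
    awk -v cmd="$1" 'BEGIN {
        if ((cmd | getline line) > 0) {   # run cmd and read one output line
            split(line, word, " ")
            print word[1]
        }
        close(cmd)                        # done with the pipe
    }'
}

first_word "echo hello world"    # prints: hello
```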
{ some processing of $0 | "sort > tmpfile" }
END {
    close("sort > tmpfile")
    while ((getline < "tmpfile") > 0) {
        do more work
    }
}
# getFilename function -- prompts user for filename,
# verifies that file exists and returns absolute pathname.
function getFilename(file) {
    while (! file) {
        printf "Enter a filename: "
        getline < "-"    # get response
        file = $0
        # check that file exists and is readable
        # test returns 1 if file does not exist.
        if (system("test -r " file)) {
            print file " not found"
            file = ""
        }
    }
    if (file !~ /^\//) {
        "pwd" | getline    # get current directory
        close("pwd")
        file = $0 "/" file
    }
    return file
}
4. Directing output to files and pipes
eg: print > "data.out"    ## writes the current record to the file data.out
print "a =", a, "b =", b, "max =", (a > b ? a : b) > "data.out"
eg: print | "wc -w"    ## pipes the current record to wc -w, which counts the words
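When awk writes to a pipe, the command at the other end only finishes once the pipe is closed. The sketch below (the wrapper name sorted_names is an assumption) pipes the first field of every line to sort and calls close() in END, so the sorted output appears before the final message.

```shell
# sorted_names: hypothetical helper; sends field 1 of each line to sort,
# then closes the pipe so sort finishes before "done" is printed.
sorted_names() {
    awk '{ print $1 | "sort" }
         END {
             close("sort")    # wait for sort to write its output
             print "done"
         }'
}

printf 'pear 3\napple 5\nfig 1\n' | sorted_names
# prints:
# apple
# fig
# pear
# done
```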
5. Common awk limits
Item Limit
Number of fields per record 100
Characters per input record 3000
Characters per output record 3000
Characters per field 1024
Characters per printf string 3000
Characters in literal string 400
Characters in character class 400
Files open 15
Pipes open 1
$ cat myscript.awk
#!/usr/bin/awk -f
{ print $0 }
$ ./myscript.awk test    ## equivalent to awk -f myscript.awk test
Note: the #! line must give the actual path to awk on your system; on some systems it is /bin/awk -f.
