* There is an important difference between ‘RS = ""’ and ‘RS = "\n\n+"’. In the first case, leading newlines in the input data file are ignored, and if a file ends without extra blank lines after the last record, the final newline is removed from the record. In the second case, this special processing is not done.
* According to the POSIX standard, awk is supposed to behave as if each record is split into fields at the time it is read. In particular, this means that if you change the value of FS after a record is read, the value of the fields (i.e., how they were split) should reflect the old value of FS, not the new one.
* Decrementing NF throws away the values of the fields after the new value of NF and recomputes $0. Note, however, that merely referencing an out-of-range field does not change the value of either $0 or NF. Referencing an out-of-range field only produces an empty string.
* When FS == " ", fields are separated by runs of whitespace. Leading and trailing whitespace is ignored. This is the default.
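The field-related notes above can be checked with a few one-liners. This is only a sketch: it assumes a POSIX-conforming awk (such as gawk or mawk) is on the PATH, and the function names (fs_demo, nf_demo, range_demo, ws_demo) are made up for illustration. Older awk implementations may behave differently, especially in the FS case.

```shell
# Assumption: "awk" is a POSIX-conforming implementation (e.g. gawk or mawk).

# Changing FS inside a rule should not re-split the record already read:
# the line was split on whitespace, so $1 is still "a:b".
fs_demo() {
    printf 'a:b c\n' | awk '{ FS = ":"; print $1 }'
}

# Decrementing NF drops the trailing fields and rebuilds $0.
nf_demo() {
    printf 'a b c d\n' | awk '{ NF = 2; print NF ": " $0 }'
}

# Merely referencing $5 yields an empty string; NF and $0 stay as they were.
range_demo() {
    printf 'a b c\n' | awk '{ x = $5; print "[" x "]", NF, $0 }'
}

# Default FS (" "): runs of whitespace separate fields, and leading and
# trailing whitespace is ignored.
ws_demo() {
    printf '  a   b  \n' | awk '{ print NF, $1, $2 }'
}

fs_demo
nf_demo
range_demo
ws_demo
```

If the awk in use follows POSIX here, fs_demo should print the field as split by the old FS rather than the new one.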
--------------------------------------------------------------
RS == "\n"
Records are separated by the newline character (‘\n’). In effect, every line in
the data file is a separate record, including blank lines. This is the default.
RS == any single character
Records are separated by each occurrence of the character. Multiple successive
occurrences delimit empty records.
RS == ""
Records are separated by runs of blank lines. When FS is a single character, the newline character always serves as a field separator, in addition to whatever value FS may have. Leading and trailing newlines in a file are ignored.
RS == regexp
Records are separated by occurrences of characters that match regexp. Leading and trailing matches of regexp delimit empty records. (This is a gawk extension; it is not specified by the POSIX standard.)