 

(gawk) Fields

 
 Examining Fields
 ================
 
    When `awk' reads an input record, the record is automatically
 separated or "parsed" by the interpreter into chunks called "fields".
 By default, fields are separated by whitespace, like words in a line.
 Whitespace in `awk' means any string of one or more spaces, tabs or
 newlines;(1) other characters, such as formfeed and vertical tab, that
 are considered whitespace by other languages are _not_ considered
 whitespace by `awk'.
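
    As a minimal sketch of the default splitting, assuming a POSIX
 shell with `printf' and some `awk' on the search path (the input line
 is made up for illustration): runs of spaces and tabs act as a single
 separator, and leading whitespace does not produce an empty field.

```shell
# Hypothetical input: leading blanks, then three words separated by
# several spaces and a tab.
count=$(printf '   alpha   beta\tgamma\n' | awk '{ print NF }')
echo "$count"    # prints 3
```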
 
    The purpose of fields is to make it more convenient for you to refer
 to these pieces of the record.  You don't have to use them--you can
 operate on the whole record if you wish--but fields are what make
 simple `awk' programs so powerful.
 
    To refer to a field in an `awk' program, you use a dollar sign, `$',
 followed by the number of the field you want.  Thus, `$1' refers to the
 first field, `$2' to the second, and so on.  For example, suppose the
 following is a line of input:
 
      This seems like a pretty nice example.
 
 Here the first field, or `$1', is `This'; the second field, or `$2', is
 `seems'; and so on.  Note that the last field, `$7', is `example.'.
 Because there is no space between the `e' and the `.', the period is
 considered part of the seventh field.
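
    The field references above can be checked with a short sketch,
 assuming a POSIX shell; the variable names `line', `first', `second'
 and `seventh' are chosen here for illustration only.

```shell
# The sample input line from the text.
line='This seems like a pretty nice example.'

# Each awk program prints one field of the single input record.
first=$(printf '%s\n' "$line" | awk '{ print $1 }')
second=$(printf '%s\n' "$line" | awk '{ print $2 }')
seventh=$(printf '%s\n' "$line" | awk '{ print $7 }')

echo "$first"      # prints This
echo "$second"     # prints seems
echo "$seventh"    # prints example.  (the period belongs to the field)
```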
 
    `NF' is a built-in variable whose value is the number of fields in
 the current record.  `awk' updates the value of `NF' automatically,
 each time a record is read.
 
    No matter how many fields there are, the last field in a record can
 be represented by `$NF'.  So, in the example above, `$NF' would be the
 same as `$7', which is `example.'.  Why this works is explained below
 (*note Non-constant Field Numbers: Non-Constant Fields.).  If you try
 to reference a field beyond the last one, such as `$8' when the record
 has only seven fields, you get the empty string.
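
    A small sketch of `NF', `$NF', and a reference past the last
 field, again using the sample line and assuming a POSIX shell.  The
 square brackets are added only to make the empty string visible.

```shell
line='This seems like a pretty nice example.'

# Number of fields in the record.
nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# The last field, whatever its number happens to be.
last=$(printf '%s\n' "$line" | awk '{ print $NF }')

# Field 8 does not exist, so it yields the empty string.
beyond=$(printf '%s\n' "$line" | awk '{ print "[" $8 "]" }')

echo "$nf"        # prints 7
echo "$last"      # prints example.
echo "$beyond"    # prints []
```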
 
    `$0', which looks like a reference to the "zeroth" field, is a
 special case: it represents the whole input record.  Use `$0' when you
 want to work with the record as a whole rather than with individual
 fields.
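
    As an illustration, assuming a POSIX shell and a made-up input
 line with irregular spacing, printing `$0' reproduces the record
 exactly as it was read, including the whitespace that field splitting
 would otherwise skip over.

```shell
# Hypothetical input with leading blanks and extra spaces between words.
record=$(printf '  This   seems like a pretty nice example.\n' | awk '{ print $0 }')

# The quotes keep the shell from collapsing the spaces when echoing.
echo "$record"
```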
 
    Here are some more examples:
 
      $ awk '$1 ~ /foo/ { print $0 }' BBS-list
      -| fooey        555-1234     2400/1200/300     B
      -| foot         555-6699     1200/300          B
      -| macfoo       555-6480     1200/300          A
      -| sabafoo      555-2127     1200/300          C
 
 This example prints each record in the file `BBS-list' whose first
 field contains the string `foo'.  The operator `~' is called a
 "matching operator" (*note How to Use Regular Expressions: Regexp
 Usage.); it tests whether a string (here, the field `$1') matches a
 given regular expression.
 
    By contrast, the following example looks for `foo' in _the entire
 record_ and prints the first field and the last field for each input
 record containing a match.
 
      $ awk '/foo/ { print $1, $NF }' BBS-list
      -| fooey B
      -| foot B
      -| macfoo A
      -| sabafoo C
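
    In `BBS-list' every matching record happens to have `foo' in its
 first field, so both programs select the same records.  The following
 sketch uses two made-up records (not taken from `BBS-list') in which
 `foo' appears only in the last field of the second record, to show
 where the two patterns differ.  It assumes a POSIX shell.

```shell
# Hypothetical two-record input; the second record has "foo" only in
# its last field.
data='fooey 555-1234 B
bar 555-0000 foo'

# Pattern tested against the first field only.
by_field=$(printf '%s\n' "$data" | awk '$1 ~ /foo/ { print $1, $NF }')

# Pattern tested against the entire record.
by_record=$(printf '%s\n' "$data" | awk '/foo/ { print $1, $NF }')

echo "$by_field"     # one line:  fooey B
echo "$by_record"    # two lines: fooey B, then bar foo
```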
 
    ---------- Footnotes ----------
 
    (1) In POSIX `awk', newlines are not considered whitespace for
 separating fields.
 