(gawk.info) Simple Sed

Info Catalog (gawk.info) Extract Program (gawk.info) Miscellaneous Programs (gawk.info) Igawk Program
 
 A Simple Stream Editor
 ----------------------
 
    The `sed' utility is a "stream editor," a program that reads a
 stream of data, makes changes to it, and passes the modified data on.
 It is often used to make global changes to a large file, or to a stream
 of data generated by a pipeline of commands.
 
    While `sed' is a complicated program in its own right, its most
 common use is to perform global substitutions in the middle of a
 pipeline:
 
      command1 < orig.data | sed 's/old/new/g' | command2 > result
 
    Here, the `s/old/new/g' tells `sed' to look for the regexp `old' on
 each input line, and replace it with the text `new', globally (i.e. all
 the occurrences on a line).  This is similar to `awk''s `gsub' function
 (see Built-in Functions for String Manipulation: String Functions).
 
    The following program, `awksed.awk', accepts at least two command
 line arguments: the pattern to look for and the text to replace it
 with. Any additional arguments are treated as data file names to
 process. If none are provided, the standard input is used.
 
      # awksed.awk --- do s/foo/bar/g using just print
      #    Thanks to Michael Brennan for the idea
      
      # Arnold Robbins, arnold@gnu.org, Public Domain
      # August 1995
      
      function usage()
      {
          print "usage: awksed pat repl [files...]" > "/dev/stderr"
          exit 1
      }
      
      BEGIN {
          # validate arguments
          if (ARGC < 3)
              usage()
      
          RS = ARGV[1]
          ORS = ARGV[2]
      
          # don't use arguments as files
          ARGV[1] = ARGV[2] = ""
      }
      
      # look ma, no hands!
      {
          if (RT == "")
              printf "%s", $0
          else
              print
      }
 
    The program relies on `gawk''s ability to have `RS' be a regexp and
 on the setting of `RT' to the actual text that terminated the record
 (see How Input is Split into Records: Records).
 
    The idea is to have `RS' be the pattern to look for. `gawk' will
 automatically set `$0' to the text between matches of the pattern.
 This is text that we wish to keep, unmodified.  Then, by setting `ORS'
 to the replacement text, a simple `print' statement will output the
 text we wish to keep, followed by the replacement text.
 
    There is one wrinkle to this scheme: what should happen if the last
 record doesn't end with text that matches `RS'?  Using a `print'
 statement unconditionally would append the replacement text after that
 record as well, which is not correct.
 
    However, if the file did not end in text that matches `RS', `RT'
 will be set to the null string for that last record.  In this case, we
 can print `$0' using `printf' with a `%s' format, which does not append
 `ORS' (see Using `printf' Statements for Fancier Printing: Printf).
 
    The `BEGIN' rule handles the setup, checking for the right number of
 arguments, and calling `usage' if there is a problem. Then it sets `RS'
 and `ORS' from the command line arguments, and sets `ARGV[1]' and
 `ARGV[2]' to the null string, so that they will not be treated as file
 names (see Using `ARGC' and `ARGV': ARGC and ARGV).
 
    The `usage' function prints an error message and exits.
 
    Finally, the single rule handles the printing scheme outlined above,
 using `print' or `printf' as appropriate, depending upon the value of
 `RT'.
 