(gawk.info) Simple Sed
A Simple Stream Editor
----------------------
The `sed' utility is a "stream editor," a program that reads a
stream of data, makes changes to it, and passes the modified data on.
It is often used to make global changes to a large file, or to a stream
of data generated by a pipeline of commands.
While `sed' is a complicated program in its own right, its most
common use is to perform global substitutions in the middle of a
pipeline:

     command1 < orig.data | sed 's/old/new/g' | command2 > result

Here, the `s/old/new/g' tells `sed' to look for the regexp `old' on
each input line, and replace it with the text `new', globally (i.e., all
the occurrences on a line). This is similar to `awk''s `gsub' function
(*note Built-in Functions for String Manipulation: String Functions.).
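
As a small illustration of that similarity, the following sketch runs the
same substitution once through `sed' and once through `awk''s `gsub'.
The sample text and the variable names are made up for this example.

```shell
# Hypothetical sample line; any text containing "old" would do.
input='old habits, old hats'

# Global substitution with sed.
via_sed=$(printf '%s\n' "$input" | sed 's/old/new/g')

# The same substitution with awk: gsub changes $0 in place, print emits it.
via_awk=$(printf '%s\n' "$input" | awk '{ gsub(/old/, "new"); print }')

printf '%s\n%s\n' "$via_sed" "$via_awk"
```

Both commands should print the same line, with every `old' replaced.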
The following program, `awksed.awk', accepts at least two command-line
arguments: the pattern to look for and the text to replace it
with. Any additional arguments are treated as data file names to
process. If none are provided, the standard input is used.

     # awksed.awk --- do s/foo/bar/g using just print
     #    Thanks to Michael Brennan for the idea

     # Arnold Robbins, arnold@gnu.org, Public Domain
     # August 1995

     function usage()
     {
         print "usage: awksed pat repl [files...]" > "/dev/stderr"
         exit 1
     }

     BEGIN {
         # validate arguments
         if (ARGC < 3)
             usage()

         RS = ARGV[1]
         ORS = ARGV[2]

         # don't use arguments as files
         ARGV[1] = ARGV[2] = ""
     }

     # look ma, no hands!
     {
         if (RT == "")
             printf "%s", $0
         else
             print
     }

The program relies on `gawk''s ability to have `RS' be a regexp and
on the setting of `RT' to the actual text that terminated the record
(*note How Input is Split into Records: Records.).
The idea is to have `RS' be the pattern to look for. `gawk' will
automatically set `$0' to the text between matches of the pattern.
This is text that we wish to keep, unmodified. Then, by setting `ORS'
to the replacement text, a simple `print' statement will output the
text we wish to keep, followed by the replacement text.
There is one wrinkle to this scheme: what to do if the last
record doesn't end with text that matches `RS'. Using a `print'
statement unconditionally would print the replacement text after that
record too, which is not correct.
However, if the file did not end in text that matches `RS', `RT'
will be set to the null string. In this case, we can print `$0' using
`printf', which does not append `ORS' (*note Using `printf' Statements
for Fancier Printing: Printf.).
The `BEGIN' rule handles the setup, checking for the right number of
arguments, and calling `usage' if there is a problem. Then it sets `RS'
and `ORS' from the command-line arguments, and sets `ARGV[1]' and
`ARGV[2]' to the null string, so that they will not be treated as file
names (*note Using `ARGC' and `ARGV': ARGC and ARGV.).
The `usage' function prints an error message and exits.
Finally, the single rule handles the printing scheme outlined above,
using `print' or `printf' as appropriate, depending upon the value of
`RT'.