An awk program given on the command line is most easily written
within single-quotes (for example, 'program') in applications
using sh, because awk programs commonly contain characters that
are special to the shell, including double-quotes. When an awk
program itself contains single-quote characters, it is usually
easiest to write most of the program as single-quoted strings
that the shell concatenates with quoted single-quote characters.
For example:
awk '/'\''/ { print "quote:", $0 }'
prints all lines from the standard input containing a single-
quote character, prefixed with quote:.
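As a rough illustration of the quoting technique above, the following sh sketch feeds two invented sample lines to the example command. It also shows one possible alternative that passes the single-quote to awk with -v instead of concatenating shell words; the variable name q and the sample input are assumptions of this sketch.

```shell
# Sample input: only the second line contains a single-quote.
input=$(printf '%s\n' 'no quote here' "it's quoted")

# The program is three shell words joined together:
#   '/'   \'   '/ { print "quote:", $0 }'
# so awk receives: /'/ { print "quote:", $0 }
out1=$(printf '%s\n' "$input" | awk '/'\''/ { print "quote:", $0 }')

# Alternative: pass the quote character in an awk variable (q is a
# name chosen for this sketch) and match it as a dynamic regular
# expression.
out2=$(printf '%s\n' "$input" | awk -v q="'" '$0 ~ q { print "quote:", $0 }')

printf '%s\n' "$out1" "$out2"
```

Both commands should select only the second sample line. The -v form can be easier to read when a program needs several single-quotes, at the cost of an extra variable.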
The following are examples of simple awk programs:
1. Write to the standard output all input lines for which field
3 is greater than 5:
$3 > 5
2. Write every tenth line:
(NR % 10) == 0
3. Write any line with a substring matching the regular
expression:
/(G|D)(2[0-9][[:alpha:]]*)/
4. Print any line with a substring containing a 'G' or 'D',
followed by a sequence of digits and characters. This example
uses character classes digit and alpha to match
language-independent digit and alphabetic characters
respectively:
/(G|D)([[:digit:][:alpha:]]*)/
5. Write any line in which the second field matches the regular
expression and the fourth field does not:
$2 ~ /xyz/ && $4 !~ /xyz/
6. Write any line in which the second field contains a
<backslash>:
$2 ~ /\\/
7. Write any line in which the second field contains a
<backslash>. Note that <backslash>-escapes are interpreted
twice; once in lexical processing of the string and once in
processing the regular expression:
$2 ~ "\\\\"
8. Write the second-to-last and the last field in each line.
Separate the fields by a <colon>:
{OFS=":";print $(NF-1), $NF}
9. Write the line number and number of fields in each line. The
three strings representing the line number, the <colon>, and
the number of fields are concatenated and that string is
written to standard output:
{print NR ":" NF}
10. Write lines longer than 72 characters:
length($0) > 72
11. Write the first two fields in opposite order separated by
OFS:
{ print $2, $1 }
12. Same, with input fields separated by a <comma> or <space> and
<tab> characters, or both:
BEGIN { FS = ",[ \t]*|[ \t]+" }
{ print $2, $1 }
13. Add up the first column, then print the sum and the average:
{s += $1 }
END {print "sum is ", s, " average is", s/NR}
14. Write fields in reverse order, one per line (many lines out
for each line in):
{ for (i = NF; i > 0; --i) print $i }
15. Write all lines between occurrences of the strings start and
stop:
/start/, /stop/
16. Write all lines whose first field is different from the
previous one:
$1 != prev { print; prev = $1 }
17. Simulate echo:
BEGIN {
    for (i = 1; i < ARGC; ++i)
        printf("%s%s", ARGV[i], i==ARGC-1?"\n":" ")
}
18. Write the path prefixes contained in the PATH environment
variable, one per line:
BEGIN {
    n = split (ENVIRON["PATH"], path, ":")
    for (i = 1; i <= n; ++i)
        print path[i]
}
19. If there is a file named input containing page headers of the
form:
Page #
and a file named program that contains:
/Page/ { $2 = n++; }
{ print }
then the command line:
awk -f program n=5 input
prints the file input, filling in page numbers starting at 5.
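The following sh sketch exercises a few of the programs above on invented sample data. It checks that the ERE-token form of example 6 and the string form of example 7 select the same line, and it runs example 19 against temporary files. The scratch directory (created with mktemp, which is widely available but not required by POSIX), the file name fields, and all sample lines are assumptions of this sketch.

```shell
# Scratch directory for the sample files (used only by this sketch).
dir=$(mktemp -d) || exit 1

# Examples 6 and 7: only the second line has a backslash in field 2.
printf '%s\n' 'a b/c' 'a b\c' > "$dir/fields"
ere_out=$(awk '$2 ~ /\\/' "$dir/fields")      # ERE token: one level of escaping
str_out=$(awk '$2 ~ "\\\\"' "$dir/fields")    # string: escapes processed twice

# Example 19: the operand n=5 is an assignment, processed before the
# file operand "input" that follows it is read.
printf '%s\n' 'Page #' 'some text' 'Page #' > "$dir/input"
printf '%s\n' '/Page/ { $2 = n++; }' '{ print }' > "$dir/program"
page_out=$(cd "$dir" && awk -f program n=5 input)

printf '%s\n' "$ere_out" "$str_out" "$page_out"
rm -rf "$dir"
```

With this sample data, both backslash tests should print the line whose second field is b\c, and the page headers should come out numbered 5 and 6 with the non-header line passed through unchanged.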