An AWK program consists of a sequence of optional directives,
pattern-action statements, and optional function definitions.
@include "filename"
@load "filename"
@namespace "name"
pattern { action statements }
function name(parameter list) { statements }
Gawk first reads the program source from the program-file(s) if
specified, from arguments to --source, or from the first non-option
argument on the command line. The -f and --source options may be
used multiple times on the command line. Gawk reads the
program text as if all the program-files and command line source
texts had been concatenated together. This is useful for
building libraries of AWK functions, without having to include
them in each new AWK program that uses them. It also provides
the ability to mix library functions with command line programs.
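As a minimal sketch of the concatenation behavior described above, the following shell session keeps a helper function in one file and the main rule in another. The file names lib.awk and main.awk and the function twice() are hypothetical, chosen only for illustration. Multiple -f options are also accepted by POSIX awk, so plain awk is used here; gawk should behave the same way.

```shell
# Work in a scratch directory so the example leaves nothing behind.
tmp=$(mktemp -d)

# A tiny "library" file holding one helper function (hypothetical name).
cat > "$tmp/lib.awk" <<'EOF'
function twice(x) { return 2 * x }
EOF

# The main program calls the helper without defining it itself.
cat > "$tmp/main.awk" <<'EOF'
{ print twice($1) }
EOF

# Both files are read as if they had been concatenated into one program.
out=$(printf '3\n5\n' | awk -f "$tmp/lib.awk" -f "$tmp/main.awk")
echo "$out"

rm -rf "$tmp"
```

The order of the -f options does not matter for function definitions, since the whole program text is read before any rule is executed.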
In addition, lines beginning with @include may be used to include
other source files into your program, making library use even
easier. This is equivalent to using the --include option.
Lines beginning with @load may be used to load extension
functions into your program. This is equivalent to using the
--load option.
The environment variable AWKPATH specifies a search path to use
when finding source files named with the -f and --include
options. If this variable does not exist, the default path is
".:/usr/local/share/awk". (The actual directory may vary,
depending upon how gawk was built and installed.) If a file name
given to the -f option contains a '/' character, no path search
is performed.
The environment variable AWKLIBPATH specifies a search path to
use when finding extension libraries named with the --load
option. If this variable does not exist, the default path is
"/usr/local/lib/gawk". (The actual directory may vary, depending
upon how gawk was built and installed.)
Gawk executes AWK programs in the following order. First, all
variable assignments specified via the -v option are performed.
Next, gawk compiles the program into an internal form. Then,
gawk executes the code in the BEGIN rule(s) (if any), and then
proceeds to read each file named in the ARGV array (up to
ARGV[ARGC-1]). If there are no files named on the command line,
gawk reads the standard input.
If a filename on the command line has the form var=val, it is
treated as a variable assignment. The variable var will be
assigned the value val. (This happens after any BEGIN rule(s)
have been run.) Command line variable assignment is most useful
for dynamically assigning values to the variables AWK uses to
control how input is broken into fields and records. It is also
useful for controlling state if multiple passes are needed over a
single data file.
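The multiple-pass use mentioned above can be sketched as follows. The variable name pass and the file data.txt are hypothetical. The first pass sums the values, and the second pass prints each value with its integer share of the total. The BEGIN rule shows that the assignment has not yet happened when BEGIN runs. Nothing here is gawk-specific, so plain awk is used.

```shell
tmp=$(mktemp -d)
printf '10\n20\n30\n' > "$tmp/data.txt"

# The same file is named twice; each var=val argument takes effect
# just before the file that follows it is read.
out=$(awk '
    BEGIN     { print "BEGIN: pass=[" pass "]" }   # pass is still unset here
    pass == 1 { total += $1 }                      # first pass: accumulate
    pass == 2 { print $1, int(100 * $1 / total) "%" }  # second pass: report
' pass=1 "$tmp/data.txt" pass=2 "$tmp/data.txt")
echo "$out"

rm -rf "$tmp"
```

The same pattern works for field-splitting variables, for example placing FS=: before one file and FS=, before the next.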
If the value of a particular element of ARGV is empty (""), gawk
skips over it.
For each input file, if a BEGINFILE rule exists, gawk executes
the associated code before processing the contents of the file.
Similarly, gawk executes the code associated with ENDFILE after
processing the file.
For each record in the input, gawk tests to see if it matches any
pattern in the AWK program. For each pattern that the record
matches, gawk executes the associated action. The patterns are
tested in the order they occur in the program.
Finally, after all the input is exhausted, gawk executes the code
in the END rule(s) (if any).
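The overall order (-v assignments, then BEGIN, then each record tested against the patterns in program order, then END) can be observed with a short trace. This is an illustrative sketch; the variable tag and the labels rule1 and rule2 are made up for the example. It uses only POSIX features, so plain awk is used.

```shell
# Two input records are fed on standard input, since no files are named.
out=$(printf 'apple\nbanana\n' | awk -v tag=demo '
    BEGIN { print "BEGIN tag=" tag }        # -v value is already visible
    /an/  { print "rule1 " $0 }             # tested first for every record
    /a/   { print "rule2 " $0 }             # tested second for every record
    END   { print "END after " NR " records" }
')
echo "$out"
```

The record "apple" matches only the second pattern, while "banana" matches both, and its two actions run in the order the rules appear in the program.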
Command Line Directories
According to POSIX, files named on the awk command line must be
text files. The behavior is "undefined" if they are not. Most
versions of awk treat a directory on the command line as a fatal
error.
Starting with version 4.0 of gawk, a directory on the command
line produces a warning, but is otherwise skipped. If either of
the --posix or --traditional options is given, then gawk reverts
to treating directories on the command line as a fatal error.