GNU roff intermediate output format
Language concepts
During the run of troff, the roff input is broken down into the
information on what has to be printed at what position on the
intended device. So the language of the intermediate output
format can be quite small. Its only elements are commands with
or without arguments. In this document, the term 'command'
always refers to the intermediate output language, never to the
roff language used for document formatting. There are commands
for positioning and text writing, for drawing, and for device
control.
Separation
Classical troff output had strange requirements on whitespace.
The groff output parser, however, is smart about whitespace by
making it maximally optional. The whitespace characters, i.e.,
the tab, space, and newline characters, always have a syntactical
meaning. They are never printable because spacing within the
output is always done by positioning commands.
Any sequence of space or tab characters is treated as a single
syntactical space. It separates commands and arguments, but is
only required when the command code and the arguments would
otherwise run together ambiguously. Most often, this happens
when variable-length command names, arguments, argument lists,
or command clusters meet. Commands and arguments with a known,
fixed length need not be separated by syntactical space.
A line break is a syntactical element, too. Every command
argument can be followed by whitespace, a comment, or a newline
character. Thus a syntactical line break is defined to consist
of optional syntactical space that is optionally followed by a
comment, and a newline character.
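The definition of a syntactical line break can be written down as a regular expression. The following is only a minimal sketch, assuming that a comment starts with a # character and runs to the end of the line; the names LINE_BREAK and is_syntactical_line_break are made up for this illustration and are not part of groff.

```python
import re

# Optional syntactical space (spaces/tabs), then an optional comment
# (assumed to start with '#' and run to the end of the line), then a newline.
LINE_BREAK = re.compile(r"[ \t]*(?:#[^\n]*)?\n")


def is_syntactical_line_break(text: str) -> bool:
    """Return True if text is exactly one syntactical line break."""
    return LINE_BREAK.fullmatch(text) is not None
```

Under these assumptions, a bare newline and a comment-only line both count as a line break, whereas a line that carries a command does not.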
The normal commands, those for positioning and text, consist of a
single letter taking a fixed number of arguments. For historical
reasons, the parser allows stacking of such commands on the same
line, but fortunately, in groff intermediate output, every
command with at least one argument is followed by a line break,
thus providing excellent readability.
The other commands, those for drawing and device control,
have a more complicated structure; some recognize long command
names, and some take a variable number of arguments. So all D
and x commands were designed to request a syntactical line break
after their last argument. Only one command, 'x X', has an
argument that can stretch over several lines; all other commands
must have all of their arguments on the same line as the command,
i.e., the arguments may not be split by a line break.
Empty lines, i.e., lines containing only space and/or a comment,
can occur everywhere. They are just ignored.
Argument units
Some commands take integer arguments that are assumed to
represent values in a measurement unit, but the letter for the
corresponding scaling indicator is not written with the output
command arguments; see groff(7) and Groff: The GNU Implementation
of troff, the groff Texinfo manual, for more on this topic. Most
commands assume the scaling indicator 'u', the basic unit of the
device; some use 'z', the scaled point unit of the device; while
others, such as the color commands, expect plain integers. Note
that these scaling indicators are relative to the chosen device.
They are defined by the parameters specified in the device's DESC
file; see groff_font(5).
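Because the units depend on the device, a reader of the output has to bring in the device parameters to obtain physical measures. The sketch below assumes that the resolution is given in basic units per inch and that a scaled point is one point divided by the device's size scale factor, as described in groff_font(5); the function names and the sample values are illustrative only.

```python
def units_to_inches(value_u: int, res: int) -> float:
    """Convert a value in basic units ('u') to inches.

    Assumes res is the device resolution in basic units per inch.
    """
    return value_u / res


def scaled_points_to_points(value_z: int, sizescale: int = 1) -> float:
    """Convert a value in scaled points ('z') to points.

    Assumes sizescale is the device's size scale factor (1 if unset).
    """
    return value_z / sizescale
```

For a hypothetical device with a resolution of 72000 and a size scale factor of 1000, an argument of 72000 in 'u' would correspond to one inch, and an argument of 10000 in 'z' to ten points.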
Note that single characters can have the eighth bit set, as can
the names of fonts and special characters (that is, glyphs). The
names of glyphs and fonts can be of arbitrary length. A glyph
that is to be printed will always be in the current font.
A string argument is always terminated by the next whitespace
character (space, tab, or newline); an embedded # character is
regarded as part of the argument, not as the beginning of a
comment command. An integer argument is terminated by the next
non-digit character, which is then regarded as the first
character of the next argument or command.
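The two termination rules can be illustrated with a pair of small reader functions. This is a sketch rather than the groff parser itself: the helper names are invented here, and accepting a leading minus sign on integers is an assumption that goes beyond the text above.

```python
DIGITS = "0123456789"


def skip_space(text: str, pos: int) -> int:
    """Skip syntactical space (spaces and tabs) starting at pos."""
    while pos < len(text) and text[pos] in " \t":
        pos += 1
    return pos


def read_int(text: str, pos: int) -> tuple[int, int]:
    """Read an integer argument; it ends at the first non-digit character."""
    pos = skip_space(text, pos)
    start = pos
    if pos < len(text) and text[pos] == "-":  # assumption: optional sign
        pos += 1
    digits_start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == digits_start:
        raise ValueError(f"integer expected at offset {start}")
    return int(text[start:pos]), pos


def read_string(text: str, pos: int) -> tuple[str, int]:
    """Read a string argument; it ends at the next space, tab, or newline."""
    pos = skip_space(text, pos)
    start = pos
    while pos < len(text) and text[pos] not in " \t\n":
        pos += 1
    return text[start:pos], pos  # an embedded '#' stays part of the string
```

Each function returns the value together with the offset of the terminating character, so in a stacked sequence such as H72V120 the integer after H ends at the V, which is then read as the next command.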
Document parts
A correct intermediate output document consists of two parts, the
prologue and the body.
The task of the prologue is to set the general device parameters
using three exactly specified commands. The groff prologue is
guaranteed to consist of the following three lines (in that
order):
x T device
x res n h v
x init
with the arguments set as outlined in subsection 'Device Control
Commands' below. However, the parser for the intermediate output
format is able to swallow additional whitespace and comments as
well.
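A reader for the prologue could look roughly like the following. It is a sketch under stated assumptions: a token that begins with # is taken to start a comment running to the end of the line, the three commands appear on separate lines, and the function name parse_prologue and the dictionary keys are invented for this example.

```python
def parse_prologue(text: str) -> dict:
    """Parse the three prologue commands, skipping blank and comment lines."""
    commands = []
    for line in text.split("\n"):
        tokens = []
        for token in line.split():
            if token.startswith("#"):  # assumption: comment to end of line
                break
            tokens.append(token)
        if tokens:
            commands.append(tokens)
        if len(commands) == 3:
            break
    if len(commands) < 3:
        raise ValueError("incomplete prologue")

    device_cmd, res_cmd, init_cmd = commands
    if len(device_cmd) != 3 or device_cmd[:2] != ["x", "T"]:
        raise ValueError("expected 'x T device'")
    if len(res_cmd) != 5 or res_cmd[:2] != ["x", "res"]:
        raise ValueError("expected 'x res n h v'")
    if init_cmd != ["x", "init"]:
        raise ValueError("expected 'x init'")

    n, h, v = (int(arg) for arg in res_cmd[2:])
    return {"device": device_cmd[2], "n": n, "h": h, "v": v}
```

The keys n, h, and v simply mirror the argument names of the x res line; the sketch checks only the order and shape of the three commands and leaves their meaning to the device control subsection.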
The body is the main section for processing the document data.
Syntactically, it is a sequence of any commands different from
the ones used in the prologue. Processing is terminated as soon
as the first x stop command is encountered; the last line of any
groff intermediate output always contains such a command.
Semantically, the body is page oriented. A new page is started
by a p command. Positioning, writing, and drawing commands are
always done within the current page, so they cannot occur before
the first p command. Absolute positioning (by the H and V
commands) is done relative to the current page; all other
positioning is done relative to the current location within this
page.
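The page-oriented structure of the body can be sketched as a function that groups command lines by page and stops at the first x stop. The sketch assumes one command per line, that the p command carries the page number as its argument, and that a line starting with # is a comment; split_pages is a name invented for this illustration.

```python
def split_pages(body: str) -> tuple[list, list]:
    """Group body command lines by page; stop at the first 'x stop'.

    Returns (lines seen before the first p command, list of
    (page number, command lines) pairs).
    """
    before_first_page = []
    pages = []
    for line in body.split("\n"):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # empty and comment-only lines are ignored
        if stripped.split()[:2] == ["x", "stop"]:
            break  # nothing after the first 'x stop' is processed
        if stripped[0] == "p":
            # assumption: the argument of p is the page number
            number = int(stripped[1:].split()[0])
            pages.append((number, []))
        elif pages:
            pages[-1][1].append(stripped)
        else:
            before_first_page.append(stripped)
    return before_first_page, pages
```

A stricter reader could reject positioning, writing, or drawing commands that land in the first list; this sketch only collects them so that the caller can decide.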