concepts and history of roff typesetting
History
Computer-driven document formatting dates back to the 1960s. The
roff system itself is intimately connected with the Unix
operating system, but its roots go back to the earlier operating
systems CTSS and Multics.
The predecessor: RUNOFF
roff's ancestor RUNOFF was written in the MAD language by Jerry
Saltzer to prepare his Ph.D. thesis using the Compatible Time
Sharing System (CTSS), a project of the Massachusetts Institute
of Technology (MIT). The program is generally referred to in
full capitals, both to distinguish it from its many descendants,
and because bits were expensive in those days; five- and six-bit
character encodings were still in widespread use, and mixed-case
alphabetics were seen as a luxury. RUNOFF introduced a syntax of
inlining formatting directives amid document text, by beginning a
line with a period (an unlikely occurrence in human-readable
material) followed by a 'control word'. Control words with
obvious meaning like '.line length n' were supported as well as
an abbreviation system; the latter came to overwhelm the former
in popular usage and later derivatives of the program. A sample
of control words from a RUNOFF manual of December 1966
⟨http://web.mit.edu/Saltzer/www/publications/ctss/AH.9.01.html⟩
follows, with only a slight update to parameter syntax. They
will be familiar to roff veterans.
Abbreviation   Control word
.ad            .adjust
.bp            .begin page
.br            .break
.ce            .center
.in            .indent n
.ll            .line length n
.nf            .nofill
.pl            .paper length n
.sp            .space [n]
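
As a rough illustration, the following input uses several of these
requests in their abbreviated form, as a modern roff such as nroff
would accept them. The numeric arguments and the text are invented
for this sketch. Without a scaling unit, each argument takes the
request's default unit, which on a terminal device amounts to
character cells horizontally and lines vertically.

```roff
.\" Illustrative only: sizes and text are made up.
.pl 66
.ll 40
.ce
A CENTERED TITLE
.sp 2
.ad
.in 5
This text is filled and adjusted to the line length set above.
.br
.nf
These lines are
copied as typed.
```

Each request occupies its own line and begins with the control
character, a period, just as RUNOFF control words did.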
In 1965, MIT's Project MAC teamed with Bell Telephone
Laboratories and General Electric (GE) to inaugurate the Multics
⟨http://www.multicians.org⟩ project. After a few years, Bell
Labs discontinued its participation in Multics, famously
prompting the development of Unix. Meanwhile, Saltzer's RUNOFF
proved influential, seeing many ports and derivations elsewhere.
In 1969, Doug McIlroy wrote one such reimplementation of RUNOFF
in the BCPL language for a GE 645 running GECOS at the Bell Labs
location in Murray Hill, New Jersey. In its manual, the control
commands were termed 'requests', their two-letter names were
canonical, and the control character was configurable with a .cc
request. Other familiar requests emerged at this time: no-adjust
(.na), need (.ne), page offset (.po), tab configuration (.ta,
though it worked differently), temporary indent (.ti), character
translation (.tr), and automatic underlining (.ul; in RUNOFF you
had to backspace and underscore in the input yourself). The .fi
request to enable filling of output lines got the name it retains
to this day.
Unix and roff
By 1971, McIlroy's runoff had been rewritten in DEC PDP-11
assembly language by Dennis Ritchie for the fledgling Unix
operating system and seen its name shortened to roff (perhaps
under the influence of Ken Thompson), and had gained support for
automatic hyphenation with .hc and .hy requests; a generalization
of line spacing control with the .ls request; and what later
roffs would call diversions, with 'footnote' requests. This roff
indirectly funded operating systems research at Murray Hill, for
it was used to prepare patent applications for AT&T to the U.S.
government. This arrangement enabled the group to acquire the
aforementioned PDP-11; roff promptly proved equal to the task of
typesetting the first edition of the manual for what would later
become known as 'v1 Unix', dated November 1971.
Output from all of the foregoing programs was limited to line
printers and paper terminals such as the IBM 2741 (based on the
Selectric line of typewriters) and the Teletype Corporation Model
37. Proportionally-spaced type was unknown.
New roff and Typesetter roff
The first years of Unix were spent in rapid evolution. The
practicalities of preparing standardized documents like patent
applications (and Unix manual pages), combined with McIlroy's
enthusiasm for macro languages, perhaps created an irresistible
pressure to make roff extensible. Joe Ossanna's nroff, literally
a 'new roff', was the outlet for this pressure. By the time of
Version 3 Unix (February 1973), and still in PDP-11 assembly
language, it sported a swath of features now considered essential
to roff systems: definition of macros (.de), diversion of text
thence (.di), and removal thereof (.rm); trap planting (.wh;
'when') and relocation (.ch; 'change'); conditional processing
(.if); and environments (.ev). Incremental improvements included
assignment of the next page number (.pn); no-space mode (.ns) and
restoration of vertical spacing (.rs); the saving (.sv) and
output (.os) of vertical space; specification of replacement
characters for tabs (.tc) and leaders (.lc); configuration of the
no-break control character (.c2); shorthand to disable automatic
hyphenation (.nh); a condensation of what were formerly six
different requests for configuration of page 'titles' (headers
and footers) into one (.tl) with a length controlled separately
from the line length (.lt); automatic line numbering (.nm);
interactive input (.rd), which necessitated buffer-flushing
(.fl), and was made convenient with early program cessation
(.ex); source file inclusion in its modern form (.so; though
RUNOFF had an '.append' control word for a similar purpose) and
early advance to the next file argument (.nx); ignorable content
(.ig); and programmable abort (.ab).
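
To make a few of these requests concrete, here is a minimal sketch
in present-day roff syntax, which may differ in detail from what
Version 3 nroff accepted. The macro names HD and FO, the title
text, and the distances are hypothetical choices made for
illustration.

```roff
.\" Hypothetical macro names; any unused two-letter names would do.
.de HD
'sp 3
.tl 'Draft''%'
'sp 2
..
.de FO
'bp
..
.\" Run HD at the top of each page and FO one inch above the bottom.
.wh 0 HD
.wh -1i FO
.\" Emit this line only when formatting with nroff.
.if n This copy was formatted for a terminal.
```

The macros invoke their spacing and page-break requests with the
no-break control character (an apostrophe by default) so that
springing a trap does not force out a partially filled line.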
Third Edition Unix also brought the pipe(2) system call, the
explosive growth of a componentized system based around it, and a
'filter model' that remains perceptible today. Around this time,
Michael Lesk developed the tbl preprocessor for formatting
tables. Equally importantly, the Bell Labs site in Murray Hill
acquired a Graphic Systems C/A/T phototypesetter, and with it
came the necessity of expanding the capabilities of a roff system
to cope with proportionally-spaced type, multiple point sizes,
and a variety of fonts. Ossanna wrote a parallel implementation
of nroff for the C/A/T, dubbing it troff (for 'typesetter roff').
Unfortunately, surviving documentation does not illustrate what
requests were implemented at this time for C/A/T support; the
troff(1) man page in Fourth Edition Unix (November 1973) does not
feature a request list, unlike nroff(1). Apart from
typesetter-driven features, Version 4 Unix roffs added string
definitions (.ds); made the escape character configurable (.ec);
and enabled the user to write diagnostics to the standard error
stream (.tm).
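
A brief sketch of string definition and diagnostic output, again in
present-day syntax, might look like the following. The string name
Ti and its contents are assumptions made for the example.

```roff
.\" Hypothetical string name and contents.
.ds Ti Annual Report
The title of this document is \*(Ti.
.\" Send a progress message to the standard error stream.
.tm formatting \*(Ti
```

The escape sequence \*( interpolates a string with a two-character
name, both in running text and in the arguments of a request.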
Around 1974, empowered with multiple type sizes, italics, and a
symbol font specially commissioned by Bell Labs from Graphic
Systems, Brian Kernighan and Lorinda Cherry implemented eqn for
typesetting mathematics. In the same year, for Fifth Edition
Unix, Ossanna combined and reimplemented the two roffs in C,
using preprocessor conditions of that language to generate both
from a single source tree.
Ossanna documented the syntax of the input language to the nroff
and troff programs in the 'Troff User's Manual', first published
in 1976, with further revisions as late as 1992 by Kernighan.
(The original version was entitled 'Nroff/Troff User's Manual',
which may partially explain why roff practitioners have tended to
refer to it by its AT&T document identifier, 'CSTR #54'.) Its
final revision serves as the de facto specification of AT&T
troff, and all subsequent implementors of roff systems have
worked in its shadow.
A small and simple set of roff macros was first used for the
manual pages of Version 4 Unix and persisted for two further
releases, but the first macro package to be formally described
and installed was ms by Lesk in Version 6. He also wrote a
manual, 'Typing Documents on the Unix System', describing ms and
basic nroff/troff usage, updating it as the package accrued
features.
For Version 7 Unix (January 1979), McIlroy designed, implemented,
and documented the man macro package, introducing most of the
macros described in groff_man(7) today, and edited volume 1 of
the Version 7 manual using it. Documents composed using ms
featured in volume 2, edited by Kernighan.
Ossanna had passed away unexpectedly in 1977, and after the
release of Version 7, with the C/A/T typesetter becoming
supplanted by alternative devices, Kernighan undertook a revision
and rewrite of troff to generalize its design. To implement this
revised architecture, he developed the font and device
description file formats and the device-independent output format
that remain in use today. He described these novelties in the
article 'A Typesetter-independent TROFF', last revised in 1982,
and like the troff manual itself, it is widely known by a
shorthand, 'CSTR #97'.
Kernighan's innovations prepared troff well for the introduction
of the Adobe PostScript language in 1982 and a vibrant market in
laser printers with built-in interpreters for it. An output
driver for PostScript, dpost, was swiftly developed. However,
due to AT&T software licensing practices, Ossanna's troff, with
its tight coupling to the capabilities of the C/A/T, remained in
parallel distribution with device-independent troff throughout
the 1980s, leading some developers to contrive translators for
C/A/T-formatted documents to other devices. An example was
vtroff for Versatec and Benson-Varian plotters. Today, however,
all actively maintained troffs follow Kernighan's device-
independent design.
groff: a free roff from GNU
The most important free roff project historically has been groff,
the GNU implementation of troff, developed from scratch by James
Clark starting in 1989 and distributed under copyleft
⟨http://www.gnu.org/copyleft⟩ licenses, ensuring to all the
availability of source code and the freedom to modify and
redistribute it, properties unprecedented in roff systems to that
point. groff rapidly attracted contributors, and has served as a
complete replacement for almost all applications of AT&T troff
(exceptions include mv, a macro package for preparation of
viewgraphs and slides, and the ideal preprocessor for producing
diagrams from a constraint-based language). Beyond that, it has
added numerous features; see groff_diff(7). Since its inception
and for at least the following three decades, it has been used by
practically all GNU/Linux and BSD operating systems.
groff continues to be developed, is available for almost all
operating systems in common use (along with several obscure
ones), and is free. These factors make groff the de facto
roff standard today.
Heirloom Doctools troff
An alternative is Gunnar Ritter's Heirloom roff project
⟨https://github.com/n-t-roff/heirloom-doctools⟩, started
in 2005, which provides enhanced versions of the various roff
tools found in the OpenSolaris and Plan 9 operating systems, now
available under free licenses. You can get this package with the
shell command:
$ git clone https://github.com/n-t-roff/heirloom-doctools
Moreover, one finds there the Original Documenter's Workbench
Release 3.3 ⟨https://github.com/n-t-roff/DWB3.3⟩.