groff_diff(7)

differences between GNU roff and AT&T troff

Implementation differences

groff has a number of features that cause incompatibilities with documents written using old versions of roff. Some GNU extensions to roff have become supported by other implementations.

When adjusting to both margins, AT&T troff at first adjusts spaces starting from the right; groff begins from the left. Both implementations adjust spaces from opposite ends on alternating output lines in this adjustment mode to prevent 'rivers' in the text.

groff does not always hyphenate words as AT&T troff does. The AT&T implementation uses a set of hard-coded rules specific to U.S. English, while groff uses language-specific hyphenation pattern files derived from TeX. Furthermore, in old versions of troff there was a limited amount of space to store hyphenation exceptions (arguments to the .hw request); groff has no such restriction.

Long names may be groff's most obvious innovation. AT&T troff interprets '.dsabcd' as defining a string 'ab' with contents 'cd'. Normally, groff interprets this as a call of a macro named 'dsabcd'. AT&T troff also interprets \*[ and \n[ as an interpolation of a string or number register, respectively, called '['. In groff, however, the '[' is normally interpreted as delimiting a long name. In compatibility mode, groff interprets names in the traditional way, which means that they are limited to one or two characters. See the -C option in groff(1) and, above, the .C and .cp registers, and .cp and .do requests, for more on compatibility mode.
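As a minimal sketch of what long names allow, the following input defines and interpolates a string and a register whose names exceed two characters. The names greeting and page_count are hypothetical, chosen only for illustration.

```roff
.\" Long names are a groff extension; format this without the -C option.
.ds greeting Hello
.nr page_count 12
\*[greeting]: \n[page_count] pages.
```

Because compatibility mode limits names to one or two characters, input like this is not portable to AT&T troff.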

The register \n[.cp] is specialized and may require a statement of rationale. When writing macro packages or documents that use groff features and which may be mixed with other packages or documents that do not (common scenarios include serial processing of man pages or use of the .so or .mso requests), you may desire correct operation regardless of compatibility mode in the surrounding context. It may occur to you to save the existing value of \n(.C into a register, say, _C, at the beginning of your file, turn compatibility mode off with '.cp 0', then restore it from that register at the end with '.cp \n(_C'. At the same time, a modular design of a document or macro package may lead you to multiple layers of inclusion. You cannot use the same register name everywhere or you risk 'clobbering' the value from a preceding or enclosing context.

The two-character register name space of AT&T troff is confining and mnemonically challenging; you may wish to use groff's more capacious name space. However, attempting '.nr _my_saved_C \n(.C' will not work in compatibility mode; the register name is too long. 'This is exactly what .do is for,' you think, '.do nr _my_saved_C \n(.C'. The foregoing will always save zero to your register, because .do turns compatibility mode off while it interprets its argument list. What you need is:

    .do nr _my_saved_C \n[.cp]
    .cp 0

at the beginning of your file, followed by

    .cp \n[_my_saved_C]
    .do rr _my_saved_C

at the end. As in the C language, we all have to share one big name space, so choose a register name that is unlikely to collide with other uses.

The existence of the .T string is a common feature of post-CSTR #54 troffs (DWB 3.3, Solaris, Heirloom Doctools, and Plan 9 troff all support it), but valid values are specific to each implementation. The .T register interpolates 1 if groff is called with -T, and 0 otherwise. This behavior differs from AT&T troff, which interpolated 1 only if nroff was the formatter and was called with -T.
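A document can branch on the .T string, as in the sketch below. The device name utf8 is a groff value; other implementations use their own names, so treat the comparison as an assumption about the formatter in use.

```roff
.\" \*(.T interpolates the output device name; "utf8" is a groff device.
.if '\*(.T'utf8' .tm Formatting for the utf8 device.
```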

AT&T troff and other implementations handle .lf differently. For them, its line argument changes the line number of the current line; in groff, it sets the number of the next input line.
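The difference can be observed with a sketch like the following, which reports the input line number right after an .lf request. The file name generated.ms is hypothetical.

```roff
.\" A preprocessor might emit a request like this to keep diagnostics accurate.
.lf 100 generated.ms
.tm This request is on input line \n(.c.
```

Under groff the message should report line 100; an implementation that applies the number to the .lf line itself would report 101.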

AT&T troff had only environments named '0', '1', and '2'. In GNU troff, any number of environments may exist, using any valid identifiers for their names.
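A minimal sketch of a named environment follows. The environment name footnote and the chosen sizes are illustrative assumptions, not part of any macro package.

```roff
.\" Switch to an environment named "footnote", creating it on first use,
.\" and set its type size and vertical spacing.
.ev footnote
.ps 8
.vs 10
.\" Return to the previously active environment.
.ev
```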

Normally, groff preserves the interpolation depth in delimited arguments, but not in compatibility mode. For example, on terminal devices,

    .ds xx '
    \w'abc\*(xxdef'

produces '168' ordinarily, but '72def'' in compatibility mode.

Furthermore, the escapes \f, \H, \m, \M, \R, \s, and \S are transparent for the purpose of recognizing a control character at the beginning of a line only in compatibility mode. For example, this code produces bold output in both cases, but the text differs:

    .de xx
    Hello!
    ..
    \fB.xx\fP

It produces '.xx' in normal mode and 'Hello!' in compatibility mode.

groff does not allow the use of the escape sequences \|, \^, \&, \{, \}, '\ ', \', \`, \-, \_, \!, \%, or \c in names of strings, macros, diversions, number registers, fonts, or environments; AT&T troff does. The \A escape sequence (see subsection 'Escape sequences' above) may be helpful in avoiding use of these escape sequences in names.

Normally, the syntax form \sn accepts only a single character (a digit) for n, consistently with other forms that originated in AT&T troff, like \*, \$, \f, \g, \k, \n, and \z. In compatibility mode only, a non-zero n must be in the range 4–39. Legacy documents relying upon this quirk of parsing should be migrated to another \s form. [Background: The Graphic Systems C/A/T phototypesetter (the original device target for AT&T troff) supported only a few discrete point sizes in the range 6–36, so Ossanna contrived a special case in the parser to do what the user must have meant. Kernighan warned of this in the 1992 revision of CSTR #54 (§2.3), and more recently, McIlroy referred to it as a 'living fossil'.]
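A migration along the lines suggested above might look like this sketch, which requests a 36-point size three ways. Only the first form depends on the compatibility-mode special case.

```roff
.\" Legacy form: two digits are read as one size only in compatibility mode.
\s36Large\s0
.\" Two-digit form introduced by an opening parenthesis.
\s(36Large\s0
.\" Bracketed form, a groff extension.
\s[36]Large\s0
```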

Fractional point sizes cause one noteworthy incompatibility. In AT&T troff the .ps request ignores scaling indicators and thus '.ps 10u' sets the point size to 10 points, whereas in groff it sets the point size to 10 scaled points. See subsection 'Fractional point sizes and new scaling indicators' above.
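The following sketch contrasts a portable request with the problematic one. The last request uses groff's z scaling indicator and is not expected to work in AT&T troff.

```roff
.\" Portable: 10 points in both implementations.
.ps 10
.\" Not portable: 10 points in AT&T troff, 10 scaled points in groff.
.ps 10u
.\" groff only: a fractional size stated explicitly in points.
.ps 10.5z
```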

The .bp request differs from AT&T troff: GNU troff does not accept a scaling indicator on the argument, a page number; AT&T troff (somewhat uselessly) does.

In AT&T troff the .pm request reports macro, string, and diversion sizes in units of 128-byte blocks, and an argument reduces the report to a sum of the above in the same units. groff ignores any arguments and reports the sizes in bytes.

Unlike AT&T troff, groff does not ignore the .ss request if the output is a terminal device; instead, the values of minimal inter-word and additional inter-sentence spacing are rounded down to the nearest multiple of 12.

In groff, there is a fundamental difference between unformatted input characters and formatted output characters (glyphs). Everything that affects how a glyph is output is stored with the glyph; once a glyph has been constructed, it is unaffected by any subsequent requests that are executed, including the .bd, .cs, .tkf, .tr, or .fp requests. Normally, glyphs are constructed from input characters immediately before the glyph is added to the current output line. Macros, diversions, and strings are all, in fact, the same type of object; they contain lists of input characters and glyphs in any combination. Special characters can be both: before being added to the output, they act as input entities; afterwards, they denote glyphs. A glyph does not behave like an input character for the purposes of macro processing; it does not inherit any of the special properties that the input character from which it was constructed might have had. Consider the following example.

    .di x
    \\\\
    .br
    .di
    .x

It prints '\\' in groff; each pair of input backslashes is turned into one output backslash and the resulting output backslashes are not interpreted as escape characters when they are reread. AT&T troff would interpret them as escape characters when they were reread and would end up printing one '\'.

One correct way to obtain a printable backslash in most documents is to use the \e escape sequence; this always prints a single instance of the current escape character, regardless of whether or not it is used in a diversion; it also works in both groff and AT&T troff. (Naturally, if you've changed the escape character, you need to prefix the 'e' with whatever it is—and you'll likely get something other than a backslash in the output.)

The other correct way, appropriate in contexts independent of the backslash's common use as a roff escape character—perhaps in discussion of character sets or other programming languages—is the character escape \(rs or \[rs], for 'reverse solidus', from its name in the ECMA-6 (ISO/IEC 646) standard. [This character escape is not portable to AT&T troff, but is to its lineal descendant, Heirloom Doctools troff, as of its 060716 release (July 2006).]
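Both approaches are shown in the sketch below. The path and wording are illustrative only, and the default escape character is assumed.

```roff
.\" \e prints the current escape character, a backslash by default.
The path is C:\eTEMP\eout.txt.
.\" \[rs] denotes the reverse solidus glyph; not portable to AT&T troff.
The character \[rs] is called the reverse solidus.
```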

To store an escape sequence in a diversion that is interpreted when the diversion is reread, either use the traditional \! transparent output facility, or, if this is unsuitable, the new \? escape sequence. See subsection 'Escape sequences' above and sections 'Diversions' and 'Gtroff Internals' in Groff: The GNU Implementation of troff, the groff Texinfo manual.

In the somewhat pathological case where a diversion exists containing a partially collected line and a partially collected line at the top-level diversion has never existed, AT&T troff will output the partially collected line at the end of input; groff will not.

Intermediate output format

Its extensions notwithstanding, the groff intermediate output format has some incompatibilities with that of AT&T troff, but full compatibility is sought; problem reports and patches are welcome. The following incompatibilities are known.

• The positioning after drawing polygons conflicts with the AT&T troff practice.

• The intermediate output cannot be rescaled to other devices as AT&T troff's could.