stream editor
Rationale
This volume of POSIX.1‐2017 requires implementations to support
at least ten distinct wfiles, matching historical practice on
many implementations. Implementations are encouraged to support
more, but conforming applications should not exceed this limit.
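
As an illustration only (not part of the standard text), the following sketch routes lines to two distinct wfiles in a single script; a portable script would keep the total number of wfiles at ten or fewer. The file names and the use of mktemp, which POSIX does not require, are assumptions made for the example.

```shell
# Route lines to two distinct wfiles in one script.
# Assumption: mktemp(1) is available; it is not required by POSIX.
wdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' |
  sed -n -e "/^a/w $wdir/a.txt" -e "/^b/w $wdir/b.txt"
cat "$wdir/a.txt"    # the line starting with "a"
cat "$wdir/b.txt"    # the line starting with "b"
```
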
The exit status codes specified here are different from those in
System V. System V returns 2 for garbled sed commands, but
returns zero with its usage message or if the input file could
not be opened. The standard developers considered this to be a
bug.
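
A minimal sketch of how an application might observe these exit statuses follows. It assumes that k is not a valid sed command on the implementation under test and that the path /nonexistent/input does not exist; only the fact that both statuses are non-zero is expected, not any particular value.

```shell
# Record the exit status for a garbled script and for an unreadable input file.
# Assumptions: "k" is not a sed command here; /nonexistent/input does not exist.
if sed 'k' </dev/null >/dev/null 2>&1; then
  garbled_status=0
else
  garbled_status=$?
fi
if sed -n 'p' /nonexistent/input >/dev/null 2>&1; then
  missing_status=0
else
  missing_status=$?
fi
echo "garbled script: $garbled_status, missing input: $missing_status"
```
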
The manner in which the l command writes non-printable characters
was changed to avoid the historical backspace-overstrike method,
and other requirements to achieve unambiguous output were added.
See the RATIONALE for ed(1p) for details of the format chosen,
which is the same as that chosen for sed.
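
For illustration, the sketch below shows the unambiguous form on a line containing a <tab>: the <tab> is written as an escape sequence and the end of the line is marked with '$'. The sample input and the variable name are chosen for the example.

```shell
# Write a line containing a <tab> in the unambiguous "l" format.
l_out=$(printf 'one\ttwo\n' | sed -n 'l')
printf '%s\n' "$l_out"    # prints: one\ttwo$
```
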
This volume of POSIX.1‐2017 requires implementations to provide
pattern and hold spaces of at least 8192 bytes, larger than the
4000-byte spaces used by some historical implementations, but
less than the 20480-byte limit used in an early proposal.
Implementations are encouraged to allocate dynamically larger
pattern and hold spaces as needed.
The requirements for acceptance of <blank> and <space> characters
in command lines have been made more explicit than in early
proposals to describe clearly the historical practice and to
remove confusion about the phrase ``protect initial blanks [sic]
and tabs from the stripping that is done on every script line''
that appears in the description of text in much of the historical
documentation of the sed utility. (Not all implementations are known
to have stripped <blank> characters from text lines, although
they all have allowed leading <blank> characters preceding the
address on a command line.)
The treatment of '#' comments differs from the SVID, which only
allows a comment as the first line of the script, but matches
BSD-derived implementations. The comment character is treated as
a command, and it has the same properties in terms of being
accepted with leading <blank> characters; the BSD implementation
has historically supported this.
Early proposals required that a script_file have at least one
non-comment line. Some historical implementations have behaved in
unexpected ways if this were not the case. The standard
developers considered that this was incorrect behavior and that
application developers should not have to avoid this feature. A
correct implementation of this volume of POSIX.1‐2017 shall
permit script_files that consist only of comment lines.
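
A minimal sketch of this case follows: a script_file holding only comment lines, one of them with leading <blank> characters, should leave the input unchanged. The temporary file, the variable names, and the use of mktemp (not required by POSIX) are assumptions of the example.

```shell
# A script_file consisting only of comments should behave as an empty script.
# Assumption: mktemp(1) is available; it is not required by POSIX.
comment_script=$(mktemp)
printf '# only comments here\n   # an indented comment\n' > "$comment_script"
comment_out=$(printf 'hello\n' | sed -f "$comment_script")
rm -f "$comment_script"
printf '%s\n' "$comment_out"    # the input line, unchanged
```
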
Early proposals indicated that if -e and -f options were
intermixed, all -e options were processed before any -f options.
This has been changed to process them in the order presented
because it matches historical practice and is more intuitive.
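
The difference can be made visible with a sketch such as the following, in which the three substitutions only chain together when the options are processed in the order presented. The script file contents and variable names are assumptions of the example; mktemp is not required by POSIX.

```shell
# Options processed in order: a -> b (first -e), b -> c (-f), c -> d (second -e).
# Had all -e options run before -f, the result would have been "c" instead.
order_script=$(mktemp)
printf 's/b/c/\n' > "$order_script"
order_out=$(printf 'a\n' | sed -e 's/a/b/' -f "$order_script" -e 's/c/d/')
rm -f "$order_script"
printf '%s\n' "$order_out"    # prints: d
```
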
The treatment of the p flag to the s command differs between
System V and BSD-based systems when the default output is
suppressed. In the two examples:
    echo a | sed 's/a/A/p'
    echo a | sed -n 's/a/A/p'
this volume of POSIX.1‐2017, BSD, System V documentation, and the
SVID indicate that the first example should write two lines with
A, whereas the second should write one. Some System V systems
write the A only once in both examples because the p flag is
ignored if the -n option is not specified.
This is a case of a diametrical difference between systems that
could not be reconciled through the compromise of declaring the
behavior to be unspecified. The SVID/BSD/System V documentation
behavior was adopted for this volume of POSIX.1‐2017 because:
* No known documentation for any historic system describes the
interaction between the p flag and the -n option.
* The selected behavior is more correct as there is no
technical justification for any interaction between the p
flag and the -n option. A relationship between -n and the p
flag might imply that they are only used together, but this
ignores valid scripts that interrupt the cyclical nature of
the processing through the use of the D, d, q, or branching
commands. Such scripts rely on the p suffix to write the
pattern space because they do not make use of the default
output at the ``bottom'' of the script.
* Because the -n option makes the p flag unnecessary, any
interaction would only be useful if sed scripts were written
to run both with and without the -n option. This is believed
to be unlikely. It is even more unlikely that programmers
have coded the p flag expecting it to be unnecessary. Because
the interaction was not documented, the likelihood of a
programmer discovering the interaction and depending on it is
further decreased.
* Finally, scripts that break under the specified behavior
produce too much output instead of too little, which is
easier to diagnose and correct.
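
The kind of script described in the second item above can be sketched as follows. It runs without -n, writes the modified line through the p flag, and then uses d to end the cycle before the default output is reached; if the p flag were ignored, the modified line would be lost. The sample input and variable name are assumptions of the example.

```shell
# Without -n: lines matching /x/ are written by the p flag, then deleted so
# that the default output at the end of the cycle is never reached for them.
p_out=$(printf 'ax\nb\n' | sed '/x/{
s/x/X/p
d
}')
printf '%s\n' "$p_out"    # "aX" (from p), then "b" (default output)
```
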
The form of the substitute command that uses the n suffix was
limited to the first 512 matches in an early proposal. This limit
has been removed because there is no reason an editor processing
lines of {LINE_MAX} length should have this restriction. The
command s/a/A/2047 should be able to substitute the 2047th
occurrence of a on a line.
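
As a hedged illustration, the sketch below builds a 600-character line and substitutes the 513th occurrence, one past the limit in the early proposal. The line length and the variable names are chosen for the example.

```shell
# Build a line of 600 "a" characters, then replace only the 513th occurrence.
nth_line=$(printf '%600s' '' | tr ' ' 'a')
nth_out=$(printf '%s\n' "$nth_line" | sed 's/a/A/513')
nth_prefix=${nth_out%%A*}       # everything before the substituted character
echo "${#nth_prefix}"           # prints: 512
```
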
The b, t, and : commands are documented to ignore leading white
space, but no mention is made of trailing white space. Historical
implementations of sed assigned different locations to the labels
'x' and "x ". This is not useful, and leads to subtle
programming errors, but it is historical practice, and changing
it could theoretically break working scripts. Implementors are
encouraged to provide warning messages about labels that are
never referenced by a b or t command, jumps to labels that do not
exist, and label arguments that are subject to truncation.
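
One way for an application to avoid the trailing white space problem is to end each label, and each b or t command that names it, with a <newline> and nothing else. The following sketch, which joins all input lines with commas, is written that way; the label name a and the sample input are assumptions of the example.

```shell
# Each label and branch ends at a <newline>, so no trailing white space
# can become part of the label name.
join_out=$(printf 'x\ny\nz\n' | sed ':a
$!{
N
ba
}
s/\n/,/g')
printf '%s\n' "$join_out"    # prints: x,y,z
```
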
Earlier versions of this standard allowed for implementations
with bytes other than eight bits, but this has been modified in
this version.