dump files in various formats
Rationale
The od utility went through several names in early proposals,
including hd, xd, and most recently hexdump. There were several
objections to all of these based on the following reasons:
* The hd and xd names conflicted with historical utilities that
behaved differently.
* The hexdump description was much more complex than needed for
a simple dump utility.
* The od utility has been available on all historical
implementations and there was no need to create a new name
for a utility so similar to the historical od utility.
The original reasons for not standardizing historical od were
also fairly widespread. Those reasons are given below along with
rationale explaining why the standard developers believe that
this version does not suffer from the indicated problem:
* The BSD and System V versions of od have diverged, and the
intersection of features provided by both does not meet the
needs of the user community. In fact, the System V version
only provides a mechanism for dumping octal bytes and shorts,
signed and unsigned decimal shorts, hexadecimal shorts, and
ASCII characters. BSD added the ability to dump floats,
doubles, named ASCII characters, and octal, signed decimal,
unsigned decimal, and hexadecimal longs. The version
presented here provides more normalized forms for dumping
bytes, shorts, ints, and longs in octal, signed decimal,
unsigned decimal, and hexadecimal; float, double, and long
double; and named ASCII as well as current locale characters.
* It would not be possible to come up with a compatible
superset of the BSD and System V flags that met the
requirements of the standard developers. The historical
default od output is the specified default output of this
utility. None of the option letters chosen for this version
of od conflict with any of the options to historical versions
of od.
* On systems with different sizes for short, int, and long,
there was no way to ask for dumps of ints, even in the BSD
version. Because of the way options are named, the name space
could not be extended to solve these problems. This is why
the -t option was added (with type specifiers more closely
matched to the printf() formats used in the rest of this
volume of POSIX.1‐2017) and the optional field sizes were
added to the d, f, o, u, and x type specifiers. It is also
one of the reasons why the historical practice was not
mandated as a required obsolescent form of od. (Although the
old versions of od are not listed as an obsolescent form,
implementations are urged to continue to recognize the older
forms for several more years.) The a, c, f, o, and x types
match the meaning of the corresponding format characters in
the historical implementations of od except for the default
sizes of the fields converted. The d format is signed in this
volume of POSIX.1‐2017 to match the printf() notation.
(Historical versions of od used d as a synonym for u in this
version. The System V implementation uses s for signed
decimal; BSD uses i for signed decimal and s for
null-terminated strings.) Other than d and u, all of the type
specifiers match format characters in the historical BSD
version of od.
The sizes of the C-language types char, short, int, long,
float, double, and long double are used even though it is
recognized that there may be zero or more than one compiler
for the C language on an implementation and that they may use
different sizes for some of these types. (For example, one
compiler might use 2-byte shorts, 2-byte ints, and 4-byte
longs, while another compiler (or an option to the same
compiler) uses 2-byte shorts, 4-byte ints, and 4-byte
longs.) Nonetheless, there has to be a basic size known by
the implementation for these types, corresponding to the
values reported by invocations of the getconf utility when
called with system_var operands {UCHAR_MAX}, {USHRT_MAX},
{UINT_MAX}, and {ULONG_MAX} for the types char, short, int,
and long, respectively. There are similar constants required
by the ISO C standard, but not required by the System
Interfaces volume of POSIX.1‐2017 or this volume of
POSIX.1‐2017. They are {FLT_MANT_DIG}, {DBL_MANT_DIG}, and
{LDBL_MANT_DIG} for the types float, double, and long double,
respectively. If the optional c99 utility is provided by the
implementation and used as specified by this volume of
POSIX.1‐2017, these are the sizes that would be provided. If
an option is used that specifies different sizes for these
types, there is no guarantee that the od utility is able to
interpret binary data output by such a program correctly.
This volume of POSIX.1‐2017 requires that the numeric values
of these lengths be recognized by the od utility and that
symbolic forms also be recognized. Thus, a conforming
application can always look at an array of unsigned long data
elements using od -t uL.
* The method of specifying the format for the address field
based on specifying a starting offset in a file unnecessarily
tied the two together. The -A option now specifies the
address base and the -j option specifies a starting offset.
* It would be difficult to break the dependence on US ASCII to
achieve an internationalized utility. It does not seem to be
any harder for od to dump characters in the current locale
than it is for the ed or sed l commands. The c type specifier
does this without difficulty and is completely compatible
with the historical implementations of the c format character
when the current locale uses a superset of the
ISO/IEC 646:1991 standard as a codeset. The a type specifier
(from the BSD a format character) was left as a portable
means to dump ASCII (or more correctly ISO/IEC 646:1991
standard (IRV)) so that headers produced by pax could be
deciphered even on systems that do not use the
ISO/IEC 646:1991 standard as a subset of their base codeset.
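
The -t type specifiers and the -A address base discussed in the
list above can be tried on a few bytes. The following is a minimal
sketch, not text from the standard: dump_as is a hypothetical
helper name and the input bytes are arbitrary. The exact spacing
of the output varies between implementations.

```shell
# dump_as is a hypothetical helper, not part of od or the standard.
# $1 is a type string for -t, such as x1, o1, u1, uC, a, or c.
# -A n suppresses the address column so only the data fields appear.
dump_as() {
    od -A n -t "$1"
}

printf 'AB\n' | dump_as x1   # hexadecimal, numeric field size 1
printf 'AB\n' | dump_as o1   # octal, numeric field size 1
printf 'AB\n' | dump_as u1   # unsigned decimal, numeric field size 1
printf 'AB\n' | dump_as uC   # unsigned decimal, symbolic size C (char)
printf 'AB\n' | dump_as a    # named ASCII characters
printf 'AB\n' | dump_as c    # characters in the current locale
```

The u1 and uC invocations should print the same fields, since the
symbolic size C stands for the size of a char.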
The use of "**" as an indication of continuation of a multi-byte
character in c specifier output was chosen based on seeing an
implementation that uses this method. The continuation bytes have
to be marked in a way that is not ambiguous with another
single-byte or multi-byte character.
An early proposal used -S and -n, respectively, for the -j and -N
options eventually selected. These were changed to avoid
conflicts with historical implementations.
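
As an illustration of the -j and -N options named above, the
following sketch dumps a byte range from standard input. It
assumes an od that accepts both options on piped input;
dump_range is a hypothetical helper name, not part of od.

```shell
# dump_range is a hypothetical helper, not part of od.
# $1 = number of bytes to skip (-j), $2 = number of bytes to dump (-N).
# -A d prints the offsets in decimal; -t x1 prints one hex byte per field.
dump_range() {
    od -A d -j "$1" -N "$2" -t x1
}

# Skip the first 2 bytes of the input, then dump the next 4,
# that is, the bytes for the letters c, d, e, and f.
printf 'abcdefgh' | dump_range 2 4
```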
The original standard specified -t o2 as the default when no
output type was given. This was changed to -t oS (the length of a
short) to accommodate a supercomputer implementation that
historically used 64 bits as its default (and that defined shorts
as 64 bits). This change should not affect conforming
applications. The requirement to support lengths of 1, 2, and 4
was added at the same time to address an historical
implementation that had no two-byte data types in its C compiler.
The use of a basic integer data type is intended to allow the
implementation to choose a word size commonly used by
applications on that architecture.
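
The default described above can be checked informally. This is a
sketch under the assumption that the local od follows the
specified default; the input bytes are arbitrary.

```shell
# With no -t option, the specified default output type is -t oS,
# so these two commands are expected to print the same dump.
printf 'abcd' | od
printf 'abcd' | od -t oS

# The required numeric lengths 1, 2, and 4, here in hexadecimal.
# The digits shown for x2 and x4 depend on the machine's byte order.
printf 'abcd' | od -A n -t x1
printf 'abcd' | od -A n -t x2
printf 'abcd' | od -A n -t x4
```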
Earlier versions of this standard allowed for implementations
with bytes other than eight bits, but this has been modified in
this version.