   diff.1p    ( 1 )

compare two files

Rationale

The -h option was omitted because it was insufficiently specified and does not add to application portability.

Historical implementations employ algorithms that do not always produce a minimum list of differences; the current language about making every effort is the best this volume of POSIX.1‐2017 can do, as there is no metric that could be employed to judge the quality of implementations against any and all file contents. The statement "This list should be minimal" clearly implies that implementations are not expected to provide the following output when comparing two 100-line files that differ in only one character on a single line:

    1,100c1,100
    all 100 lines from file1 preceded with "< "
    ---
    all 100 lines from file2 preceded with "> "
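As a rough illustration of what a small list of differences looks like for such an input, the sketch below renders the default diff output format from the opcodes of Python's difflib.SequenceMatcher. The helper names (line_range, normal_diff) are hypothetical, and SequenceMatcher is only a stand-in for whatever algorithm a real diff uses; it is not guaranteed to produce a minimal list for every input.

```python
import difflib


def line_range(start, stop):
    """Format a 0-based half-open range as the 1-based "n" or "n,m" used by diff."""
    if stop - start == 1:
        return str(start + 1)
    return f"{start + 1},{stop}"


def normal_diff(lines1, lines2):
    """Return default-format diff output lines (a sketch, not a full diff)."""
    out = []
    matcher = difflib.SequenceMatcher(None, lines1, lines2, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace":
            out.append(f"{line_range(i1, i2)}c{line_range(j1, j2)}")
        elif tag == "delete":
            # j1 is the number of file2 lines that precede the deletion.
            out.append(f"{line_range(i1, i2)}d{j1}")
        else:  # insert: i1 is the file1 line after which text is added
            out.append(f"{i1}a{line_range(j1, j2)}")
        out.extend(f"< {text}" for text in lines1[i1:i2])
        if tag == "replace":
            out.append("---")
        out.extend(f"> {text}" for text in lines2[j1:j2])
    return out


# Two 100-line files that differ in one character on line 50.
file1 = [f"line {n}" for n in range(1, 101)]
file2 = list(file1)
file2[49] = "line 5O"
print("\n".join(normal_diff(file1, file2)))
```

For this input the sketch reports a single change command for line 50 rather than replacing all 100 lines.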

The "Only in" messages required when the -r option is specified are not used by most historical implementations if the -e option is also specified. They are required here because they provide useful information that must be provided to update a target directory hierarchy to match a source hierarchy. The "Common subdirectories" messages are written by System V and 4.3 BSD when the -r option is specified. They are allowed here but are not required because they are reporting on something that is the same, not reporting a difference, and are not needed to update a target hierarchy.

The -c option, which writes output in a format using lines of context, has been included. The format is useful for a variety of reasons, among them being much improved readability and the ability to interpret the changes when the target file has line numbers that differ from those of a similar, but slightly different, copy. The patch utility is most valuable when working with difference listings using a context format.

The BSD version of -c takes an optional argument specifying the amount of context. Rather than overloading -c and breaking the Utility Syntax Guidelines for diff, the standard developers decided to add a separate option for specifying a context diff with a specified amount of context (-C).

Also, the format for context diffs was extended slightly in 4.3 BSD to allow multiple changes that are within context lines from each other to be merged together. The output format contains an additional four <asterisk> characters after the range of affected lines in the first file. This was to provide a flag for old programs (like old versions of patch) that only understand the old context format. The version of context described here does not require that multiple changes within context lines be merged, but it does not prohibit it either. The extension is upwards-compatible, so any vendors that wish to retain the old version of diff can do so by adding the extra four <asterisk> characters (that is, utilities that currently use diff and understand the new merged format will also understand the old unmerged format, but not vice versa).
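The effect of the amount of context on merging can be sketched with Python's difflib.context_diff, used here only as a convenient stand-in for diff -C n. The helper name count_context_hunks is hypothetical. difflib happens to merge changes whose context lines overlap, which the text above allows but does not require.

```python
import difflib


def count_context_hunks(old, new, context):
    """Count hunks in a context diff; each hunk starts with a row of asterisks."""
    diff = difflib.context_diff(old, new, "file1", "file2", n=context)
    return sum(1 for line in diff if line.startswith("***************"))


old = [f"line {n}\n" for n in range(1, 21)]

# Two changes that are close together (lines 5 and 8).
near = list(old)
near[4] = "changed 5\n"
near[7] = "changed 8\n"

# Two changes that are far apart (lines 3 and 17).
far = list(old)
far[2] = "changed 3\n"
far[16] = "changed 17\n"

print(count_context_hunks(old, near, 3))  # nearby changes share one hunk
print(count_context_hunks(old, far, 3))   # distant changes stay separate
print(count_context_hunks(old, far, 7))   # enough context merges them again
```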

The -u and -U options of GNU diff have been included. Their output format, designed by Wayne Davison, takes up less space than the -c and -C format, and in many cases is easier to read. The format's timestamps do not vary by locale, so LC_TIME does not affect it. The format's line numbers are rendered with the %1d format, not %d, because under the file format notation rules %d would allow extra <blank> characters to appear around the numbers.
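The space saving can be seen by rendering the same one-line change in both formats. The sketch below uses Python's difflib as a stand-in for diff -u and diff -c; the helper name both_formats and the labels file1 and file2 are placeholders, and no timestamps are passed, so the header lines carry file names only.

```python
import difflib


def both_formats(old, new):
    """Return (unified, context) renderings of the same change as lists of lines."""
    unified = list(difflib.unified_diff(old, new, "file1", "file2"))
    context = list(difflib.context_diff(old, new, "file1", "file2"))
    return unified, context


old = ["a\n", "b\n", "c\n"]
new = ["a\n", "B\n", "c\n"]
unified, context = both_formats(old, new)

print("".join(unified))
print("".join(context))
print(len(unified), "lines in unified format,", len(context), "in context format")
```

The unified form lists the shared context lines once, whereas the context form repeats them for each file, which is where most of the difference in size comes from.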

The substitute command was added as an additional format for the -e option. This was added to provide implementations with a way to fix the classic "dot alone on a line" bug present in many versions of diff. Since many implementations have fixed this bug, the standard developers decided not to standardize broken behavior, but rather to provide the necessary tool for fixing the bug. One way to fix this bug is to output two periods whenever a lone period is needed, then terminate the append command with a period, and then use the substitute command to convert the two periods into one period.
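The workaround described above can be sketched as a small generator of ed append commands. This is a minimal illustration, not the output of any particular diff: the function name ed_append is hypothetical, and the exact substitute expression is an assumption (any ed substitution that turns the two periods on the current line into one would do).

```python
def ed_append(after_line, lines):
    """Build an ed script that appends `lines` after `after_line`.

    A line consisting of a lone period would end ed's input mode early, so it
    is written as two periods, the append is terminated, and a substitute
    command (assumed form) turns the two periods back into one.
    """
    script = [f"{after_line}a"]
    in_input_mode = True
    for text in lines:
        if not in_input_mode:
            # Resume appending after the line that was just fixed up.
            script.append("a")
            in_input_mode = True
        if text == ".":
            script.extend(["..", ".", r"s/^\.\.$/./"])
            in_input_mode = False
        else:
            script.append(text)
    if in_input_mode:
        script.append(".")
    return script


print("\n".join(ed_append(3, ["foo", ".", "bar"])))
```

After the substitute command the current line in ed is the repaired line, so the bare a that follows continues appending in the right place.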

The BSD-derived -r option was added to provide a mechanism for using diff to compare two file system trees. This behavior is useful, is standard practice on all BSD-derived systems, and is not easily reproducible with the find utility.

The requirement that diff not compare files in some circumstances, even though they have the same name, is based on the actual output of historical implementations. The specified behavior precludes the problems arising from running into FIFOs and other files that would cause diff to hang waiting for input with no indication to the user that diff was hung. An earlier version of this standard specified the output format more precisely, but in practice this requirement was widely ignored and the benefit of standardization seemed small, so it is now unspecified. In most common usage, diff -r should indicate differences in the file hierarchies, not the difference of contents of devices pointed to by the hierarchies.
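A recursive comparison along these lines might look like the sketch below. It assumes a hypothetical helper, compare_trees, that reports names present in only one tree, recurses into common subdirectories, compares contents only when both entries are regular files, and never opens FIFOs or devices. The "Only in" wording follows the message discussed above; the other two messages are illustrative only.

```python
import filecmp
import os
import stat


def compare_trees(dir1, dir2):
    """Return a list of difference messages for two directory trees (a sketch)."""
    messages = []
    names1, names2 = set(os.listdir(dir1)), set(os.listdir(dir2))
    for name in sorted(names1 | names2):
        path1, path2 = os.path.join(dir1, name), os.path.join(dir2, name)
        if name not in names2:
            messages.append(f"Only in {dir1}: {name}")
        elif name not in names1:
            messages.append(f"Only in {dir2}: {name}")
        else:
            # lstat: inspect the entry itself without opening or following it.
            mode1, mode2 = os.lstat(path1).st_mode, os.lstat(path2).st_mode
            if stat.S_ISDIR(mode1) and stat.S_ISDIR(mode2):
                messages.extend(compare_trees(path1, path2))
            elif stat.S_ISREG(mode1) and stat.S_ISREG(mode2):
                if not filecmp.cmp(path1, path2, shallow=False):
                    messages.append(f"Files {path1} and {path2} differ")
            else:
                # FIFOs, devices and the like are reported but never read,
                # so the comparison cannot hang waiting for input.
                messages.append(f"Not compared: {path1} and {path2}")
    return messages
```

Because the file type is checked before any file is opened, a FIFO with the same name in both trees is reported and skipped instead of being read.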

Many early implementations of diff require seekable files. Since the System Interfaces volume of POSIX.1‐2017 supports named pipes, the standard developers decided that such a restriction was unreasonable. Note also that the allowed filename - almost always refers to a pipe.
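One way to avoid any dependence on seeking is to read each operand sequentially, exactly once, into memory before comparing. The sketch below assumes a hypothetical helper, read_operand, that treats the name - as standard input; the stdin parameter exists only so the example can be exercised without a real pipe.

```python
import sys


def read_operand(name, stdin=None):
    """Return all lines of an operand, reading it once from start to end.

    No seek is ever issued, so the operand may be a pipe or a FIFO.
    The name "-" stands for standard input.
    """
    if name == "-":
        stream = stdin if stdin is not None else sys.stdin
        return stream.readlines()
    with open(name, "r") as handle:
        return handle.readlines()
```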

No directory search order is specified for diff. The historical ordering is, in fact, not optimal, in that it prints out all of the differences at the current level, including the statements about all common subdirectories before recursing into those subdirectories.

The message:

"diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

does not vary by locale because it is the representation of a command, not an English sentence.