compare two files
Standard Output (STDOUT)
Diff Directory Comparison Format
If both file1 and file2 are directories, the following output
formats shall be used.
In the POSIX locale, each file that is present in only one
directory shall be reported using the following format:
"Only in %s: %s\n", <directory pathname>, <filename>
In the POSIX locale, subdirectories that are common to the two
directories may be reported with the following format:
"Common subdirectories: %s and %s\n", <directory1 pathname>,
<directory2 pathname>
For each file common to the two directories, if the two files are
not to be compared: if the two files have the same device ID and
file serial number, or are both block special files that refer to
the same device, or are both character special files that refer
to the same device, in the POSIX locale the output format is
unspecified. Otherwise, in the POSIX locale an unspecified
format shall be used that contains the pathnames of the two
files.
For each file common to the two directories, if the files are
compared and are identical, no output shall be written. If the
two files differ, the following format is written:
"diff %s %s %s\n", <diff_options>, <filename1>, <filename2>
where <diff_options> are the options as specified on the command
line.
All directory pathnames listed in this section shall be relative
to the original command line arguments. All other names of files
listed in this section shall be filenames (pathname components).
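As an illustration of the "Only in" format above, the following is a minimal Python sketch, not part of the specification. The function name only_in_lines and its arguments are hypothetical, and the sorted output order is an assumption made here for determinism; this section does not state an order.

```python
def only_in_lines(dir1, names1, dir2, names2):
    """Build "Only in" report lines for names present in just one directory.

    dir1/dir2 are directory pathnames as given on the command line;
    names1/names2 are the filenames (pathname components) found in each.
    """
    set1, set2 = set(names1), set(names2)
    lines = []
    # Symmetric difference: names that appear in exactly one directory.
    # Sorting is an assumption of this sketch, not a requirement of the text.
    for name in sorted(set1 ^ set2):
        directory = dir1 if name in set1 else dir2
        lines.append("Only in %s: %s\n" % (directory, name))
    return lines


if __name__ == "__main__":
    report = only_in_lines("a", ["x.txt", "common"], "b", ["common", "y.txt"])
    print("".join(report), end="")
    # Only in a: x.txt
    # Only in b: y.txt
```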
Diff Binary Output Format
In the POSIX locale, if one or both of the files being compared
are not text files, it is implementation-defined whether diff
uses the binary file output format or the other formats as
specified below. The binary file output format shall contain the
pathnames of two files being compared and the string "differ".
If both files being compared are text files, depending on the
options specified, one of the following formats shall be used to
write the differences.
Diff Default Output Format
The default (without -e, -f, -c, -C, -u, or -U options) diff
utility output shall contain lines of these forms:
"%da%d\n", <num1>, <num2>
"%da%d,%d\n", <num1>, <num2>, <num3>
"%dd%d\n", <num1>, <num2>
"%d,%dd%d\n", <num1>, <num2>, <num3>
"%dc%d\n", <num1>, <num2>
"%d,%dc%d\n", <num1>, <num2>, <num3>
"%dc%d,%d\n", <num1>, <num2>, <num3>
"%d,%dc%d,%d\n", <num1>, <num2>, <num3>, <num4>
These lines resemble ed subcommands to convert file1 into file2.
The line numbers before the action letters shall pertain to
file1; those after shall pertain to file2. Thus, by exchanging a
for d and reading the line in reverse order, one can also
determine how to convert file2 into file1. As in ed, identical
pairs (where num1 = num2) are abbreviated as a single number.
Following each of these lines, diff shall write to standard
output all lines affected in the first file using the format:
"< %s", <line>
and all lines affected in the second file using the format:
"> %s", <line>
If there are lines affected in both file1 and file2 (as with the
c subcommand), the changes are separated with a line consisting
of three <hyphen-minus> characters:
"---\n"
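The change-command lines of the default format can be read mechanically. The Python sketch below is one possible way to do it, under stated assumptions: parse_command and reverse_command are hypothetical names, and the regular expression is slightly more permissive than the eight forms listed above (for example, it would also accept a range before a). It also shows the reversal described earlier: exchange a for d and swap the two sides.

```python
import re

# <start>[,<end>] <a|c|d> <start>[,<end>]
_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_command(line):
    """Return ((first1, last1), action, (first2, last2)) for a command line."""
    m = _CMD.match(line.rstrip("\n"))
    if m is None:
        raise ValueError("not a change command: %r" % line)
    start1, end1, action, start2, end2 = m.groups()
    # A single number stands for an identical pair, as in ed.
    range1 = (int(start1), int(end1 or start1))
    range2 = (int(start2), int(end2 or start2))
    return range1, action, range2


def _fmt(line_range):
    """Abbreviate an identical pair to a single number."""
    first, last = line_range
    return str(first) if first == last else "%d,%d" % (first, last)


def reverse_command(line):
    """Rewrite a file1-to-file2 command as the file2-to-file1 command."""
    range1, action, range2 = parse_command(line)
    swapped = {"a": "d", "d": "a", "c": "c"}[action]
    return _fmt(range2) + swapped + _fmt(range1)


if __name__ == "__main__":
    print(parse_command("5,7d3"))    # ((5, 7), 'd', (3, 3))
    print(reverse_command("3a4,5"))  # 4,5d3
```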
Diff -e Output Format
With the -e option, a script shall be produced that shall, when
provided as input to ed, along with an appended w (write)
command, convert file1 into file2. Only the a (append), c
(change), d (delete), i (insert), and s (substitute) commands of
ed shall be used in this script. Text lines, except those
consisting of the single character <period> ('.'), shall be
output as they appear in the file.
Diff -f Output Format
With the -f option, an alternative format of script shall be
produced. It is similar to that produced by -e, with the
following differences:
1. It is expressed in reverse sequence; the output of -e orders
changes from the end of the file to the beginning; the -f
from beginning to end.
2. The command form <lines> <command-letter> used by -e is
reversed. For example, 10c with -e would be c10 with -f.
3. The form used for ranges of line numbers is
<space>-separated, rather than <comma>-separated.
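Differences 2 and 3 can be sketched as a small rewrite of a single command line. This Python example is illustrative only: ed_to_forward is a hypothetical name, it handles only the a, c, and d letters, and it does not reorder the script (difference 1) or touch the text lines that follow a command.

```python
import re

# -e style command line: <first>[,<last>]<letter>
_ED_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def ed_to_forward(command):
    """Rewrite one -e style command line in the -f style."""
    m = _ED_CMD.match(command)
    if m is None:
        raise ValueError("unsupported command: %r" % command)
    first, last, letter = m.groups()
    # Ranges are <space>-separated in the -f form.
    lines = first if last is None else "%s %s" % (first, last)
    # The command letter moves in front of the line numbers.
    return letter + lines


if __name__ == "__main__":
    print(ed_to_forward("10c"))   # c10
    print(ed_to_forward("3,5d"))  # d3 5
```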
Diff -c or -C Output Format
With the -c or -C option, the output format shall consist of
affected lines along with surrounding lines of context. The
affected lines shall show which ones need to be deleted or
changed in file1, and those added from file2. With the -c
option, three lines of context, if available, shall be written
before and after the affected lines. With the -C option, the user
can specify how many lines of context are written. The exact
format follows.
The name and last modification time of each file shall be output
in the following format:
"*** %s %s\n", file1, <file1 timestamp>
"--- %s %s\n", file2, <file2 timestamp>
Each <file> field shall be the pathname of the corresponding file
being compared. The pathname written for standard input is
unspecified.
In the POSIX locale, each <timestamp> field shall be equivalent
to the output from the following command:
date "+%a %b %e %T %Y"
without the trailing <newline>, executed at the time of last
modification of the corresponding file (or the current time, if
the file is standard input).
Then, the following output formats shall be applied for every set
of changes.
First, a line shall be written in the following format:
"***************\n"
Next, the range of lines in file1 shall be written in the
following format if the range contains two or more lines:
"*** %d,%d ****\n", <beginning line number>, <ending line number>
and the following format otherwise:
"*** %d ****\n", <ending line number>
The ending line number of an empty range shall be the number of
the preceding line, or 0 if the range is at the start of the
file.
Next, the affected lines along with lines of context (unaffected
lines) shall be written. Unaffected lines shall be written in the
following format:
"  %s", <unaffected_line>
Deleted lines shall be written as:
"- %s", <deleted_line>
Changed lines shall be written as:
"! %s", <changed_line>
Next, the range of lines in file2 shall be written in the
following format if the range contains two or more lines:
"--- %d,%d ----\n", <beginning line number>, <ending line number>
and the following format otherwise:
"--- %d ----\n", <ending line number>
Then, lines of context and changed lines shall be written as
described in the previous formats. Lines added from file2 shall
be written in the following format:
"+ %s", <added_line>
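The two range-header formats of the context format can be sketched in Python as follows. The name context_range_line and its parameters are hypothetical. For an empty range, this sketch assumes the caller already passes the number of the preceding line (or 0 at the start of the file) as begin.

```python
def context_range_line(marker, begin, count):
    """Format a context-diff range header.

    marker is "*" for the file1 range and "-" for the file2 range.
    begin is the first line of the range; for an empty range (count == 0)
    it is assumed to be the preceding line number, or 0 at the start.
    """
    if count >= 2:
        # Two or more lines: beginning and ending line numbers.
        body = "%d,%d" % (begin, begin + count - 1)
    else:
        # One line or an empty range: only the ending line number.
        body = "%d" % begin
    return "%s %s %s\n" % (marker * 3, body, marker * 4)


if __name__ == "__main__":
    print(context_range_line("*", 3, 3), end="")  # *** 3,5 ****
    print(context_range_line("-", 7, 1), end="")  # --- 7 ----
    print(context_range_line("*", 0, 0), end="")  # *** 0 ****
```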
Diff -u or -U Output Format
The -u or -U options behave like the -c or -C options, except
that the context lines are not repeated; instead, the context,
deleted, and added lines are shown together, interleaved. The
exact format follows.
The name and last modification time of each file shall be output
in the following format:
"--- %s\t%s%s %s\n", file1, <file1 timestamp>, <file1 frac>, <file1 zone>
"+++ %s\t%s%s %s\n", file2, <file2 timestamp>, <file2 frac>, <file2 zone>
Each <file> field shall be the pathname of the corresponding file
being compared, or the single character '-' if standard input is
being compared. However, if the pathname contains a <tab> or a
<newline>, or if it does not consist entirely of characters taken
from the portable character set, the behavior is
implementation-defined.
Each <timestamp> field shall be equivalent to the output from the
following command:
date '+%Y-%m-%d %H:%M:%S'
without the trailing <newline>, executed at the time of last
modification of the corresponding file (or the current time, if
the file is standard input).
Each <frac> field shall be either empty, or a decimal point
followed by at least one decimal digit, indicating the
fractional-seconds part (if any) of the file timestamp. The
number of fractional digits shall be at least the number needed
to represent the file's timestamp without loss of information.
Each <zone> field shall be of the form "shhmm", where "shh" is a
signed two-digit decimal number in the range -24 through +25, and
"mm" is an unsigned two-digit decimal number in the range 00
through 59. It represents the timezone of the timestamp as the
number of hours (hh) and minutes (mm) east (+) or west (-) of UTC
for the timestamp. If the hours and minutes are both zero, the
sign shall be '+'. However, if the timezone is not an integral
number of minutes away from UTC, the <zone> field is
implementation-defined.
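The <zone> rules above can be illustrated with a short Python sketch. The function name format_zone is hypothetical, and the input is assumed to be a whole number of minutes east of UTC (negative for west); offsets that are not an integral number of minutes are outside its scope.

```python
def format_zone(offset_minutes):
    """Format a UTC offset, given in whole minutes east of UTC, as "shhmm"."""
    # A zero offset takes the '+' sign, as do all offsets east of UTC.
    sign = "-" if offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return "%s%02d%02d" % (sign, hours, minutes)


if __name__ == "__main__":
    print(format_zone(330))   # +0530
    print(format_zone(-480))  # -0800
    print(format_zone(0))     # +0000
```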
Then, the following output formats shall be applied for every set
of changes.
First, the range of lines in each file shall be written in the
following format:
"@@ -%s +%s @@", <file1 range>, <file2 range>
Each <range> field shall be of the form:
"%1d", <beginning line number>
or:
"%1d,1", <beginning line number>
if the range contains exactly one line, and:
"%1d,%1d", <beginning line number>, <number of lines>
otherwise. If a range is empty, its beginning line number shall
be the number of the line just before the range, or 0 if the
empty range starts the file.
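The range rules above translate directly into code. The following Python sketch uses the hypothetical names unified_range and hunk_header; for a one-line range it emits the shorter "%1d" form, although the "%1d,1" form is equally permitted by the text. For an empty range, begin is assumed to be the number of the line just before the range (0 at the start of the file).

```python
def unified_range(begin, count):
    """Format one <range> field of a unified-diff hunk header."""
    if count == 1:
        # Exactly one line: the count may be omitted.
        return "%d" % begin
    # Empty ranges and ranges of two or more lines carry an explicit count.
    return "%d,%d" % (begin, count)


def hunk_header(begin1, count1, begin2, count2):
    """Format the "@@ -<file1 range> +<file2 range> @@" line."""
    return "@@ -%s +%s @@" % (unified_range(begin1, count1),
                              unified_range(begin2, count2))


if __name__ == "__main__":
    print(hunk_header(3, 1, 3, 2))  # @@ -3 +3,2 @@
    print(hunk_header(0, 0, 1, 4))  # @@ -0,0 +1,4 @@
```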
Next, the affected lines along with lines of context shall be
written. Each non-empty unaffected line shall be written in the
following format:
" %s", <unaffected_line>
where the contents of the unaffected line shall be taken from
file1. It is implementation-defined whether an empty unaffected
line is written as an empty line or a line containing a single
<space> character. This line also represents the same line of
file2, even though file2's line may contain different contents
due to the -b option. Deleted lines shall be written as:
"-%s", <deleted_line>
Added lines shall be written as:
"+%s", <added_line>
The order of lines written shall be the same as that of the
corresponding file. A deleted line shall never be written
immediately after an added line.
If -U n is specified, the output shall contain no more than 2n
consecutive unaffected lines; and if the output contains an
affected line and this line is adjacent to up to n consecutive
unaffected lines in the corresponding file, the output shall
contain these unaffected lines. -u shall act like -U 3.
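One way to satisfy the context rule above is to give every affected line n lines of context on each side and merge the resulting spans whenever they overlap or touch. The Python sketch below shows that idea under simplifying assumptions: hunk_ranges is a hypothetical name, it looks at line numbers in a single file only, and merging spans that merely touch is a choice of this sketch rather than something the text requires.

```python
def hunk_ranges(affected, total_lines, n=3):
    """Group affected line numbers (1-based) into output spans with context.

    Returns a list of (first, last) inclusive line ranges. Two affected
    lines end up in the same span when at most 2n unaffected lines
    separate them, so no span shows more than 2n consecutive unaffected
    lines between affected ones.
    """
    ranges = []
    for line in sorted(affected):
        # Up to n lines of context on each side, clipped to the file.
        first = max(1, line - n)
        last = min(total_lines, line + n)
        if ranges and first <= ranges[-1][1] + 1:
            # Overlaps or touches the previous span: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], last))
        else:
            ranges.append((first, last))
    return ranges


if __name__ == "__main__":
    print(hunk_ranges([2, 10, 30], 40))  # [(1, 5), (7, 13), (27, 33)]
    print(hunk_ranges([2, 9], 40))       # [(1, 12)]
```

In the second call, lines 2 and 9 are separated by six unaffected lines (exactly 2n with n = 3), so they share one span.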