Linux Manual Guide


   sort.1p    ( 1 )

sort, merge, or sequence check text files

Rationale

Examples in some historical documentation state that options -um with one input file keep the first in each set of lines with equal keys. This behavior was deemed to be an implementation artifact and was not standardized.

The -z option was omitted; it is not standard practice on most systems and is inconsistent with using sort to sort several files individually and then merge them together. The text concerning -z in historical documentation appeared to require implementations to determine the proper buffer length during the sort phase of operation, but not during the merge.

The -y option was omitted because of non-portability. The -M option, present in System V, was omitted because of non-portability in international usage.

An undocumented -T option exists in some implementations. It is used to specify a directory for intermediate files. Implementations are encouraged to support the use of the TMPDIR environment variable instead of adding an option to support this functionality.
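As a minimal sketch of the TMPDIR approach, the following points sort at a scratch directory through the environment rather than through a -T option. The directory and file names are hypothetical, mktemp -d is assumed to be available (it is not part of POSIX), and whether a given implementation actually honors TMPDIR is left to that implementation.

```shell
# Hypothetical scratch directory for any intermediate files sort may create.
scratch=$(mktemp -d)
printf '%s\n' pear apple fig > "$scratch/input.txt"

# TMPDIR is set only for this one invocation of sort.
sorted=$(TMPDIR="$scratch" LC_ALL=C sort "$scratch/input.txt")
printf '%s\n' "$sorted"

# Remove the scratch directory once the result has been captured.
rm -rf "$scratch"
```

Setting the variable on the command line keeps the choice local to one invocation and leaves the rest of the session untouched.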

The -k option was added to satisfy two objections. First, the zero-based counting used by sort is not consistent with other utility conventions. Second, it did not meet syntax guideline requirements.
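The one-based counting of -k can be illustrated with a small sketch. The sample data is made up; the obsolescent zero-based spelling of the same key would have been +1 -2.

```shell
# Three blank-separated fields per line; names and values are invented.
data='carol 30 nyc
alice 25 sfo
bob 35 lax'

# -k 2,2 restricts the key to the second field (fields are counted from 1);
# the n modifier compares that field numerically.
by_age=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2n)
printf '%s\n' "$by_age"
```

Giving both a start and an end field (2,2) keeps the key from running on to the end of the line.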

Historical documentation indicates that "setting -n implies -b". The description of -n already states that optional leading <blank>s are tolerated in doing the comparison. If -b is enabled, rather than implied, by -n, this has unusual side-effects. When a character offset is used in a column of numbers (for example, to sort modulo 100), that offset is measured relative to the most significant digit, not to the column. Based upon a recommendation from the author of the original sort utility, the -b implication has been omitted from this volume of POSIX.1‐2017, and an application wishing to achieve the previously mentioned side-effects has to code the -b flag explicitly.
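The "sort modulo 100" case can be sketched as follows, assuming an implementation in which -n does not imply -b, as this volume specifies. The data and the five-character column width are invented for illustration; on an implementation where -n does imply -b the offsets would be counted from the first digit and the result would differ.

```shell
# Right-aligned numbers in a five-character column (made-up data).
column='  105
   12
 1203'

# Without the b modifier, character offsets 4-5 are counted from the start
# of the field, leading blanks included, so the key is the last two digits.
mod100=$(printf '%s\n' "$column" | LC_ALL=C sort -k 1.4,1.5n)
printf '%s\n' "$mod100"
```

Here the keys are 05, 12, and 03, so the lines are ordered by their value modulo 100 rather than by the full number.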

Earlier versions of this standard allowed the -o option to appear after operands. Historical practice allowed all options to be interspersed with operands. This version of the standard allows implementations to accept options after operands but conforming applications should not use this form.
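A conforming invocation places -o and its argument before the file operands, as in this small sketch. The file names are hypothetical and mktemp -d is assumed to be available.

```shell
# Hypothetical working directory and input file.
workdir=$(mktemp -d)
printf '%s\n' b c a > "$workdir/in.txt"

# Portable form: the -o option and its argument precede the operand.
LC_ALL=C sort -o "$workdir/out.txt" "$workdir/in.txt"
result=$(cat "$workdir/out.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

An implementation may also accept the form with -o after the operand, but an application that relies on it is not portable.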

Earlier versions of this standard also allowed the -number and +number options. These options are no longer specified by POSIX.1‐2008 but may be present in some implementations.

Historical implementations produced a message on standard error when -c was specified and disorder was detected, and when -c and -u were specified and a duplicate key was detected. An earlier version of this standard contained wording that did not make it clear that this message was allowed, and some implementations removed the message to be sure that they conformed to the standard's requirements. Confronted with this difference in behavior, interactive users who wanted visual feedback instead of just exit code 1 could have used a command like:

sort -c file || echo disorder

whether or not the sort utility provided a message in this case. However, it was not easy for a user to find where the disorder or duplicate key occurred on implementations that do not produce a message, especially when some parts of the input line were not part of the key and when one or more of the -b, -d, -f, -i, -n, or -r options or keydef type modifiers were in use. POSIX.1‐2008 requires a message to be produced in this case. POSIX.1‐2008 also contains the -C option giving users the ability to choose either behavior.
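The difference between -c and -C can be sketched as below. The input is made up, and the wording of the diagnostic is implementation-defined, so it is discarded rather than shown.

```shell
# Deliberately unsorted sample input.
unsorted=$(printf '%s\n' b a c)

# -c: exit status 1 on disorder, plus a diagnostic on standard error.
status_c=0
printf '%s\n' "$unsorted" | LC_ALL=C sort -c 2>/dev/null || status_c=$?

# -C: the same exit status, but no diagnostic is written.
status_C=0
message_C=$(printf '%s\n' "$unsorted" | LC_ALL=C sort -C 2>&1) || status_C=$?

echo "-c exit: $status_c; -C exit: $status_C; -C message: '$message_C'"
```

A script that only needs a yes or no answer can use -C, while an interactive user looking for the offending line would use -c.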

When a disorder or duplicate is found when the -c option is specified, some implementations print a message containing the first line that is out of order or contains a duplicate key; others print a message specifying the line number of the offending line. This standard allows either type of message.

Implementations are encouraged to perform the recommended further byte-by-byte comparison of lines that collate equally, even though this may affect efficiency. The impact on efficiency can be mitigated by only performing the additional comparison if the current locale's collating sequence does not have a total ordering of all characters (if the implementation provides a way to query this) or by only performing the additional comparison if the locale name associated with the LC_COLLATE category has an '@' modifier in the name (since locales without an '@' modifier should have a total ordering of all characters — see the Base Definitions volume of POSIX.1‐2017, Section 7.3.2, LC_COLLATE). Note that if the implementation provides a stable sort option as an extension (usually -s), the additional comparison should not be performed when this option has been specified.
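The interaction with a stable sort option can be sketched with a related, easier to reproduce case: lines whose restricted key compares equal. The example assumes an implementation that offers -s as an extension (it is not specified by this standard) and that orders tied lines by comparing the whole line when -s is absent. The data is invented.

```shell
# Two pairs of lines that tie on the first field.
pairs='b 2
a 2
b 1
a 1'

# Default: ties on the key are broken by comparing the whole line.
default_order=$(printf '%s\n' "$pairs" | LC_ALL=C sort -k 1,1)

# -s (extension): no further comparison, so tied lines keep their input order.
stable_order=$(printf '%s\n' "$pairs" | LC_ALL=C sort -s -k 1,1)

printf 'default:\n%s\nstable:\n%s\n' "$default_order" "$stable_order"
```

With -s the lines "a 2" and "a 1" stay in their original relative order, which is why any additional comparison has to be skipped when that option is given.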