The uniq command searches a file for sequences of identical adjacent lines and writes the file to standard output, keeping only one line from each such sequence. Repeated lines are detected only if they are adjacent, so the input file should normally be sorted first.
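The adjacency requirement can be illustrated with a minimal sketch. The helper name sample and its contents are made up for this illustration.

```shell
# Hypothetical sample input: "apple" occurs twice, but not on adjacent lines.
sample() { printf 'apple\nbanana\napple\n'; }

# Without sorting, uniq finds no adjacent duplicates and prints all three lines.
sample | uniq

# After sorting, the two "apple" lines are adjacent, so only "apple" and "banana" remain.
sample | sort | uniq
```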
Syntax
Format 1: uniq[ -c| -d| -u][ -n][ +m][ input_file[ output_file]]
Format 2: uniq[ -c| -d| -u][ -f field][ -s char][ input_file[ output_file]]
The two formats are described together, since option -n in format 1 is equivalent to option -f field in format 2, and option +m in format 1 is equivalent to option -s char in format 2.
No option specified
The named input_file is output without repeated lines.
-c
Outputs all lines without repetitions, prefixing each line with a decimal number that indicates how many times it occurred consecutively in input_file. uniq ignores the -u and -d options if they are set together with the -c option.
-d
Outputs one copy of each line that is repeated in input_file, and nothing else.
-u
Outputs only the lines that are not repeated in input_file.
-n
Ignores the first n fields from the beginning of the line, plus any tabs or blanks located in front of a field, when comparing for duplicates. A field is a string of non-blank characters separated from its neighbors by tabs or blanks.
-n not specified: no fields are ignored.
Option -n is equivalent to option -f field.
+m
Causes the first m characters from the beginning of the line to be ignored when comparing for duplicates. If the +m option is combined with the -n option, the first m characters after the nth field are ignored. Blanks following the nth field are not ignored: they must be allowed for in the value of m.
+m not specified: no characters are ignored.
Option +m is equivalent to option -s char.
input_file
Name of the file that is to be examined.
input_file not specified: uniq reads from standard input.
output_file
Name of the file to which the output is to be written.
output_file not specified: uniq writes to standard output.
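The effect of the individual options can be sketched as follows. The helper names fruits and log and their contents are invented for this illustration, and the format 2 spellings -f and -s are used in place of -n and +m.

```shell
# Hypothetical sample input that is already sorted.
fruits() { printf 'apple\napple\nbanana\ncherry\ncherry\ncherry\n'; }

fruits | uniq      # no option: apple, banana, cherry
fruits | uniq -c   # counts per line: 2 apple, 1 banana, 3 cherry
fruits | uniq -d   # repeated lines only: apple, cherry
fruits | uniq -u   # non-repeated lines only: banana

# Hypothetical log lines: field 1 is a timestamp, field 2 is the message.
log() { printf '09:00 start\n09:05 start\n09:10 stop\n'; }

# Skip the first field when comparing: "09:00 start" and "09:10 stop" remain.
log | uniq -f 1

# Skip the first 6 characters (timestamp plus blank): same result here.
log | uniq -s 6
```

The exact width of the count column printed by -c differs between implementations, so scripts that parse it should not rely on a fixed number of leading blanks.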
Locale
The following environment variables affect the execution of uniq:
LANG
Provide a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-specific default locale will be used. If any of the internationalization variables contains an invalid setting, the utility will behave as if none of the variables had been defined.
LC_ALL
If set to a non-empty string value, override the values of all the other internationalization variables.
LC_CTYPE
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments and input files), the classification of characters as upper- or lower-case, and the mapping of characters from one case to the other.
LC_MESSAGES
Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH
Determine the location of message catalogs for the processing of LC_MESSAGES.
Example 1
You want to search a file for identical lines, regardless of where they are located in the file. A count showing how often each of these lines occurs is also to be output.
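A minimal sketch of one way to do this: sort brings identical lines together, and uniq -c then counts each group. The helper name count_lines and the sample data are assumptions made for this illustration.

```shell
# Hypothetical helper: count how often each distinct line occurs in the given file.
count_lines() {
    sort "$1" | uniq -c
}

# Illustrative sample data in a temporary file.
sample=$(mktemp)
printf 'pear\napple\npear\nfig\napple\npear\n' > "$sample"

count_lines "$sample"   # counts: 2 apple, 1 fig, 3 pear
rm -f "$sample"
```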
Example 2
You want to output the 10 most frequently occurring words in the file text.
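One possible pipeline for this task is sketched below. The helper name top_words is hypothetical, and the sketch assumes that a word is a run of alphabetic characters and that upper- and lower-case spellings count as different words.

```shell
# Hypothetical helper: print the 10 most frequent words of the given file.
top_words() {
    tr -cs '[:alpha:]' '\n' < "$1" |   # replace every run of non-letters by one newline
        sort |                         # make identical words adjacent
        uniq -c |                      # prefix each distinct word with its count
        sort -rn |                     # highest counts first
        head -n 10                     # keep the first 10 lines
}

top_words text
```

If the file begins with a non-alphabetic character, the first stage emits one empty line, which is then counted like a word.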
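Explanation:
One common approach is to split the text into one word per line, sort the words so that identical words become adjacent, count each group with uniq -c, sort the counts numerically in descending order, and keep the first 10 lines.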
See also
comm, sort