tr reads an input text from standard input, replaces (Format 1 and 2) or deletes (Format 3 and 4) selected characters from it, and writes the result to standard output.
Syntax
Format 1:
tr[ -cs] string1 string2
Format 2:
tr -s[ -c] string1
Format 3:
tr -d[ -c] string1
Format 4:
tr -ds[ -c] string1 string2
If you specify more than one option on the command line for any of these formats, the options can be combined after a single minus sign with no intervening blanks (e.g. -cs or -dc).
Replace characters
tr reads the input text, replacing characters that appear in string1 with the corresponding characters in string2. In other words, the nth character in string1 is replaced in the input text by the nth character in string2. If string2 contains fewer characters than string1, those characters in string1 which have no corresponding character in string2 are not replaced (see Example 1).
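The position-by-position mapping described above can be sketched as follows. The input word and the variable name `out` are made up for illustration; string1 and string2 have the same length here, so every character has a counterpart.

```shell
# e is the 1st character of string1 and maps to i, the 1st character of string2;
# l is the 2nd character of string1 and maps to p, the 2nd character of string2.
out=$(printf 'hello\n' | tr 'el' 'ip')
echo "$out"    # prints: hippo
```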
-c
Complements string1 with respect to the currently applicable character set (octal values 001 through 377). The complemented string1 then contains all characters of the currently applicable character set except those specified in the original string1. tr then replaces each input character that matches the nth character of the complemented string1 with the nth character of string2.
-s
(squeeze) After replacement, tr reduces each run of a repeated output character that appears in string2 to a single character (see Example 2).
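A minimal sketch of the two options in the replace formats, using invented input and hypothetical variable names (`masked`, `squeezed`). The `[x*]` notation pads string2 so that every character of the complemented string1 has a replacement.

```shell
# -c: every character that is NOT a digit or a newline is replaced by x.
masked=$(printf 'abc123\n' | tr -c '0-9\n' '[x*]')
echo "$masked"      # prints: xxx123

# -s: a becomes x and b becomes y; runs of x or y in the output are then
# reduced to a single character. c is not in string2, so "cc" stays as is.
squeezed=$(printf 'aabbcc\n' | tr -s 'ab' 'xy')
echo "$squeezed"    # prints: xycc
```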
In string1 you specify the characters to be replaced; string2 provides the replacement string. The characters in both strings must be specified without intervening blanks or other delimiters. If a string contains metacharacters that have a special meaning for the shell, these metacharacters must be escaped by enclosing the entire string in single quotes '...' or by preceding each such character with a backslash \. The strings can contain the following specifications:
any printable character.
\octal_number, where octal_number is a one-, two-, or three-digit octal number. The backslash must be escaped so that the shell passes it on to tr and the digits are recognized as an octal number. tr also processes the NUL character (000 in octal).
as escape sequences (same as for the printf command). You can enter the following escape sequences:
*) These metacharacters are supported only on character-mode terminals (i.e. if you are accessing the POSIX shell via rlogin)
Stands for the set of characters from a to z inclusive. Characters are sorted in the currently applicable collating sequence. Unlike in internationalized regular expressions, a and z must be ordinary characters, i.e. not equivalence class expressions [=c=] or collating symbols [.cc.]. In the current collating sequence (see LC_COLLATE), the character used for a must precede the character used for z. "b-a" is an illegal range and is rejected. In the notation without brackets, "a-a" stands for "a" and so on, but "---" is interpreted as three "-" characters. In the notation with brackets, "[a-a]" and "[---]" lead to undefined results.
Depending on the locale, "a-c" means "abc" or "aäbc", "n-p" means "nop" or "noöp" or "nöop", "t-v" means "tuv" or "tuüv" or "tüuv", and "r-t" can mean "rst", "rsßt" or "rßst".
class specifies a character class, similar to internationalized regular expressions. The following values are possible for class:
Character classes must not be specified in a replacement string. Exception: The classes lower and upper are permitted in string2 if the corresponding character class (upper or lower) is specified at the same position in string1.
equivalence specifies an equivalence class, similar to internationalized regular expressions. Equivalence classes must not be specified in a replacement string.
[a*n] stands for n repetitions of the character a, e.g. [a*3] stands for aaa. Only useful in a replacement string. If the first digit of n is 0, n is considered octal; otherwise, it is taken to be decimal. If n is 0 or is omitted, it is taken to be "huge", meaning that the character is repeated as often as required to pad string2 to the length of string1 (see Example 1).
string2 not specified:
string1 and string2 not specified:
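As a rough illustration of the string notations listed above, the sketch below combines an octal escape with the repetition notation. The sample input and the variable names are invented; single quotes keep the backslash and brackets away from the shell.

```shell
# \011 is the octal code of the tab character. string1 holds a blank and a tab;
# [_*] pads string2 with underscores to the length of string1.
underscored=$(printf 'a b\tc\n' | tr ' \011' '[_*]')
echo "$underscored"    # prints: a_b_c

# [x*3] stands for xxx, so a, b and c are each replaced by x.
fixed=$(printf 'abcd\n' | tr 'abc' '[x*3]')
echo "$fixed"          # prints: xxxd
```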
Delete characters
This format is only useful when string1 is specified. If the -s option is not set, string2 will be ignored.
-d
(d - delete) Deletes all input characters that appear in string1. If the -s option is not specified, string2 is ignored.
-c
Complements string1 with respect to the current character set. The complemented string1 then contains all characters of the current character set except for those specified in the original string1. tr then deletes all input characters which occur in the complemented string1.
-s
(squeeze) tr reduces each run of a repeated output character that appears in string2 to a single character. The -s option is meaningless if string2 is not specified.
In string1 you specify the characters to be deleted; string2 contains the characters that are to be reduced to a single character if they appear two or more times in succession in the output (see option -s). For details on how to specify these strings, see Format 1 under "Replace characters".
string1 and string2 not specified:
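The delete formats described in this section can be sketched as follows. All inputs and variable names are hypothetical.

```shell
# -d: delete every l and o from the input.
deleted=$(printf 'hello world\n' | tr -d 'lo')
echo "$deleted"    # prints: he wrd

# -dc: delete everything EXCEPT digits and the newline.
digits=$(printf 'tel: 555-1234\n' | tr -dc '0-9\n')
echo "$digits"     # prints: 5551234

# -ds: delete the characters in string1 (a), then squeeze runs of the
# characters in string2 (.) in what remains.
trimmed=$(printf 'aaa...bbb\n' | tr -ds 'a' '.')
echo "$trimmed"    # prints: .bbb
```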
Locale
The following environment variables affect the execution of tr:
LANG
Provide a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-specific default locale will be used. If any of the internationalization variables contains an invalid setting, the utility will behave as if none of the variables had been defined.
LC_ALL
If set to a non-empty string value, override the values of all the other internationalization variables.
LC_COLLATE
Determine the locale for the behavior of range expressions (e.g. [a-z]) and equivalence classes.
LC_CTYPE
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments) and the behavior of character classes. LC_CTYPE also specifies which characters are included in the currently valid character set in conjunction with the -c option.
LC_MESSAGES
Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH
Determine the location of message catalogs for the processing of LC_MESSAGES.
Hint
Ranges of uppercase and lowercase letters cannot always be unambiguously mapped to each other in locales which contain the 'ß' character (see above under "Replace characters"). The ranges "a-z" and "A-Z" are exceptions here and are treated separately as of this POSIX version. Nevertheless, do not try to convert lowercase letters to uppercase using the command below. The result is not defined in a locale other than the POSIX or C locale and can vary in older POSIX versions or on other platforms:
Instead, use the following character classes to convert lowercase letters to uppercase (or vice versa):
However, LC_CTYPE and LC_COLLATE must refer to the same locale for conversion to work correctly. You can ensure that they do by assigning LANG or LC_ALL.
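A minimal sketch of the class-based conversion recommended in this hint. Setting LC_ALL for the tr call is one way to make LC_CTYPE and LC_COLLATE refer to the same locale; the value C, the sample text and the variable name `upper` are illustrative assumptions.

```shell
# lower in string1 is paired with upper at the same position in string2.
upper=$(printf 'hello world\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')
echo "$upper"    # prints: HELLO WORLD
```

Swapping the two classes converts uppercase to lowercase in the same way.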
Example 1
tr without options: simple examples to demonstrate how tr works:
Here tr replaces all occurrences of T with t and of S with s.
The second string has fewer characters than the first in this case. tr replaces T by t and S by s, but leaves F unaltered.
Now let us try replacing all lowercase letters with x. The following solution is not suitable:
tr has clearly only replaced occurrences of a with x. To replace all lowercase letters, we need to call tr as follows:
Each character in string1 now has a corresponding x in string2, as the asterisk (*) causes string2 to be padded with x as often as required. The single quotes are essential, since the strings include shell metacharacters.
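The first and last steps of this example might look like the sketch below. The sample input and the variable names are invented for illustration. The case where string2 is shorter than string1 is left out because tr implementations on other platforms may handle it differently.

```shell
# Equal-length strings: T is replaced by t and S by s.
lowered=$(printf 'TEST STRING\n' | tr TS ts)
echo "$lowered"    # prints: tEst stRING

# [x*] pads string2 with x, so every lowercase letter has a counterpart.
# The quotes keep the brackets and the asterisk away from the shell.
crossed=$(printf 'Hello World\n' | tr 'a-z' '[x*]')
echo "$crossed"    # prints: Hxxxx Wxxxx
```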
Example 2
We want to create a list of all the words that appear in textfile, one word to a line, a word being defined as any consecutive string consisting only of letters.
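One possible way to build such a word list is sketched below, assuming the alpha character class is an acceptable definition of "letter". A short inline string stands in for textfile, and the variable name `words` is hypothetical. With -c, every non-letter is replaced by a newline; with -s, runs of newlines are reduced to one.

```shell
# For a real file, the same call would read: tr -cs '[:alpha:]' '[\n*]' < textfile
words=$(printf 'Hello, world! foo\n' | tr -cs '[:alpha:]' '[\n*]')
printf '%s\n' "$words"
# prints:
# Hello
# world
# foo
```

If the input begins with a non-letter, the output starts with an empty line, which may need to be filtered out separately.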
Example 3
Deleting a non-printing character from a file (tr -d)
tr deletes from the file the character with the octal code 016 and displays the result on the standard output.
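A sketch of such a call, with a short inline string standing in for the file. The input and the variable name `clean` are assumptions; for a real file, the input would be redirected with < instead of being piped in.

```shell
# printf inserts the character with octal code 016 between "ab" and "cd";
# tr -d '\016' removes it again.
clean=$(printf 'ab\016cd\n' | tr -d '\016')
echo "$clean"    # prints: abcd
```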
See also
ed, sh, sed