
tr - translate characters


tr reads an input text from standard input, replaces (Format 1 and 2) or deletes (Format 3 and 4) selected characters from it, and writes the result to standard output.


Syntax


Format 1: tr[ -cs] string1 string2
Format 2: tr -s[ -c] string1
Format 3: tr -d[ -c] string1
Format 4: tr -ds[ -c] string1 string2

If you specify more than one option on the command line for any of these formats, the options can be combined after a single minus sign with no intervening blanks (e.g. -cs or -dc).

Replace characters
Format 1: tr[ -cs] string1 string2
Format 2: tr -s[ -c] string1


tr reads the input text, replacing characters that appear in string1 with the corresponding characters in string2. In other words, the nth character in string1 is replaced in the input text by the nth character in string2. If string2 contains fewer characters than string1, those characters in string1 which have no corresponding character in string2 are not replaced (see Example 1).
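As a minimal sketch of this positional mapping (the input string and the variable name result are made up for illustration): l is the first character of string1 and is replaced by 0, while o is the second character and is replaced by 1.

```shell
# string1 = 'lo', string2 = '01': l -> 0, o -> 1
result=$(printf 'hello world\n' | tr 'lo' '01')
printf '%s\n' "$result"    # he001 w1r0d
```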


-c

Complements string1 with respect to the currently applicable character set (octal values 001 through 377). The complemented string1 then contains all characters of the currently applicable character set except for those specified in the original string1.

tr then replaces every input character that occurs in the complemented string1: the nth character of the complemented string1 is replaced by the nth character in string2.
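A short sketch of -c in a replacement, using an illustrative input line and a hypothetical variable name: string1 lists the digits and the newline, so the complement covers every other character. The [#*] notation (see [a*n] in the description of the strings) supplies enough # characters to pair with all of them.

```shell
# Everything that is neither a digit nor a newline becomes '#'
result=$(printf 'abc123\n' | tr -c '0-9\n' '[#*]')
printf '%s\n' "$result"    # ###123
```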

-s

(squeeze)

After replacement, tr reduces all strings of repeated output characters in string2 to a single character (see Example 2).
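For illustration (the input and the variable name are assumptions), the following sketch first replaces each blank with an underscore and then squeezes every run of underscores to a single one:

```shell
# Translate blanks to '_' and squeeze repeated '_' in the output
result=$(printf 'a   b  c\n' | tr -s ' ' '_')
printf '%s\n' "$result"    # a_b_c
```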

string1 [string2]

In string1 you specify the characters to be replaced; string2 provides the replacement string.

The characters in both strings must be specified without intervening blanks or other delimiters.

If a string contains metacharacters that have a special meaning for the shell, these metacharacters must be escaped by enclosing the entire string in single quotes '...' or by preceding each such character with a backslash \.

The strings can contain the following specifications:

character

any printable character.

\octal_number

where octal_number is a one-, two-, or three-digit octal number. The backslash must be escaped from the shell (e.g. by enclosing the string in single quotes) so that tr receives it and recognizes the digits as an octal number.

tr also processes the NUL character (000 in octal).
Warning: In previous versions, the NUL character was always deleted as an input character.

metacharacters

as escape sequences (same as for the printf command). You can enter the following escape sequences:

\\    Backslash (for distinguishing octal numbers)
\a    Warning, alert *)
\b    Backspace *)
\f    Form Feed
\n    New Line
\r    Carriage Return
\t    Tab
\v    Vertical tab *)

*) These metacharacters are supported only on character-mode terminals (i.e. if you are accessing the POSIX shell via rlogin)
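A brief sketch of escape sequences in both strings, with made-up input and a hypothetical variable name: each tab is replaced by a newline. The single quotes keep the shell from interpreting the backslashes.

```shell
# Turn tab-separated fields into one field per line
result=$(printf 'one\ttwo\tthree\n' | tr '\t' '\n')
printf '%s\n' "$result"
```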

a-z or [a-z]

Stands for the set of characters from a to z inclusive. Characters are sorted in the currently applicable collating sequence. Unlike in internationalized regular expressions, a and z must be ordinary characters, i.e. not equivalence class expressions [=c=] or collating symbols [.cc.].

In the current collating sequence (see LC_COLLATE), the character used for a must precede the character used for z.

"b-a" is an illegal range and is rejected.

In the notation without brackets, "a-a" stands for "a" and so on, but "---" is interpreted as three "-" characters. In the notation with brackets, "[a-a]" and "[---]" lead to undefined results.
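The following sketch shows two ranges of equal length in the notation without brackets. The input is illustrative, and LC_ALL=C is set here only as an assumption to keep the collating sequence predictable: the digits 0 to 4 are mapped to the letters a to e.

```shell
# 0 -> a, 1 -> b, 2 -> c, 3 -> d, 4 -> e
result=$(printf 'room 2031\n' | LC_ALL=C tr '0-4' 'a-e')
printf '%s\n' "$result"    # room cadb
```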

Depending on the locale, "a-c" means "abc" or "aäbc", "n-p" means "nop" or "noöp" or "nöop", "t-v" means "tuv" or "tuüv" or "tüuv", and "r-t" can mean "rst", "rsßt" or "rßst".

[:class:]

class specifies a character class, similar to internationalized regular expressions. The following values are possible for class:

alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit

Character classes must not be specified in a replacement string. Exception: The classes lower and upper are permitted in string2 if the corresponding character class is specified at the same position in string1.
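As a small sketch of a character class in string1 (input and variable name are made up for illustration), every digit is replaced by #. The replacement string uses the [a*n] repetition notation rather than a class, in line with the restriction above.

```shell
# Mask all digits; [#*] pads string2 to the size of the digit class
result=$(printf 'PIN 4711\n' | tr '[:digit:]' '[#*]')
printf '%s\n' "$result"    # PIN ####
```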

[=equivalence=]

equivalence specifies an equivalence class, similar to internationalized regular expressions.

Equivalence classes must not be specified in a replacement string.

[a*n]

Stands for n repetitions of a, e.g. [a*3] stands for aaa. Only useful in a replacement string.

If the first digit of n is 0, n is considered octal; otherwise, it is taken to be decimal. If n is 0 or is omitted, it is taken to be "huge", meaning that the preceding character is to be repeated as often as required to pad string2 to the length of string1 (see Example 1).
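A minimal sketch combining a counted and an open-ended repetition (the strings are chosen purely for illustration): [x*2] pairs x with the first two characters of string1, and [y*] pads the remainder with y.

```shell
# string2 expands to 'xxyy': a,b -> x and c,d -> y
result=$(printf 'abcd\n' | tr 'abcd' '[x*2][y*]')
printf '%s\n' "$result"    # xxyy
```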

string2 not specified:
string1 (possibly complemented, see -c) is used for string2.

string1 and string2 not specified:
string1 is the null string. Either the null string (without option -c) or the entire current character set (with option -c) is taken for string2.
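When string2 is omitted in Format 2, string1 also determines which repeated characters are squeezed. A sketch with an illustrative word, assuming LC_ALL=C so that the range a-z covers exactly the ASCII lowercase letters:

```shell
# No translation takes place; repeated lowercase letters are squeezed
result=$(printf 'bookkeeper\n' | LC_ALL=C tr -s 'a-z')
printf '%s\n' "$result"    # bokeper
```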

Delete characters
Format 3: tr -d[ -c] string1
Format 4: tr -ds[ -c] string1 string2


This format is only useful when string1 is specified. If the -s option is not set, string2 will be ignored.


-d

(d - delete)

Deletes all input characters that appear in string1.

If the -s option is not specified, string2 is ignored.

-c

Complements string1 with respect to the current character set. The complemented string1 then contains all characters of the current character set except for those specified in the original string1.

tr then deletes all input characters which occur in the complemented string1.
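The difference between -d alone and -d combined with -c can be sketched as follows. The input line and the variable names letters and digits are made up for illustration; the newline is added to string1 in the second call so that it is not deleted along with the letters.

```shell
# -d: delete the digits themselves
letters=$(printf 'a1b2c3\n' | tr -d '0-9')
# -cd: delete everything except digits and the newline
digits=$(printf 'a1b2c3\n' | tr -cd '0-9\n')
printf '%s %s\n' "$letters" "$digits"    # abc 123
```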

-s

(squeeze)

tr reduces all strings of repeated output characters in string2 to a single character. The -s option is meaningless if string2 is not specified.
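A sketch of Format 4 with illustrative input: the digits named in string1 are deleted first, and runs of blanks (the character named in string2) are then reduced to a single blank.

```shell
# Delete digits, then squeeze repeated blanks
result=$(printf 'aa  bb1122  cc\n' | tr -ds '0-9' ' ')
printf '%s\n' "$result"    # aa bb cc
```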

string1 [string2]

In string1 you specify the characters to be deleted; string2 contains the characters that are to be reduced to a single character if they appear two or more times in succession in the output (see option -s).

Details on how to specify these strings are given under Format 1 in the section "Replace characters".

string1 and string2 not specified:
If the -c option is not specified, all input characters are copied unaltered to standard output. If option -c is specified, tr deletes all input characters, i.e. prints nothing on standard output.

Locale

The following environment variables affect the execution of tr:

LANG

Provide a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-specific default locale will be used. If any of the internationalization variables contains an invalid setting, the utility will behave as if none of the variables had been defined.

LC_ALL

If set to a non-empty string value, override the values of all the other internationalization variables.

LC_COLLATE

Determine the locale for the behavior of range expressions and equivalence classes (e.g. [a-z]).

LC_CTYPE

Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments) and the behavior of character classes. LC_CTYPE also specifies which characters are included in the currently valid character set in conjunction with the -c option.

LC_MESSAGES

Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.

NLSPATH

Determine the location of message catalogs for the processing of LC_MESSAGES.

Hint

Ranges of uppercase and lowercase letters cannot always be unambiguously mapped to each other in locales which contain the 'ß' character (see above under "Replace characters"). The ranges "a-z" and "A-Z" are exceptions here and are treated separately as of this POSIX version.

Nevertheless, do not try to convert lowercase letters to uppercase using the command below. The result is not defined in a locale other than the POSIX or C locale and can vary in older POSIX versions or on other platforms:

$ tr 'a-z' 'A-Z' <file

Instead, use the following character classes to convert lowercase letters to uppercase (or vice versa):

$ tr '[:lower:]' '[:upper:]' <file

$ tr '[:upper:]' '[:lower:]' <file

However, LC_CTYPE and LC_COLLATE must refer to the same locale for conversion to work correctly. You can ensure that they do by assigning LANG or LC_ALL.
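One way to make both categories refer to the same locale for a single call is to set LC_ALL just for that command. The sketch below uses the C locale and a made-up input line as assumptions; any other locale available on the system could be substituted.

```shell
# LC_ALL overrides LC_CTYPE and LC_COLLATE for this tr call only
result=$(printf 'hello world\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')
printf '%s\n' "$result"    # HELLO WORLD
```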

Example 1

tr without options: simple examples to demonstrate how tr works:

$ cat days
Monday Tuesday Wednesday Thursday Friday Saturday Sunday

$ tr TS ts <days

Monday tuesday Wednesday thursday Friday saturday sunday

Here tr replaces all occurrences of T with t and of S with s.

$ tr TSF ts <days

Monday tuesday Wednesday thursday Friday saturday sunday

The second string has fewer characters than the first in this case. tr replaces T by t and S by s, but leaves F unaltered.

Now let us try replacing all lowercase letters with x. The following solution is not suitable:

$ tr '[a-z]' x <days

Mondxy Tuesdxy Wednesdxy Thursdxy Fridxy Sxturdxy Sundxy


tr has clearly only replaced occurrences of a with x. To replace all lowercase letters, we need to call tr as follows:

$ tr '[a-z]' '[x*]' <days

Mxxxxx Txxxxxx Wxxxxxxxx Txxxxxxx Fxxxxx Sxxxxxxx Sxxxxx


Each character in string1 now has a corresponding x in string2, since the asterisk (*) causes string2 to be padded with x as often as required. The single quotes are essential, since the strings include shell metacharacters.

Example 2

We want to create a list of all the words that appear in textfile, one word per line, where a word is any consecutive string consisting only of letters.

$ cat textfile
"When shall we three meet again?
In thunder, lightning, or in rain?"

$ tr -cs '[A-Z][a-z]' '[\025*]' <textfile

When
shall
we
three
meet
again
In
thunder
lightning
or
in
rain
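The command above writes the newline as an octal code, and octal codes depend on the character set in use. A variant that avoids the hard-coded value can be sketched with the \n escape sequence and the alpha class, both described under "Replace characters". The input line is taken from textfile; the variable name and the LC_ALL=C setting are assumptions made for this sketch.

```shell
# -c: complement of the letters; -s: squeeze the resulting newlines
words=$(printf '%s\n' 'In thunder, lightning, or in rain?' |
        LC_ALL=C tr -cs '[:alpha:]' '[\n*]')
printf '%s\n' "$words"
```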

Example 3

Deleting a non-printing character from a file (tr -d)

$ tr -d '\016' <file

tr deletes from the file the character with the octal code 016 and displays the result on the standard output.
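Where the character to be deleted has an escape sequence, that form can be used instead of the octal code. As a sketch with made-up input and a hypothetical variable name, the following call removes carriage returns from two lines:

```shell
# Strip every carriage return; the newlines are kept
result=$(printf 'line1\r\nline2\r\n' | tr -d '\r')
printf '%s\n' "$result"
```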

See also

ed, sh, sed