POSIX V11.0 (en)


Regular expressions are a tool for scanning a text for strings which match a defined pattern. A regular expression stands for a set of character strings. A member of this set of strings is said to be matched by the regular expression. A pattern is constructed from one or more regular expressions.

A regular expression is a string of characters, each of which belongs to one of two classes:

ordinary characters, and
metacharacters.

All alphanumeric characters (all letters and digits) and most other characters are ordinary characters. Within a pattern, ordinary characters match themselves. For example, the pattern abc matches only those strings that contain the character sequence abc somewhere in them.
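As a minimal sketch of ordinary characters matching themselves, the following runs grep over a few made-up input lines. The sample text is an assumption for illustration only.

```shell
# Hypothetical sample input: two of the three lines contain the sequence abc.
sample='xabcx
abc
acb'

# grep prints each input line that contains a match for the pattern.
ordinary=$(printf '%s\n' "$sample" | grep -e 'abc')
printf '%s\n' "$ordinary"
```

The line acb is not selected because the three characters do not appear in the order given by the pattern.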

There is, however, a small set of characters, known as metacharacters, which have special meanings when encountered in patterns. These characters are described below.

There are two forms of regular expression:

simple regular expressions
extended regular expressions

The syntax of these forms of regular expression is described in the following sections.

The following table shows which commands support regular expressions:

Command	Regular expression form
awk	extended
ed	simple
egrep	extended
ex	*)
expr	simple
grep	simple
lex	extended
nl	simple
sed	simple
vi	*)

*) The ex and vi commands process regular expressions which differ in certain respects from simple regular expressions. These differences are described under ex and vi.
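The practical difference between the two forms can be seen with grep. This sketch assumes a grep that accepts the POSIX -E option to select extended regular expressions (the table lists egrep for that purpose). The input lines are invented for the example.

```shell
# Hypothetical input lines.
input='aaa
a+b
bbb'

# Simple form: + is an ordinary character, so only the literal text "a+" matches.
simple=$(printf '%s\n' "$input" | grep -e 'a+')

# Extended form: + means "one or more", so every line containing an a matches.
extended=$(printf '%s\n' "$input" | grep -E -e 'a+')

printf 'simple:\n%s\nextended:\n%s\n' "$simple" "$extended"
```

The same pattern text therefore selects different lines depending on which form the command applies.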

Simple regular expressions

Simple regular expressions are constructed as follows:

No.	Regular expression	Stands for	Example	Matching strings
1	c	The character c, where c is not a special character (metacharacter).	a	a
2	\c	The character c, where c can be any character other than ( ) { } 1 2 3 4 5 6 7 8 9. Regular expressions in this form are meaningful if c is a metacharacter. \c then stands for character c itself, as the backslash escapes its special meaning as a metacharacter.	\a; \*	a; *
3	.	Any single character	.	a, x, *, ...
4	[s] [c1-c2]	[s] stands for any character from s, where s is a set of characters. If a right square bracket ] is to be one of the characters in the set, it has to be placed first in the set. If a hyphen - is to be one of the characters in the set, it has to be placed first or last. If a caret ^ is to be one of the characters in the set, it can be placed anywhere but first. [c1-c2] stands for any character in the range c1 to c2, in accordance with the EBCDIC sort sequence (inclusive of limits c1 and c2). c1 must come before c2 in the EBCDIC collating sequence. If it does not, c1-c2 does not denote a range but simply stands for the characters c1 and c2. The two forms can be combined: [s1c1-c2s2]	[mz]; []a]; [-a]; [a-]; [a^]; [a-m]; [m-a]; [ado-qxz]	m, z; ], a; -, a; -, a; a, ^; a, m and any character in between in the EBCDIC collating sequence; m, a; a, d, o, q, x, z and any character coming between o and q in the EBCDIC collating sequence
5	[^s] [^c1-c2]	[^s] stands for any character not included in set s. [^c1-c2] stands for any character not in the range between c1 and c2 inclusive. Refer also to [c1-c2]. The two forms can be combined: [^s1c1-c2s2]	[^xyz]; [^0-9]; [^a0-9b]	any character except x, y, z; any character except 0, 9 and all characters coming between 0 and 9 in the EBCDIC collating sequence; any character except a, b, 0, 9 and all characters coming between 0 and 9 in the EBCDIC collating sequence
6	r*	Zero, one or more occurrences of regular expression r. r has to be of form 1-5, 12, 15 or 16.	a*	nothing, a, aa, aaa, ...
7	r\{m,n\} r\{m\} r\{m,\}	r\{m,n\}: at least m and at most n occurrences of regular expression r. r\{m\}: precisely m occurrences of regular expression r. r\{m,\}: at least m occurrences of regular expression r. In each case r has to be of form 1-5, 12, 15 or 16.	a\{1,2\}; a\{3\}; a\{3,\}	a or aa; aaa; aaa, aaaa, aaaaa, ...
8	rx	(Concatenation) An occurrence of regular expression r followed by an occurrence of regular expression x. r and x can be any regular expressions.	[ab].	ax, a3, a*, bz, ...
9	^r	An occurrence of regular expression r appearing at the start of a line, i.e. straight after a newline character or at the start of the file. r can be a regular expression in any form other than number 9.	^[aA]pple	apple or Apple at the start of a line
10	r$	An occurrence of regular expression r at the end of a line, i.e. directly before a newline character. r can be a regular expression in any form other than number 10.	[bB]arge$	barge or Barge at the end of a line
11	\(r\)	An occurrence of regular expression r. r can be any regular expression. Only useful together with number 12.	\([aA]pple\)	apple, Apple
12	\n	n is an integer in the range from 1 to 9. \n appearing in a concatenated regular expression stands for the string matched by x, where x is the nth regular expression enclosed in \( and \) sequences that appeared earlier in the concatenated regular expression.	\(a\(b\)\)\2; s\(illy\)b\1; \(ab\)x\1*	abb; sillybilly; abx, abxab, abxabab, ...
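Several rows of the table can be tried out with grep, which uses simple regular expressions by default. This is a sketch under stated assumptions: an ASCII-based C locale (not the EBCDIC ordering described in the table) and the POSIX -x option, so that a line is selected only if the whole line matches. The helper name whole_line and all input words are made up for illustration.

```shell
# whole_line PATTERN WORD...: print each WORD that PATTERN matches in full.
# (Hypothetical helper, not part of any standard.)
whole_line() {
    pattern=$1
    shift
    printf '%s\n' "$@" | LC_ALL=C grep -x -e "$pattern"
}

# Form 7: between one and two occurrences of a.
interval=$(whole_line 'a\{1,2\}' a aa aaa)

# Form 4: a right square bracket placed first in the set is an ordinary member.
bracket=$(whole_line '[]a]' ']' a b)

# Forms 11 and 12: \1 repeats whatever the first \( ... \) group matched.
backref=$(whole_line '\(ab\)x\1*' abx abxab abxabab abxba)

# Groups are counted by their opening \(, so \2 refers to the inner group here.
nested=$(whole_line '\(a\(b\)\)\2' abb aba)

printf '%s\n' "$interval" "$bracket" "$backref" "$nested"
```

The words aaa, b, abxba and aba are rejected, which shows the upper limit of the interval, the membership test of the bracket expression and the exact repetition required by a back-reference.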

Precedence

The precedence of operators in regular expressions is as shown in the following table.

Operator	Precedence
[. .] [= =] [: :]	high precedence
\<char>	.
[ ]	.
( )	.
* ? + \{m,n\}	.
Concatenation	.
^ $	.
\|	low precedence

Metacharacters

Metacharacter	The character to the left has a special meaning if
\	it is not preceded by a backslash \
. [	it is not preceded by a backslash \ and it does not appear between [ and ]
*	it is not preceded by a backslash \, it does not appear between [ and ], it is not the first character in a pattern and it does not come after \)
$	it is the last character in a pattern
^	it is the first character in a pattern, or it is the first character in square brackets [ ... ]
-	it is in square brackets but not placed first or last
Regular expression delimiter such as /.../	it is not preceded by a backslash \
[. [= [:	Character pairs to the left are special characters if they occur within a bracket expression (in square brackets). They will need to be closed by the corresponding character pair .], =] or :]. Example: [[:upper:]] indicates all uppercase letters.
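Whether a character is special depends on where it stands in the pattern. The following sketch checks a few of these rules with grep in its default (simple) mode. The input lines and the helper name select_lines are invented, and an ASCII-based C locale is assumed for [[:upper:]].

```shell
# Hypothetical input lines.
input='abc
a.c
*a
Abc'

# select_lines PATTERN: print the input lines that contain a match for PATTERN.
select_lines() { printf '%s\n' "$input" | LC_ALL=C grep -e "$1"; }

any_char=$(select_lines 'a.c')       # . is special: any single character
escaped=$(select_lines 'a\.c')       # \. is a literal period
in_brackets=$(select_lines 'a[.]c')  # . between [ and ] is a literal period
leading_star=$(select_lines '*a')    # * first in the pattern is a literal asterisk
upper=$(select_lines '[[:upper:]]')  # character class: any uppercase letter

printf '%s\n' "$any_char" "$escaped" "$in_brackets" "$leading_star" "$upper"
```

Only the unescaped period outside brackets selects both abc and a.c; the escaped and bracketed forms select a.c alone.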

Extended regular expressions

Extended regular expressions include the simple regular expressions, with the following exception:

The \(...\) construction used in simple regular expressions has no special significance in extended regular expressions. For example, the extended regular expression \(ab\) represents the string (ab).
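The different treatment of backslash-escaped parentheses can be checked with grep. This sketch assumes that the POSIX -E option selects extended regular expressions and that -x restricts matches to whole lines. The two input lines are made up.

```shell
# Hypothetical input: one line without parentheses, one with.
input='ab
(ab)'

# Simple form: \( and \) only group, so the pattern matches the string ab.
simple=$(printf '%s\n' "$input" | grep -x -e '\(ab\)')

# Extended form: \( and \) are escaped, literal parentheses.
extended=$(printf '%s\n' "$input" | grep -x -E -e '\(ab\)')

printf 'simple: %s\nextended: %s\n' "$simple" "$extended"
```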

Moreover, extended regular expressions provide the following syntax elements for pattern creation:

No.	Regular expression	Stands for	Example	Matching strings
7	r{m,n} r{m} r{m,}	r{m,n}: at least m and at most n occurrences of regular expression r. r{m}: precisely m occurrences of regular expression r. r{m,}: at least m occurrences of regular expression r. In each case r has to be of form 1-5, 12, 15 or 16.	a{1,2}; a{3}; a{3,}	a or aa; aaa; aaa, aaaa, aaaaa, ...
13	r+	One or more occurrences of regular expression r. r has to be of form 1-5, 15 or 16.	u+	u, uu, uuu, ...
14	r?	Zero or one occurrence of regular expression r. r has to be of form 1-5, 15 or 16.	u?	nothing or u
15	(r)	Strings matching regular expression r. r can be any regular expression.	(ok(abc)); (au)*	okabc; nothing or au, auau, ...
16	(r1|r2)	Strings matching regular expression r1 or regular expression r2.	(ok|ko)	ok or ko
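The extended syntax elements can be tried out with grep -E. This is a sketch that assumes the POSIX -E and -x options; the helper name ere and the input words are invented for illustration.

```shell
# ere PATTERN WORD...: print each WORD that the extended PATTERN matches in full.
# (Hypothetical helper for this example.)
ere() {
    pattern=$1
    shift
    printf '%s\n' "$@" | grep -x -E -e "$pattern"
}

plus=$(ere 'u+' u uu v)                 # form 13: one or more u
optional=$(ere 'abc?d' abd abcd abccd)  # form 14: c may be absent
group=$(ere '(au)*' au auau aua)        # form 15: a repeated group
either=$(ere '(ok|ko)' ok ko oko)       # form 16: one alternative or the other
interval=$(ere 'a{1,2}' a aa aaa)       # form 7, written without backslashes

printf '%s\n' "$plus" "$optional" "$group" "$either" "$interval"
```

The last word in each call is chosen so that it fails to match: v, abccd, aua, oko and aaa are not printed.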

Precedence

The precedence of operators in extended regular expressions is as shown in the following table.

Operator	Precedence
[. .] [= =] [: :]	high precedence
\<char>	.
[ ]	.
( )	.
* ? + {m,n}	.
Concatenation	.
^ $	.
|	low precedence
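The effect of these precedence rules shows up when a repetition or an alternation is written with and without parentheses. A sketch using grep -E with the POSIX -x option; the helper name ere and the input words are assumptions made for this example.

```shell
# ere PATTERN WORD...: print each WORD that the extended PATTERN matches in full.
# (Hypothetical helper for this example.)
ere() {
    pattern=$1
    shift
    printf '%s\n' "$@" | grep -x -E -e "$pattern"
}

# Repetition binds more tightly than concatenation: * applies to b only.
star_plain=$(ere 'ab*' a ab abb abab)

# Parentheses bind more tightly than repetition: * applies to the group ab.
star_group=$(ere '(ab)*' ab abab abb)

# Alternation binds most loosely: the alternatives are ab and cd.
alt_plain=$(ere 'ab|cd' ab cd abd acd)

# Parentheses limit the alternation to b or c.
alt_group=$(ere 'a(b|c)d' ab cd abd acd)

printf '%s\n' "$star_plain" "$star_group" "$alt_plain" "$alt_group"
```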

Examples

Simple regular expressions

Pattern	Meaning	Matching strings
ab.d	a - b - any one character - d	abcd, abXd, ab*d, ...
ab.*d	a - b - any string (including the null string) - d	abd, abxd, abX*Yd, ...
ab[xyz]d	a - b - either x or y or z - d	abxd, abyd, abzd
ab[^c]d	a - b - any character other than c - d	abbd, abXd, ab*d, ...
^abcd$	a line containing only the string abcd

Extended regular expressions

Pattern	Meaning	Matching strings
ab.+d	a - b - any sequence of one or more characters - d	abjd, abX*Yd, ...
abc?d	a - b - c or nothing - d	abd, abcd
(abc|xyz)	abc or xyz	abc, xyz
