
regcmp, regex - compile and execute regular expression


Syntax

#include <libgen.h>

char *regcmp(const char *string1 [, char *string2, ...] /* , (char *)0 */);
char *regex (const char *re, const char *subject [ , char *ret0, ... ] );
extern char *__loc1;

Description

regcmp() compiles the regular expression formed by concatenating its arguments. The argument list must be terminated by a null pointer. regcmp() returns a pointer to the expression compiled into an internal format. The memory for the compiled expression is allocated with malloc(); the caller is responsible for freeing it with free() once it is no longer required.
If regcmp() returns a null pointer, one of the arguments was invalid.

regex() searches the subject string for the pattern re compiled by regcmp(). Additional arguments are passed to regex() to receive the matched subexpressions. If fewer arguments are supplied than there are subexpressions to be returned, the behavior of regex() is undefined.

The global character pointer __loc1 points to the first matching byte in subject.

regcmp() and regex() were largely borrowed from the editor ed(), although the syntax and semantics were changed slightly. The valid symbols and their respective meanings are as follows:

[]*.^

These symbols have the same meaning as in ed().

$

This symbol matches the end of the string (\n matches a newline character).

-

Within brackets, a minus sign means "through". So, for example, [a-z] means the same as [abcd...xyz]. The - stands for itself only if it is used as the first or last character. So, for example, the expression []-] matches the characters ] and -.
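
The bracket rules can be tried out with a short sketch. Since libgen is not available on every system, the sketch uses the POSIX functions regcomp() and regexec() from <regex.h> in place of regcmp() and regex(). This is a substitution, made on the assumption that ranges and a literal ] or - inside brackets behave the same way in both. The helper name class_has() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, built on POSIX <regex.h> rather than on libgen.
 * Returns 1 if the bracket expression `bracket` matches the single
 * character `c`, 0 if it does not, and -1 if the pattern is too long
 * or cannot be compiled. */
int class_has(const char *bracket, char c)
{
    char pattern[64];
    char text[2] = { c, '\0' };
    regex_t re;
    int n, rc;

    /* Anchor the bracket expression so it has to match the whole string. */
    n = snprintf(pattern, sizeof pattern, "^%s$", bracket);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Under these assumptions, class_has("[a-z]", 'q') reports a match, and class_has("[]-]", c) reports a match only when c is ] or -.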

+

A regular expression followed by a + matches one or more occurrences of that expression. So, for example, [0-9]+ means the same as [0-9][0-9]*.

{m} {m,} {m,u}

Integer values enclosed in {} indicate how many times the preceding regular expression is to be applied. The value m is the minimum number and u is the maximum. u must be less than 256. If only m is present (e.g. {m}), it specifies exactly how many times the regular expression is to be applied. {m,} is the same as {m,infinite}. The plus sign + and the asterisk * are equivalent to {1,} and {0,} respectively.
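
The repetition counts can be checked with a small sketch. It uses the POSIX functions regcomp() and regexec() with REG_EXTENDED instead of regcmp() and regex(), on the assumption that the {} operator behaves the same way in both; libgen itself is not used. The helper name re_found() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, built on POSIX <regex.h> rather than on libgen.
 * Returns 1 if the extended regular expression `pattern` matches somewhere
 * in `text`, 0 if it does not, and -1 if the pattern cannot be compiled. */
int re_found(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the patterns anchored by ^ and $, re_found("^[0-9]{3}$", s) accepts only strings of exactly three digits, and "^[0-9]+$" and "^[0-9]{1,}$" accept and reject the same strings.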

( ... )$n

The value of the parenthesized regular expression is to be returned. The value is stored in the (n+1)th argument after the subject argument. A maximum of ten parenthesized regular expressions is permitted. regex() performs the assignments unconditionally.

( ... )

Parentheses are used for grouping. An operator, e.g. *, +, {}, can be applied to an individual character or to a regular expression enclosed in parentheses. Example: (a*(cb+)*)$0.

All symbols defined above are special characters. They must therefore be preceded by a backslash \ if they are to stand for themselves.

Return val.

for regcmp():

Pointer to the compiled regular expression

if successful.

Null pointer

if an error occurs. errno is set to indicate the error.

for regex():

Pointer to the character in subject that follows the last matched character

if successful.

Null pointer

if an error occurs.

Errors

regcmp() will fail if: 


ENOMEM

There is no longer enough memory available.            

Notes

The user program may run out of memory if regcmp() is called repeatedly without freeing the compiled expressions that are no longer required.

A program that uses one of these functions must be linked with the libgen library (cc -lgen).

Example 1

The following example searches for a leading newline character in the string subject pointed to by cursor.

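#include <libgen.h>
#include <stdlib.h>

char *cursor, *newcursor, *ptr;
 ...
/* compile "^\n", match it against cursor, then free the compiled form */
newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
free(ptr);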

Example 2

The following example matches the string Testing3 and returns the address of the character after the last matching character (the character 4). The string Testing3 is copied into the character array ret0.

char ret0[9];
char *newcursor, *name;
 ...
name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
newcursor = regex(name, "012Testing345", ret0);
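
On a system without libgen, roughly the same result can be sketched with the POSIX functions regcomp() and regexec(). This is a substitution, not the regex() interface: the $0 suffix is dropped, the captured text is copied out by hand, and the start of the match is returned through a parameter where regex() would set __loc1. The helper name match_and_capture() and its parameters are hypothetical.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, built on POSIX <regex.h> rather than on libgen.
 * Searches `subject` for `pattern`, copies the text of the first
 * parenthesized group into `ret0` (capacity `cap`, terminator included)
 * and stores the start of the match in `*start`, which plays the role
 * of __loc1. Returns a pointer to the character after the match, or NULL
 * if nothing matched, the pattern is invalid, or ret0 is too small. */
const char *match_and_capture(const char *pattern, const char *subject,
                              char *ret0, size_t cap, const char **start)
{
    regex_t re;
    regmatch_t m[2];            /* m[0]: whole match, m[1]: first group */
    const char *next = NULL;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;
    if (regexec(&re, subject, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < cap) {
            memcpy(ret0, subject + m[1].rm_so, len);
            ret0[len] = '\0';
            *start = subject + m[0].rm_so;
            next = subject + m[0].rm_eo;
        }
    }
    regfree(&re);
    return next;
}
```

Called with the pattern "([A-Za-z][A-Za-z0-9]{0,7})", the subject "012Testing345" and a nine-byte ret0, the helper is expected to return a pointer to the character 4, leave Testing3 in ret0, and set *start to the T.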

Example 3

In this example, a precompiled regular expression in file.i (see regcmp(1)) is checked against string.

#include "file.i"
char *string, *newcursor;
 ...
newcursor = regex(name, string);

See also

re_comp(), re_exec(), malloc()