lex generates a C program from a file containing the lex source text which you have written for the problem in hand. A lex source text consists of at most three sections: definitions, rules and user functions. The rules specify the patterns to be searched for in an input text and the action to be taken when a pattern is found. The definitions and user functions are optional.
lex generates a file with the name lex.yy.c. When lex.yy.c is compiled and linked with the lex library, the resulting program copies its input to its output unless a pattern specified in the rules is found; in that case the corresponding action code is executed. The text which matched the pattern is stored in yytext[], an external character array. The input is checked against the search patterns in the order in which they are specified.
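The structure described above can be illustrated with a minimal sketch of a lex source text containing all three sections. The name DIGIT and the output format are illustrative assumptions, not part of lex itself.

```lex
%{
/* Definitions section: code between %{ and %} is copied into lex.yy.c. */
#include <stdio.h>
%}
DIGIT   [0-9]

%%
{DIGIT}+    { printf("<number:%s>", yytext); /* yytext holds the matched text */ }

%%
/* User functions section. */
int main(void)
{
    yylex();    /* scan standard input until end of file */
    return 0;
}
```

Input which matches no rule, such as letters and blanks, is copied to the output unchanged, while each run of digits is replaced by the bracketed form. Assuming the sketch is stored in a file with the hypothetical name scan.l, it could be processed with `lex scan.l` and the resulting lex.yy.c compiled and linked with `c89 lex.yy.c -ll`.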
Syntax
lex [-ctvnV] [-Q[y|n]] [file ...]
-c  Indicates that the actions are written in C; this is the default.
-t  The program is written to the standard output, not to the file lex.yy.c.
-v  Provides a two-line statistical summary.
-n  Suppresses the summary generated by -v.
-V  Outputs version information to standard error.
-Q[y|n]  Determines whether (y) or not (n) version information is written to the output file.
file  Input file. Multiple files are treated as a single file. If file is not specified, lex reads from standard input.
Some standard table sizes are too small for some users. The table sizes for the automata which are finally generated can be set in the definitions section.
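The size declarations themselves are not listed in this text. As a sketch, assuming the declarations defined for standard (POSIX) lex, namely %p for the number of positions, %n for the number of states and %a for the number of transitions, a definitions section which enlarges three tables might look as follows. The numeric values are arbitrary and chosen only for illustration.

```lex
%p 5000
%n 1000
%a 4000
%%
[ \t]+      { putchar(' '); /* squeeze runs of blanks and tabs into one space */ }
```

Whether a given implementation honors, ignores or needs these declarations depends on the lex version in use.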
Specifying one or more sizes automatically implies the option -v unless the option -n is used.
The rules section of file starts with the delimiter %%. In the rules section you can define local variables for yylex(): all lines which start with a space or a tab and precede the first rule are copied to the start of the function yylex(), directly after its opening brace.
Each rule consists of a regular expression, which describes the pattern to be located, and the actions to be performed if the pattern is found. Input text which matches no search pattern is passed on unchanged to the output by lex.
A regular expression consists of text characters, with or without additional lex operators.
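A rules section with a local variable and a few rules can be sketched as follows. The operators used here ([...] for character classes, - for ranges, + for repetition and \ for escapes) and the ECHO macro are assumed from standard lex, and the variable name words is purely illustrative.

```lex
%%
        int words = 0;  /* indented line before the first rule: local variable of yylex() */
[A-Za-z]+   { words++; ECHO; }
[ \t]+      { putchar(' '); }
\n          { printf("  [%d words]\n", words); words = 0; }
```

Each input line is echoed with runs of blanks reduced to a single space and is followed by its word count. Characters which match none of the rules, such as digits, are copied to the output unchanged.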
Special tasks can be performed in the action section of a rule. To this end, lex provides a set of macros.
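The macro list is not reproduced in this text. The following sketch assumes the macros input() and unput(c) of standard lex: input() reads the next character from the input, and unput(c) pushes the character c back so that it is scanned again. The end-of-input value returned by input() differs between implementations, so the loop checks for both 0 and EOF.

```lex
%%
"/*"    {
            int c, prev = 0;
            /* read and discard characters until the closing star-slash */
            while ((c = input()) != 0 && c != EOF) {
                if (prev == '*' && c == '/')
                    break;
                prev = c;
            }
        }
"->"    { unput('.'); /* push back a '.', which is then scanned as new input */ }
```

The first rule removes C-style comments from the input. The second replaces each "->" with ".", because the pushed-back character matches no rule and is therefore copied to the output.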
You can redefine these macros yourself if you want to control input/output yourself. In this case, ensure that the definitions remain consistent with one another.
Apart from the storage of detected patterns in yytext[], lex functions provide other ways of processing detected text.
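The functions meant here are not listed in this text. As a sketch, assuming yymore(), yyless(n) and the variable yyleng of standard lex: yymore() causes the next match to be appended to the current contents of yytext[] instead of replacing them, and yyless(n) keeps only the first n characters of the match and returns the rest to the input.

```lex
%%
[a-z]+-     { yymore(); /* keep "word-" and append the next match to it */ }
[a-z]+      { printf("<%s>", yytext); }
[0-9]+"."   { yyless(yyleng - 1); printf("<int:%s>", yytext); /* give the "." back */ }
```

With these rules the input foo-bar is reported as a single token, since the first rule retains "foo-" and the second rule then sees the combined text. In the third rule the trailing period is returned to the input and scanned again.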
Hint
If a lex program is linked with c89 [5], then -ll must be specified as the archive parameter.
Locale
The following environment variables affect the execution of lex:
LANG  Provide a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-specific default locale will be used. If any of the internationalization variables contains an invalid setting, the utility will behave as if none of the variables had been defined.
LC_ALL  If set to a non-empty string value, override the values of all the other internationalization variables.
LC_COLLATE  Determine the locale for the behavior of ranges, equivalence classes and multicharacter collating elements within regular expressions. If this variable is not set to the POSIX locale, the results are unspecified.
LC_CTYPE  Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments and input files), the classification of characters as upper- or lower-case, and the mapping of characters from one case to the other.
LC_MESSAGES  Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH  Determine the location of message catalogs for the processing of LC_MESSAGES.
See also
yacc