Function group: table function
The table function CSV() enables you to use the content of a BS2000 file as a “read-only” table in any SQL statements.
CSV format (CSV: Comma Separated Values) is used to display SQL tables in files here. This is a standardized format for the platform-independent exchange of table data, see "Format of CSV files". The file contains the sequence of table rows, each row containing its column values sequentially as a string. Such files can be generated with a large number of software products (e.g. with Microsoft EXCEL).
CSV ([FILE]
file DELIMITER
delimiter [QUOTE
quote ] [ESCAPE
escape ],
data_type ,...)
FILE file
Name of the input file. You must specify file as an alphanumeric literal.
The input file must be a SAM file.
If the input file is not located in the ID of the DBH, the DBH ID must have read authorization for this file. Otherwise the DBH cannot access the input file.
If a read password is required for the file, this must be added to the BS2000 file in the form ?PASSWORD=<
password>
, e.g. ':8OSH:$ABC.MYFILE?PASSWORD=C''ABCD'''
.
password can be specified in several different ways:
C
''
string''
string contains four printable characters.X
''
hex_string'’
hex_string contains eight hexadecimal characters.n
n identifies an integer from - 2147483648 to + 2147483647.
DELIMITER delimiter
Delimiters (DELIMITER characters) between the column values of the CSV file. A DELIMITER character can also be part of a value, see the descriptions of quote and escape below.
delimiter must be specified as an alphanumeric literal with the length 1.
QUOTE quote
QUOTE characters in which the column values in the CSV file can be enclosed. These QUOTE characters are not part of the column value. A QUOTE character in the column value must be entered twice in the CSV file.
When a value is enclosed in QUOTE characters, it can also contain NEWLINE characters (which are not interpreted as a line break) or DELIMITER characters. A value consisting only of an opening and a closing QUOTE character is interpreted as a value with the length 0.
quote must be specified as an alphanumeric literal with the length 1.
When QUOTE is not specified, the column values in the CSV file cannot be enclosed in QUOTE characters.
ESCAPE escape
ESCAPE character with which ESCAPE sequences consisting of two characters in the input file begin.
ESCAPE sequences enable DELIMITER characters, QUOTE characters and ESCAPE characters to be written as part of a column value and NEWLINE characters to be ignored as a delimiter between two input lines.
escape must be specified as an alphanumeric literal with the length 1.
When ESCAPE is not specified, no ESCAPE sequences can be used in the CSV file.
data_type,...
Data types of the various columns in the table which is read from the CSV file.Every data_type must be data type CHARACTER(n) (where 1 <= n <= 256) or CHARACTER VARYING(n) (where 1 <= n <= 32000).
Result
A table with as many columns as data types which are specified, each with the specified data type.
Example
A new SERVICE_ENCR base table is set up. Its contents are taken from a CSV file.
INSERT INTO service_encr (setext, seprice_encr) SELECT a,b FROM TABLE(CSV(FILE 'out.service.070' DELIMITER ':', CHAR(25),VARCHAR(16))) AS t(a,b) |
Format of CSV files
The CSV format (CSV: Comma Separated Values) is a standardized format for the platformindependent exchange of table data. Such files can be generated and edited with a large number of software products (e.g. with Microsoft EXCEL).
Tables are presented in CSV files as a sequence of lines, the lines in a file being separated by (one or more) NEWLINE characters (line breaks). The transition to the next record in a SAM file is also such a new line, although this is not an EBCDIC character. A record in a SAM file can contain multiple lines, separated by a NEWLINE character. New line characters may also occur before the first and after the last line.
The various column values in a line are separated by a single DELIMITER character. A DELIMITER character may also occur after the last column value of a line.
There are two ways of presenting the various column values in each line: The individual characters in a column can be enclosed in QUOTE characters or not. In the first case the column values can also contain the NEWLINE and the DELIMITER characters. However, a QUOTE character in the column value must be entered twice (otherwise it terminates the column value). Column values in QUOTE characters can only be used if the QUOTE operand is specified in the CSV function.
If a column value does not begin with the QUOTE character (or if the QUOTE operand is not specified in the CSV function), the column value will end before the next DELIMITER or NEWLINE character.
In SESAM/SQL you can also define an ESCAPE character. The ESCAPE character enables you to use ESCAPE sequences in the column value, which are interpreted as follows:
Escape sequence | Interpreted as |
escape newline | “no character” |
escape delimiter | a DELIMITER character |
escape quote | a QUOTE character |
escape escape | an ESCAPE character |
ESCAPE sequences are also permitted in column values which are enclosed in QUOTE characters. ESCAPE NEWLINE in particular is useful, because when an ESCAPE character is contained at the end of a SAM record, the line is regarded as not yet completed and is continued with the following SAM record. The lines in a CSV file can thus be longer than one record in a SAM file of BS2000.
If errors occur when the CSV file is read or an infringement of the CSV format is detected (e.g. in the case of end of file in a column value which begins with a QUOTE character but does not end with one), this is indicated with an error code.
Note on NEWLINE characters
In CSV format four EBCDIC control characters are interpreted as a NEWLINE characters:
X'04' X'0D' | is the NEXT LINE character is the CARRIAGE RETURN character. Its ASCII equivalent is used as the newline |
X'15' | is the LINE FEED character. Its ASCII equivalent is used as the newline character |
X'25' | is the PRIVATE USE TWO character. However, in EBCDIC systems from IBM it is |
The CSV format accepts all these control characters (like the transition to the next record of a SAM file) as newline characters.
Syntax of a CSV file
A syntactical presentation of the format of a CSV file is provided on "Syntax overview of the CSV file".
Interpreting CSV files as an SQL table
In the CSV function the number of columns to be read and their data types are specified. These columns correspond to the column values in the CSV file in the same order. If a line in the CSV file contains fewer column values, NULL values are added. If a line in the CSV file contains more column values, the surplus column values are ignored.
A line in a CSV file must contain at least one character. Multiple consecutive newline characters are treated as one newline character.
An empty column value (e.g. between two consecutive DELIMITER characters) is interpreted as a NULL value.
A column value which is longer than the (maximum) length of the column’s data type is truncated. A warning is issued.
If the data type of the column is CHARACTER(n) but the column value is shorter than n, the column value is padded at the end with blanks (X'40').
A column value with the length 0 can be written with QUOTE characters, e.g. as "" if DELIMITER ';' QUOTE '"' is specified in the CSV function.
Restrictions in the use of CSV files
The BS2000 file is opened exclusively. It can therefore not be used simultaneously by the same or another SQL transaction in another CSV function. A remedy is offered by the CACHE annotation, in which the CSV is cached temporarily, see the “ Performance” manual.
If the file cannot be opened, an error message is issued and processing is terminated.
The file is closed only when the query containing it has been analyzed fully or when the query is no longer required (e.g. because the cursor which used the file is closed) or when the CSV file is cached.
In addition, there is a maximum number of CSV files (currently 4) which may be opened simultaneously. If this maximum number is exceeded, a corresponding error message is issued.
When one coded character set (CODE_TABLE
not equal to _NONE_
or CODED-CHARACTER-SET
not equal to *NONE
) each is defined for the database used and for the CSV file, the two names specified must be the same.