Administrator Guide : Using solidDB® data management tools : solidDB® Speed Loader (solloado and solload) : Control file syntax
  
Control file syntax
The control file syntax has the following characteristics:
Keywords must be given in capital letters.
Comments can be included using the standard SQL double-dash (--) comment notation.
Statements can continue from line to line with new lines beginning with any word.
The control file begins with the statement LOAD [DATA] followed by several statements that describe the data to be loaded. Only comments or the OPTIONS statement can optionally precede the LOAD [DATA] statement.
Syntax element
Definition
control_file
::= [option_part]
load_data_part
into_table_part
fields
column list
option_part
::= OPTIONS (options)
options
::= option [, option]
option
::= [SKIP = int literal] [ERRORS = int literal]
load_data_part
::= LOAD [DATA] [characterset_specification]
[DATE date_mask]
[TIME time_mask]
[TIMESTAMP timestamp_mask]
[INFILE filename]
[PRESERVE BLANKS]
characterset_specification
::= CHARACTERSET { NOCONVERT |
NOCNV |
ANSI |
MSWINDOWS |
PCOEM |
UNICOMPC |
SCAND7BIT }
into_table_part
::= INTO TABLE tablename
fields
::= [FIELDS {termination | enclosure}]
termination
::= TERMINATED BY termination_char [[OPTIONALLY] enclosure]
termination_char
::= WHITESPACE | 'char' | "char" | hex literal
enclosure
::= ENCLOSED BY enclose_char [AND enclose_char]
enclose_char
::='char' | "char" | hex literal
hex_literal
::= X'hex byte string'
column_list
::= column [, column]
column
::= column name datatype spec
[POSITION (int literal {: | -} int literal)]
[DATE date mask]
[TIME time mask]
[TIMESTAMP timestamp mask]
[NULLIF BLANKS | NULLIF NULLSTR| NULLIF 'string' | NULLIF ((int literal {: | -} int literal) = 'string')]
datatype_spec
::= {BINARY | CHAR [(length) ] | DATE |
DECIMAL [(precision [ , scale ])] |
DOUBLE [PRECISION] | FLOAT [(precision)] | INTEGER |
LONG VARBINARY | LONG VARCHAR |
NUMERIC [( precision[,scale ])] |
REAL | SMALLINT | TIME |
TIMESTAMP [ ( timestamp precisionv ) ] |
TINYINT | VARBINARY | VARCHAR [ (length ) ] }
See also
Reserved words
solidDB® Speed Loader (solloado and solload)
Reserved words
The solidDB® Speed Loader reserved words must be enclosed in double quotation marks if they are used as data dictionary objects, that is, table or column names. The following list contains all reserved words for the solidDB® Speed Loader control file:
AND
ANSI
APPEND
BINARY
BLANKS
BY
CHAR
CHARACTERSET
DATA
DATE
DECIMAL
DOUBLE
ENCLOSED
ERRORS
FIELDS
FLOAT
UNICOMPC
INFILE
INSERT
INTEGER
INTO
LOAD
LONG
MSWINDOWS
NOCNV
NOCONVERT
NULLIF
NULLSTR
NUMERIC
OPTIONALLY
OPTIONS
PCOEM
POSITION
PRECISION
PRESERVE
REAL
REPLACE
SCAND7BIT
SKIP
SMALLINT
TABLE
TERMINATED
TIME
TIMESTAMP
TINYINT
VARBIN
VARCHAR
WHITESPACE
CHARACTERSET keyword in solidDB® Speed Loader
The CHARACTERSET keyword is used to define the character set used in the input file. If the CHARACTERSET keyword is not used or if it is used with the parameter NOCONVERT or NOCNV, no conversions are made.
Use the parameter as follows:
ANSI for the ANSI character set
MSWINDOWS for the Windows character set
PCOEM for the ordinary PC character set
UNICOMPC for the UNICOM PC character set
SCAND7BIT for the 7-bit character set containing Scandinavian characters
DATE, TIME, and TIMESTAMP keywords in solidDB® Speed Loader
The DATE, TIME, and TIMESTAMP keywords can be used in two places with different functionality:
When a keyword is used as a part of the load-data-part element, it defines the format used in the import file for inserting data into any column of that type.
When a keyword appears as a part of a column definition, it specifies the format used when inserting data into that column.
Note Masks that are used as part of the load-data-part element must be in the following order: DATE, TIME, and TIMESTAMP. Each is optional.
Data must be of the same type in the import-file, the mask, and the column in the table into which the data is loaded.
Data type
Available data masks
DATE
YYYY/YY-MM/M/B-DD/D
TIME
HH/H:NN/N:SS/S
TIMESTAMP
YYYY/YY-MM/M/B-DD/D HH/H:NN/N:SS/S
Mask parts:
– Year masks: YYYY and YY
– Month masks: MM, M, and B (B refers to a three-letter abbreviation (case insensitive) of the month in English)
– Day masks: DD and D
– Hour masks: HH and H
– Minute masks: NN and N
– Second masks: SS and S
Masks within a DATE mask may be in any order; for example, the DATE mask could be MM-DD-YYYY (12-18-2010) or DD-B-YYYY (18-DEC-2010).
If the date data of the import file is formatted as 1995-01-31 13:45:00, use the mask YYYY-MM-DD HH:NN:SS.
The masks must be separated.
DATE example in Control File
The following example uses the POSITION keyword.
OPTIONS(SKIP=1)
LOAD DATA
RECLEN 12
INTO TABLE SLTEST2
(
    ID POSITION(1:2) NULLIF BLANKS,
    DT POSITION(3:12) DATE 'DD.MM.YYYY' NULLIF ((4:6) = ' ')
)
DATE, TIME, and TIMESTAMP examples in Control File
The following example uses the FIELDS TERMINATED BY keyword.
LOAD
DATE 'MM/DD/YY'
TIME 'HH-NN-SS'
TIMESTAMP 'HH.NN.SS YY/MM/DD'
INTO TABLE SLTEST3
FIELDS TERMINATED BY ','
(
    ID,
    DT, TM, TS
)
PRESERVE BLANKS
The PRESERVE BLANKS keyword is used to preserve all blanks in text fields.
INTO_TABLE_PART
The into_table_part element is used to define the name of the table and columns that the data is inserted into.
FIELDS ENCLOSED BY
The FIELDS ENCLOSED BY clause defines delimiting characters around each field. The delimiter can be a single character or two separate characters that precede and follow each data field in the input file. You might use a single character (such as the double quotation mark character) or a pair of characters (such as left and right parentheses) to delimit your fields. If you use double quotation marks as the delimiter and the comma as the terminator/separator, your input might look like the following:
"field1", "field2"
If you use left and right parentheses, your input might look like the following:
(field1),(field2)
If the keyword OPTIONALLY is used, the delimiters are optional and do not need to appear around every single piece of data.
If you specify a character value, it must be enclosed in single or double quotation marks. For example, the following examples have the same effect:
ENCLOSED BY '(' AND ')' ENCLOSED BY "(" AND ")"
You can even use the single quotation marks to surround one enclosing character and double quotation marks to surround the other, for example: ENCLOSED BY '(' AND ")"
Because using two conventions is potentially confusing, it is not recommended. Instead, use single quotation marks unless you are using a single quotation mark itself as the enclosing character, for example: ENCLOSED BY "'" AND "'"
If you are using single quotation marks as the enclosing characters, you must double the apostrophes as shown in the clause above. For example, to produce ‘Didn’t I warn you?’ in the database, the input must be as follows:
'Didn''t I warn you?'
Almost any printable characters can be used as the “enclosing” characters. The enclosing characters can also be specified using the hexadecimal format. For example, if a hexadecimal string is used, the format is:
X 'hex_byte_string'
For example, X'3a' means 3A hexadecimal value and specifies the colon (:).
The opening and closing characters in an enclosing pair can be identical. For example, the following is valid inside the control file:
ENCLOSED BY '"' AND '"'
If both the opening and closing characters are the same, the ENCLOSED BY clause needs to show the character only once. For example, the following clauses have the same effect:
ENCLOSED BY '"' ENCLOSED BY '"' AND '"'
The following examples show the input in the control file and the corresponding values stored in the table:
"Hello." Hello.
"""Ouch!"", he cried." "Ouch!", he cried.
"""He said her last words were ""I'll never quit!""""" "He said her last words were "I'll never quit!""
"""He said: ""Her last words were ""I'll never quit!""""""" "He said: "Her last words were "I'll never quit!"""
You can use enclosing characters in the column data itself (embedded field separators). If you use embedded field separators, you can use the TERMINATED BY clause together with the OPTIONALLY ENCLOSED BY clause to ensure that the column data is enclosed correctly.
ENCLOSED BY input rules and examples
This section contains basic rules and examples when using enclosing characters. Each example, unless stated otherwise, contains the following control file lines:
FIELDS TERMINATED BY X'3a' OPTIONALLY ENCLOSED BY "(" AND ")"
The enclosing characters are parentheses and the separator (terminator) character is the colon — hexadecimal 3A specifies the colon (:).
The data is to be loaded into a table with two columns. The first column is of type VARCHAR and the second of type INTEGER.
Treatment of enclosed characters within the data
The ENCLOSED BY characters themselves can occur within the data. However, when occurring within the data, each of the enclosing characters needs to occur twice in the input for each occurrence in the database.
If the input file contains (David Bowie ((born David Jones)) released 'Space Oddity"):1972, it produces the following format in the database:
David Bowie (born David Jones) released 'Space Oddity":1972
Deeply nested parentheses work the same way. If the input file contains (You((can((safely((try))this))at))home.):2, it produces the following value in the first column of the table:
You(can(safely(try)this)at)home.
Treatment of final enclosing character
The final enclosing character must occur an odd number of times at the end of the input. For example to get the format American Pie (The Day The Music Died) in the database, the input file must contain the following:
(American Pie ((The Day The Music Died)))
Of the last three closing parentheses, the first two are treated as a single instance of the character, while the last one is treated as the enclosing character.
Embedding newline characters
When enclosing characters are used, newline characters (carriage return or line feed) can be embedded within a string. For example:
(This long line that can be split across two or more input
lines ((and keep the end-of-line characters)) if the enclosing
characters are used):1
If the field separator (colon) is not used in the data and if there is no need to preserve new lines in the input data, only the field separator (not the enclosing characters) is required in the input data.
If your data is fixed-width, you do not need either the separator or the enclosing characters.
FIELDS TERMINATED BY
The FIELDS TERMINATED BY clause is used to define the separator character that distinguishes where fields end in the input file. The character must be specified in one of the following three ways:
Surrounded by double quotation marks, for example: ":"
Surrounded by single quotation marks, for example: ':'
In hexadecimal format, for example: X'3A'
When using hexadecimal format, the quotation marks must be single quotation marks, not double quotation marks.
The FIELDS TERMINATED BY clause specifies a separator, not a true terminator. The specified character is not required after the last field. For example, if the colon is the separator, the following two data file formats are equivalent and valid:
1:2:3:
or
1:2:3
The trailing colon after the final field is accepted but not required.
The OPTIONALLY ENCLOSED BY clause is used after the FIELDS TERMINATED BY clause when the character used to enclose the column data is contained in the column data itself. Following is a control file example:
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY "'"
In the example above, the separator is a comma.
The single quotation mark is defined as the character that encloses embedded field separators (commas) in the data file. The OPTIONALLY ENCLOSED BY clause may use either single or double quotation marks to delimit the enclosing characters.
For example:
OPTIONALLY ENCLOSED BY '('AND")"
The above illustrates the use of both single and double quotation marks for enclose_char in the syntax:
ENCLOSED BY enclose_char [AND enclose_char]
The following example summarizes the use of separators and enclosing characters. In this example, the : (colon) is defined as the separator (FIELDS TERMINATED BY) and the parentheses are used to enclose the : (colon), which is embedded in the field and cannot be interpreted as a separator. The example also contains two fields, the first of which is VARCHAR and the second of which is INTEGER.
Data file example
(This colon : is enclosed by parentheses and is not a separator):12345
Control file example
LOAD DATA
CHARACTERSET MSWINDOWS
INFILE 'test6.dat'
INTO TABLE SLTEST
FIELDS TERMINATED BY X'3a' -- X'3a' == ':'
OPTIONALLY ENCLOSED BY '(' AND ")"
(
   TEXT,
   ID
)
POSITION
The POSITION keyword is used to define position of a field in the logical record. Both the start and the end position must be defined.
NULLIF
The NULLIF keyword is used to give a column a NULL value if the appropriate field has a specified value. An additional keyword specifies the value the field must have. The keyword BLANKS sets a NULL value if the field is empty; the keyword NULL sets a NULL value if the field is the string “NULL”; the definition string sets a NULL value if the field matches the string 'string'; the definition '((start : end) = 'string')' sets a NULL value if a specified part of the field matches the string 'string'.
Using NULLIF keyword with keyword BLANKS
The following example shows the use of the NULLIF keyword with the keyword BLANKS to set a NULL value if the field is empty. It also shows the use of the keyword NULL to set a NULL value if the field is the string “NULL”.
LOAD
INFILE 'test7.dat'
INTO TABLE SLTEST
FIELDS TERMINATED BY ','
(
   NAME  VARCHAR NULLIF BLANKS,
   ADDRESS VARCHAR NULLIF NULL,
   ID   INTEGER NULLIF BLANKS
)
Using NULLIF keyword with keyword BLANKS
The following example uses the definition '((start : end) = 'string')' for the third field in the input file. This syntax only works with fixed-width fields because the exact position of the 'string' must be specified.
LOAD
INFILE '7b.dat'
INTO TABLE t7
(
   NAME CHAR(10) POSITION(1:10) NULLIF BLANKS,
   ADDRESS CHAR(10) POSITION(11:20) NULLIF NULL,
   ADDR2 CHAR(10) POSITION(21:30) NULLIF((21:30)='MAKEMENULL')
)
Note that in this example, the string is case‑sensitive: that is, 'MAKEMENULL' and 'makemenull' are not equivalent.
See also
Control file syntax