Working with Unicode
solidDB® supports the Unicode standard, which makes it possible to encode the characters used in the major languages of the world. To use Unicode-encoded data, you do not need any non-standard or solidDB®-specific implementations for application development; the standard ODBC API or JDBC API can be used, as well as the solidDB® tools. solidDB® also supports heterogeneous multi-client environments in which each application can be set to use a different encoding.
Unicode database modes
Starting with version 6.5, solidDB® databases can be created in one of two modes: Unicode mode or partial Unicode mode. The two modes differ in how character data types (CHAR, VARCHAR and so on) are encoded in the solidDB® server. Wide character data types (WCHAR, WVARCHAR and so on) are Unicode encoded in both modes.
Unicode mode
In the Unicode mode, the internal representation for character data types is UTF-8.
The internal representation for wide character data types is UTF-16.
Partial Unicode mode
In the partial Unicode mode, the internal representation for character data types uses no particular encoding; instead, the data is stored in byte strings with the assumption that user applications are aware of this and handle the conversion as necessary.
The internal representation for wide character data types is UTF-16.
The databases created with solidDB® version 6.3 or earlier are of the partial Unicode type.
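The following sketch illustrates why applications must handle the conversion themselves in partial Unicode mode. It does not use solidDB® at all: the `stored` value simply stands in for the bytes of a character column, and the `read_back` helper and the chosen encodings are assumptions made for illustration.

```python
def read_back(stored: bytes, assumed_encoding: str) -> str:
    """Interpret raw column bytes using the encoding the application assumes."""
    return stored.decode(assumed_encoding)


# One application writes "café" using Latin-1; the server keeps the bytes as-is.
stored = "caf\u00e9".encode("latin-1")

# An application that assumes the same encoding gets the original text back.
same_assumption = read_back(stored, "latin-1")

# An application that assumes a different single-byte code page silently
# gets different text from the very same bytes.
other_assumption = read_back(stored, "cp437")

# An application that assumes UTF-8 cannot decode the bytes at all.
try:
    read_back(stored, "utf-8")
    utf8_readable = True
except UnicodeDecodeError:
    utf8_readable = False

print(same_assumption, other_assumption, utf8_readable)
```

Because the bytes carry no record of their encoding, every application that shares a partial Unicode database has to agree on one encoding or convert explicitly.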
Important: The default database mode in 6.5 is partial Unicode.
Note: Unicode applications can be built on both Unicode and partial Unicode databases. However, the instructions in this section assume that the database uses the Unicode mode.
Key features of solidDB® Unicode databases
Storing and retrieving of Unicode data
The internal representation of Unicode data is based on the UTF-8 and UTF-16 encodings. Data in wide character column types is represented internally in UTF-16, and data in character column types is represented in UTF-8.
This means that both single-byte and multi-byte data can be stored in character column types. If you expect mainly multi-byte data, you can save space by storing it in wide character column types instead; for example, most CJK characters take three bytes in UTF-8 but only two in UTF-16.
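To make the space trade-off concrete, the sketch below compares how many bytes a few sample strings occupy in each encoding. It measures only the encoded text; the actual storage used by the server may include additional overhead. The `encoded_sizes` helper and the sample strings are assumptions for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes the text occupies in UTF-8 and in UTF-16."""
    return {
        "utf8": len(text.encode("utf-8")),
        # "utf-16-le" is used so that no byte order mark is counted.
        "utf16": len(text.encode("utf-16-le")),
    }


samples = {
    "ascii": "solidDB",         # 7 characters, all ASCII
    "latin": "Z\u00fcrich",     # 6 characters, one of them non-ASCII
    "cjk": "\u65e5\u672c\u8a9e",  # 3 CJK characters
}

for label, text in samples.items():
    print(label, encoded_sizes(text))
```

ASCII-heavy text is about half the size in UTF-8, while text made up mostly of CJK characters is smaller in UTF-16, which is the case where wide character columns pay off.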
No restrictions on the encoding used in the applications
The solidDB® ODBC and JDBC drivers handle the conversion of data between the application encoding and the UTF-8 or UTF-16 format used in the solidDB® server.
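Conceptually, this conversion is a transcoding step in each direction. The sketch below is not the drivers' implementation; it only illustrates the idea with Python's built-in codecs. The function names, the `wide` flag, and the little-endian byte order chosen for UTF-16 are all assumptions.

```python
def to_server_form(app_bytes: bytes, app_encoding: str, wide: bool) -> bytes:
    """Transcode application bytes to UTF-16 (wide columns) or UTF-8."""
    text = app_bytes.decode(app_encoding)
    return text.encode("utf-16-le" if wide else "utf-8")


def to_app_form(server_bytes: bytes, app_encoding: str, wide: bool) -> bytes:
    """Transcode server bytes back to the encoding the application uses."""
    text = server_bytes.decode("utf-16-le" if wide else "utf-8")
    return text.encode(app_encoding)


# Two hypothetical clients with different encodings write the same text.
text = "\u65e5\u672c\u8a9e"
from_sjis_client = to_server_form(text.encode("shift_jis"), "shift_jis", wide=False)
from_utf16_client = to_server_form(text.encode("utf-16-le"), "utf-16-le", wide=False)

# Both arrive at the same server-side representation.
print(from_sjis_client == from_utf16_client)

# Reading back returns the bytes in the requesting client's own encoding.
print(to_app_form(from_sjis_client, "shift_jis", wide=False))
```

Because both clients end up with the same server-side bytes, applications using different encodings can share the same data.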
Standard ODBC API and JDBC API available for application development
There are no non-standard or solidDB®-specific requirements for application development; standard ODBC API or JDBC API can be used.
See also
What is Unicode?
Designing Unicode databases
Using solidDB® tools with Unicode
Compatibility between Unicode and partial Unicode databases
Developing applications for Unicode