You should design a database schema that supports running batch inserts in primary key order. Data in the database file is stored physically in the order defined by the primary key of the table. If no primary key is defined, data is stored in the order in which it is written to the database. Database operations (reads and writes) always access data at the page level. The page size (block size) of the database is defined with the IndexFile.BlockSize parameter (see the IndexFile section).
If batch write operations are performed in primary key order, the caching algorithms of the server can group the database file write operations, so that many rows are written to disk in a single physical disk I/O operation. In the worst case, when the insert order differs from the primary key order, each insert or delete operation requires rewriting a database page on which only one row has changed.
For these reasons, ensure that the primary keys of the tables involved in a batch write operation match the access order of that operation. Such a schema can make a significant difference in the performance of the operation.
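The practical consequence for client code is simply to sort the batch by the primary key before inserting. A minimal sketch of this, using Python's standard sqlite3 module as a stand-in for the actual server (the table name and row values are illustrative):

```python
import sqlite3

# Hypothetical batch of rows; the first element of each tuple is the
# primary key (EVENT_ID), and the batch arrives unsorted.
rows = [(3, 200, "c"), (1, 100, "a"), (2, 150, "b")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE USAGE_EVENT ("
    " EVENT_ID INTEGER NOT NULL PRIMARY KEY,"
    " DEVICE_ID INTEGER NOT NULL,"
    " EVENT_DATA VARCHAR NOT NULL)"
)

# Sort the batch by the primary key so that consecutive inserts land on
# consecutive database pages, letting the server's cache group the
# page writes.
rows.sort(key=lambda r: r[0])
conn.executemany("INSERT INTO USAGE_EVENT VALUES (?, ?, ?)", rows)
conn.commit()
```

The sort is cheap relative to the disk I/O it saves, because it is performed once in memory over the batch rather than once per page rewrite.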
For example, create a table with the following statement:
CREATE TABLE USAGE_EVENT (
    EVENT_ID INTEGER NOT NULL PRIMARY KEY,
    DEVICE_ID INTEGER NOT NULL,
    EVENT_DATA VARCHAR NOT NULL);
In this table, EVENT_ID is a sequence number. The insert and delete operations are done in the order specified by the EVENT_ID column, allowing for maximum efficiency.
Batch write performance on this same table would be significantly worse if the first column of the primary key were DEVICE_ID but data were still written in EVENT_ID order. In that scenario, the number of file I/O operations needed to complete the batch write operation grows as the table grows.
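The effect of insert order can be illustrated with a toy model of page-level writes. This is not the server's actual caching algorithm, only a sketch of the idea: rows live a fixed number to a page in primary key order, and consecutive inserts that hit the same page are assumed to coalesce into one physical write, while a page switch forces a flush.

```python
def physical_writes(insert_order, rows_per_page=4):
    """Count physical page writes for a batch of integer keys,
    assuming consecutive inserts to the same page coalesce."""
    writes = 0
    last_page = None
    for key in insert_order:
        page = key // rows_per_page  # page the key's row belongs to
        if page != last_page:        # page switch: previous page flushed
            writes += 1
            last_page = page
    return writes

# Inserting in key order touches each page once: 16 rows, 4 pages.
ordered = list(range(16))

# Inserting the same keys interleaved (as when the leading key column
# does not match the arrival order) switches pages on every insert.
interleaved = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
```

In this model the ordered batch costs 4 page writes and the interleaved batch costs 16, and the gap widens as the table (and therefore the number of pages) grows, matching the behavior described above.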