Design the logical database

The data modeling of a multi-database system is slightly different from that of a centralized system. The tentative nature of replica data as well as the possible coexistence of multiple different versions of the same data item (row) must be handled properly in the logical data model. Therefore, there are some rules of thumb to consider when designing the logical database of a multi-database system.

All write operations that are executed in the master database must be successful. A “unique constraint violation” causes the entire synchronization process to halt. Therefore the primary keys (and unique indices) of the rows must be unique throughout the entire system. It is strongly recommended, that globally unique, surrogate primary key values are used in all tables of the database. This kind of key can for instance be a combination of database ID and a sequence number. For example:

CREATE TABLE CUST_ORDER
DB_ID VARCHAR NOT NULL, ID INTEGER NOT NULL,
... other columns of the table,
(PRIMARY KEY (DB_ID, ID));

The easiest method of detecting update conflicts is to use an “updatetime” column in each table of the system. Whenever a row is updated, the current (that is, pre-update) value of the updatetime column is appended to the WHERE clause. If the row is not found, it means that the updatetime has changed, that is, someone else has updated the row causing a conflict. This mechanism is known as “optimistic locking.”

It is always possible that an error can occur during synchronization. The error may be an application-level error such as update conflict in the master database. If such error cannot be automatically resolved, it should be logged for manual resolution.

One way to implement this is to create a synchronization error log table that contains an entry about each error. Whenever an application-level error occurs during synchronization, the stored procedures of the transactions should insert a row to the error log table. For a suggestion on how to create an error log, see Creating a synchronization error log table for an application.

The error may also be server-level error that halts the processing of the synchronization message. In this case, the error recovery is done at messaging level. For details, see Administering advanced replication systems.

The usage pattern of master and replica databases can be very different. The queries performed in the master database can also be very different from those of replica databases.

Due to these facts, the indexing of the databases should be carefully designed, bearing in mind the different usage patterns. The indexing of replicas should follow the requirements of the applications that are using the database. When designing an index for a master database, note that unique indexes must be globally unique. Also consider the following index guidelines for publications and transactions.

Be sure to index queries that are derived from publication definitions. solidDB^® treats nested publications as joins. To make the publication operation efficient, the joining columns of the tables of the publication must be indexed.

In the example that follows, CUSTOMER and SALESMAN tables are joined together using the SALESMAN_ID column of the CUSTOMER table. To allow efficient execution of refreshes from this publication, index the SALESMAN_ID column using a secondary index.

CREATE PUBLICATION
  pub_customers_by_salesperson (sperson_area varchar)
BEGIN
  RESULT SET FOR salesman
  BEGIN
    SELECT * FROM salesman where area = :sperson_area
    DISTINCT RESULT SET FOR customer
    BEGIN
      SELECT * FROM customer WHERE salesman_id = salesman.id
   END
  END
END;

In addition to the indices created by the user, solidDB^® automatically creates two system indices for tables that have the SYNCHISTORY property set on. The same indices are also automatically created for the history tables of the main tables.

The write load of the master database sets the practical limits to the scalability of a advanced replication system. In a advanced replication system, all propagated transactions are eventually committed in the master database. Each index causes additional disk I/O in all write operations (insert, update, delete) to that table. For this reason, minimize the number of secondary indexes in the master database if the write performance is a critical factor.