Implementing Sync Pull Notify

solidDB implements Sync Pull Notify as a two-step process. First, the master database notifies the replica database; second, the replica database does a refresh operation.

Generally, the master database notifies the replica database by calling a remote stored procedure. The replica database must already have created this stored procedure. Normally, this procedure will itself contain the appropriate REFRESH statement(s). (It is possible to use more indirect methods, but this approach is the shortest and simplest.)

One way to induce the master to call the remote stored procedure, is to create a trigger. For example, if the table named EMPLOYEES on a replica should be refreshed every time the table is changed in the master database, you could create INSERT, UPDATE, and DELETE triggers on the EMPLOYEES table in the master database. Each trigger would call the remote stored procedure (on the replica) that requests a refresh of the data.

Although the process of notifying a replica database is straightforward, there is no automated way to decide which replica databases should be notified. Also, the person who creates or updates the triggers on the master database must know the name of the stored procedure to call on each replica database that should be notified.

Similarly, the replica database must have REFRESH statements that specify the appropriate publications. There is no automated way of having the master database determine which replica databases subscribe to which publications and then automatically create triggers to call procedures on those replica databases. (This would not be sufficient anyway because the master database would also have to compose the appropriate stored procedure and force the replica database to create that stored procedure.)

Whether you use a trigger or a START AFTER COMMIT or some other method to notify replica databases that data in the master database has been updated, it is important to remember that only the specified servers are notified.

If you use a mechanism to run synchronization very frequently, make sure that you are aware of the possible performance consequences. Because synchronization messaging is based on a store & forward architecture (that writes data to disk before sending it over the network), there is always some overhead related to synchronization messaging.

A solidDB server is capable of performing hundreds or thousands of update operations per second but it might be able to handle only a few dozen synchronization messages per second. If you write an update trigger into a frequently updated table to initiate synchronization for each occurred update, the update performance is limited by the synchronization messaging performance. Instead of achieving performance of 1000 updates per second, you might be able to run only ten updates per second because every update causes a synchronization message that contains only one row of changed data.

The remedy for this issue is to make the synchronization occur only, for example, every few seconds. This way, each synchronization message contains more than one row and the overhead of the synchronization message is only a fraction of the worst case scenario.

To understand the DEFAULT keyword in remote stored procedures, you must understand both remote stored procedures and the START AFTER COMMIT statement.

A START AFTER COMMIT statement can contain the following optional WHERE clause:

FOR EACH REPLICA [WHERE ... ];

The WHERE clause is used to find replicas with given property names and values. For information how to identify replica databases using property names, see Replica property names.

or, to apply the statement to all replicas:

START AFTER COMMIT FOR EACH REPLICA
UNIQUE CALL my_proc;

The server that is executing this statement effectively creates a list of all the replica databases that match the WHERE clause and then calls the specified procedure (my_proc) for each replica database on the list. While that stored procedure is running, the DEFAULT clause identifies the replica database that is currently being processed. For example, suppose that there are three replica databases that match the WHERE clause, that is, three replicas for which region = 'west':

The my_proc procedure is then called three times. During the first call, DEFAULT is equal to "California"; during the second call, DEFAULT is equal to "Oregon"; and during the third call, DEFAULT is equal to "Washington". If the stored procedure named my_proc wants to call a remote stored procedure (remote_proc) on each of the three replica databases, it can do so by using the syntax:

Each time that my_proc is called, DEFAULT is set to the name of one of the three replica databases, so the effect is to call each of the following procedures:

If remote_proc contains a statement to refresh from the master, then the effect will be that all three replica databases are notified that there is updated data and they should refresh. Thus, by using a combination of START AFTER COMMIT and remote procedure calls, you can notify every replica database that needs to refresh the data.