Routing SELECT statements
Note Check the solidDB Release Notes for any limitations that are associated with using a grid in the current release.
In a solidDB Grid environment, you can partition table data across several grid nodes. When an SQL SELECT statement is executed on a partitioned table in one node (Node 1 in the following diagram), a distributed query runs. The distributed query involves the following steps:
1. The query is optimized (in the node where the query is executed) and a distributed query plan is created.
2. Query fragments are sent to all grid nodes that contain a primary replication unit for a partition of that table.
3. The results from each grid node are returned to Node 1, where they are consolidated, grouped, and ordered to produce the final result.
Note The isolation level must be set to READ COMMITTED to run distributed queries; see Setting the isolation level.
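For example, the following statement is one way to set the isolation level for the current session (a minimal sketch; see Setting the isolation level for the transaction-level and configuration-file alternatives):
-- Set the session isolation level to READ COMMITTED
-- (one of several ways to set it; see Setting the isolation level)
SET ISOLATION LEVEL READ COMMITTED;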
The following diagram shows an example of a distributed query execution:
[Diagram: an example of a distributed query execution across grid nodes]
Running queries in a grid environment
When you run queries in a grid, you should be aware of the following factors:
You can restrict a query to run on a local node
If you want to restrict an SQL read query to run on the local node, run the following statement before executing the query:
SET GRID READ LOCAL
To restore the default behavior where SQL read queries are distributed, run the following statement:
SET GRID READ DEFAULT
For more information, see SET GRID READ.
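For example, a session might switch between local and distributed reads as follows (a minimal sketch; TGRID_200 is the partitioned table from the example later in this section):
-- Restrict reads to the local node, query, then restore distributed reads
SET GRID READ LOCAL;
SELECT i FROM TGRID_200 ORDER BY i;  -- returns only locally stored rows
SET GRID READ DEFAULT;
SELECT i FROM TGRID_200 ORDER BY i;  -- distributed read across all grid nodes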
You can restrict the size of BLOB data returned in the result set
You can use the parameter Grid.DistributedReadMaxBlobSize to define the maximum BLOB size that can be accommodated in a distributed read query; see Grid section.
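For example, the limit might be set in the [Grid] section of the solid.ini configuration file (an illustrative sketch; the value shown and its unit are assumptions, see Grid section for the actual default and unit):
[Grid]
; Illustrative value only: cap BLOB sizes in distributed reads
; (assumed to be in bytes; see the Grid section for details)
DistributedReadMaxBlobSize=1048576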
If BLOB data is part of the result set and the size limit is reached, the statement fails and no data is returned. Depending on the scenario, one of the following error messages is returned:
13641: "Distributed read fails. Blob size too large. Size actual_blob_size, max defined max_blob_size"
13642: "ODBC failure when executing a distributed read. odbc_error_msg, code error_code"
13643: "Connection failure during distributed read. failed_function" (Typically, the grid node setup is not correct).
SELECT FOR UPDATE statements are not supported
The following error (or similar) is returned if a statement might cause write operations:
SQLPrepare failed 'SOLID Table Error 13633: In Grid writes are allowed only to a local database.', naterr 13633
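For example, a locking read such as the following fails at prepare time, because it could acquire write locks on rows stored on remote nodes (TGRID_200 is the partitioned table from the example later in this section):
-- Rejected with error 13633: FOR UPDATE implies writes, which must be local
SELECT i FROM TGRID_200 WHERE i = 5 FOR UPDATE;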
Other errors might occur when accessing remote nodes
The following error situations might occur:
If a remote node cannot be reached, an empty result set is returned to the requesting node and the SQL statement fails with the following error:
13643: Connection failure during distributed read: internal_message
If any ODBC errors occur during remote node access, an error is returned without any results:
13642: "ODBC failure when executing a distributed read. odbc_error_msg, code error_code"
Inconsistent query results can be caused by changes in the grid configuration or by the temporary absence of a primary replication unit for a partition.
Query response times
You should not expect to see a significant improvement in the response time of a distributed query in a grid environment compared with that of a similar query in a standalone environment. Although any valid read query is allowed, the grid is not optimized for improving the performance of complicated SQL queries.
Example
A grid is set up with four nodes (one leader and three followers).
A partitioned table is created by executing the following statement:
CREATE TABLE TGRID_200 (I INTEGER NOT NULL, J INTEGER NOT NULL)
PARTITION BY (I)
REPLICATION FACTOR 3;
The replication factor for the table is set to 3, meaning that each row of data is stored in the primary replication unit for its partition and in two secondary replication units.
Ten rows are inserted into the grid, with column I set to values from 0 to 9. The rows are partitioned by using column I as the partitioning key.
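The following statements are a minimal sketch of how the rows might have been inserted; the values chosen for column J are arbitrary, because only column I is used as the partitioning key:
-- Illustrative inserts; only column I (the partitioning key) affects routing
INSERT INTO TGRID_200 (I, J) VALUES (0, 0);
INSERT INTO TGRID_200 (I, J) VALUES (1, 1);
INSERT INTO TGRID_200 (I, J) VALUES (2, 2);
INSERT INTO TGRID_200 (I, J) VALUES (3, 3);
INSERT INTO TGRID_200 (I, J) VALUES (4, 4);
INSERT INTO TGRID_200 (I, J) VALUES (5, 5);
INSERT INTO TGRID_200 (I, J) VALUES (6, 6);
INSERT INTO TGRID_200 (I, J) VALUES (7, 7);
INSERT INTO TGRID_200 (I, J) VALUES (8, 8);
INSERT INTO TGRID_200 (I, J) VALUES (9, 9);
COMMIT WORK;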
The following table shows the result of running local queries on each node (to return the data located on the individual nodes), and the result of a distributed query (that returns and consolidates data from all nodes):
 
Target                    SQL statements                       Result
------                    --------------                       ------
Leader node (7 rows)      SET GRID READ LOCAL;                 I: 0, 1, 2, 3, 4, 7, 9
                          SELECT i FROM TGRID_200 ORDER BY i;
Follower 1 (8 rows)       SET GRID READ LOCAL;                 I: 0, 3, 4, 5, 6, 7, 8, 9
                          SELECT i FROM TGRID_200 ORDER BY i;
Follower 2 (7 rows)       SET GRID READ LOCAL;                 I: 0, 1, 2, 3, 5, 6, 8
                          SELECT i FROM TGRID_200 ORDER BY i;
Follower 3 (8 rows)       SET GRID READ LOCAL;                 I: 1, 2, 4, 5, 6, 7, 8, 9
                          SELECT i FROM TGRID_200 ORDER BY i;
All grid nodes (10 rows)  SET GRID READ DEFAULT;               I: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
                          SELECT i FROM TGRID_200 ORDER BY i;
Note Because of the additional replica copies of the partitioned data, the last query in the table returns the same results even if one of the nodes is unavailable.
Query execution plans
The following examples show the execution plans for a distributed query and a local query:
Distributed query
solsql> SET GRID READ DEFAULT;
Command completed successfully, 0 rows affected.
solsql> EXPLAIN PLAN FOR SELECT * FROM TGRID_200;
      ID  UNIT_ID  PAR_ID  JOIN_PATH  UNIT_TYPE    INFO
      --  -------  ------  ---------  ---------    ----
       1        1       0          2  JOIN
       2        2       1          0  TABLE        TGRID_200
       3        2       1          0  SCAN TABLE
       4        2       1          0  DISTRIBUTED
      4 rows fetched.
Local query
solsql> SET GRID READ LOCAL;
Command completed successfully, 0 rows affected.
solsql> EXPLAIN PLAN FOR SELECT * FROM TGRID_200;
      ID  UNIT_ID  PAR_ID  JOIN_PATH  UNIT_TYPE    INFO
      --  -------  ------  ---------  ---------    ----
       1        1       0          2  JOIN
       2        2       1          0  TABLE        TGRID_200
       3        2       1          0  SCAN TABLE
      3 rows fetched.
The DISTRIBUTED unit that appears in the distributed plan, and is absent from the local plan, indicates that query fragments are sent to the other grid nodes.
For details of the EXPLAIN PLAN FOR statement, see EXPLAIN PLAN FOR.