The internal part of the solidDB server that takes care of storing in-memory tables (M-tables) is called the main-memory engine (MME). In addition to the actual data, the indexes for M-tables are also built in the main memory. solidDB uses a main-memory-optimized index technology, called tries, to implement the indexes.
The basic index structure in the in-memory engine is a VTrie (variable length trie) that is an optimized variation of the trie. A trie (from retrieval), is a multi-way tree structure that is widely used for storing strings. All strings that share a common prefix hang off a common node. For example, when the strings are words that use the {a..z} alphabet, a node has at most 27 children: one for each letter plus a terminator. VTrie uses a bitwise tree where a key is composed from individual bits, which allows keys to be any supported data type. VTrie uses nodes of a capacity of 8 bits. Consequently, each node has at most 257 children, that is, the fan-out is 257 (256 bits plus a terminator).
A simplified example of the VTrie structure with a node capacity of 2 bits and fan-out of four is shown in the following diagram.
The elements in a string can be recovered by using a scan from the root to the leaf nodes that ends a string. All strings in the VTrie can be recovered by a depth-first browse of the tree.
A competitive solution to VTrie is a binary search tree. In a binary tree, the node fan-out is two. In each node, you compare a full key value against a node separation value and then choose one of the two children to continue with.
The main advantages of VTries over binary search trees are as follows:
▪ Looking up keys is faster. Looking up a key of length m takes a length of time that is proportional to m. A binary search tree requires log2(n) comparisons of keys where n is the number of elements in the tree. The total search time is proportional to m log2(n). The advantage of VTrie is because no value comparisons are needed. Each part of a key (a "letter") is applied as an array index to a pointer array of a child node. Unlike a value comparison, array lookup is a fast operation if the array is cached in processor caches.
▪ Tries can require less space when they contain a large number of short strings because the keys are not stored explicitly and nodes are shared between keys with common prefix.
Several optimizations are used in VTrie to speed up retrieval when the key value space is not fully exhausted, as illustrated in the previous diagram. These are path compression, width compression, and fast termination:
▪ In path compression, all internal nodes with only one child are removed and a common prefix is stored in the remaining node.
▪ In width compression, only the required pointers are stored in the nodes and every node contains a bitmap that stores the information about which pointers are present in the node.
▪ In fast termination, a pointer to the data record is elevated to a node that represents a prefix that is not shared among the key values.