MDM integration

When you use the Case Data Model (CDM) to connect to the case data, you can connect with or without metadata. The metadata can be in the form of a Metadata Document (.mdd) file or a data format for which an MDSC is available. If the metadata is in an .mdd file, the Metadata Model (MDM) uses it to populate an MDM Document. If the metadata is stored in another format, the MDSC populates an MDM Document.

When you connect using an MDM Document, the CDM uses it to resolve variable names. This means that the MDM Document defines the schema of the virtual tables returned by the CDM. This has a number of advantages:

Although the case data generally contains storage for every variable created in all of the metadata versions, using the MDM Document means that the CDM returns only those variables that are in the version or versions selected for the connection. The CDM uses the variable instances collection in the MDM Document (Document.Variables) to resolve which variables are required. When you connect to the case data without an MDM Document, variables are returned regardless of the version.
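
The version filtering described above can be pictured with a small conceptual sketch. It does not use the real MDM or CDM APIs: the `Variable` class, the `visible_columns` function, and the sample variable names are all hypothetical, and the model assumes only that each variable records the metadata versions in which it exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical stand-in for a variable instance in an MDM Document."""
    name: str
    versions: frozenset  # metadata versions in which the variable exists


def visible_columns(variables, selected_versions):
    """Return the names of variables present in at least one selected version.

    Passing selected_versions=None models a connection without an MDM
    Document: every stored variable is returned, regardless of version.
    """
    if selected_versions is None:
        return [v.name for v in variables]
    wanted = set(selected_versions)
    return [v.name for v in variables if v.versions & wanted]


# Hypothetical case data that has storage for variables from two versions.
variables = [
    Variable("age", frozenset({"1", "2"})),
    Variable("income", frozenset({"2"})),
    Variable("old_q5", frozenset({"1"})),
]

print(visible_columns(variables, ["2"]))  # ['age', 'income']
print(visible_columns(variables, None))   # ['age', 'income', 'old_q5']
```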

The variable names are defined by the MDM Document regardless of the variable names in the underlying data storage. For example, IBM SPSS Statistics .sav files have different variable naming rules from the MDM. When you export UNICOM Intelligence data to an IBM SPSS Statistics .sav file, alternative variable names (called aliases) are created for use in the .sav file and stored in the .mdd file. When you subsequently connect to that .sav file using the same .mdd file (and DataSource object), the CDM returns the familiar variable names from the MDM Document. However, if you connect to the .sav file without the .mdd file, the CDM returns the shorter variable names used in the .sav file.
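
As an illustration of alias resolution, the sketch below maps storage names back to metadata names. It is a conceptual model only, not the product's implementation: the `ALIASES` table, the `column_names` function, and the variable names are hypothetical, and the alias format shown is an assumption.

```python
# Hypothetical alias table, as it might be recorded in the .mdd file:
# MDM variable name -> shorter alias used in the .sav file.
ALIASES = {
    "satisfaction_with_service_overall": "satisf_1",
    "visits_per_year": "visits_1",
}


def column_names(sav_columns, aliases=None):
    """Map .sav column names back to MDM names when aliases are available.

    With aliases=None (no .mdd file), the storage names are returned as-is.
    """
    if aliases is None:
        return list(sav_columns)
    reverse = {alias: name for name, alias in aliases.items()}
    # Columns without a recorded alias keep their storage name.
    return [reverse.get(col, col) for col in sav_columns]


sav_columns = ["satisf_1", "visits_1", "age"]

print(column_names(sav_columns, ALIASES))
print(column_names(sav_columns))
```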

Not all variables defined in the metadata have actual storage allocated in the case data. In particular, variables derived from other variables using an expression do not have storage allocated for them. (These variables are sometimes called dynamically derived variables.) The CDM determines whether a variable is derived and, if so, extracts the derivation expression and creates a column in the virtual table as if the variable existed in the case data. When you connect to the case data without an MDM Document, dynamically derived variables are not available.
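
The idea of a column that is computed rather than stored can be sketched as follows. This is a conceptual model under stated assumptions, not the CDM's expression engine: the `build_row` function is hypothetical, and a Python function stands in for the derivation expression held in the metadata.

```python
def build_row(stored_row, derived):
    """Return a virtual-table row: stored values plus derived columns.

    `derived` maps a variable name to a function of the stored row. The
    function stands in for the derivation expression in the metadata.
    """
    row = dict(stored_row)
    for name, expression in derived.items():
        row[name] = expression(stored_row)
    return row


# Hypothetical stored values; "bmi" has no storage of its own.
stored = {"height_cm": 180, "weight_kg": 81}
derived = {
    "bmi": lambda r: round(r["weight_kg"] / (r["height_cm"] / 100) ** 2, 1),
}

with_metadata = build_row(stored, derived)  # includes the derived column
without_metadata = build_row(stored, {})    # no metadata: stored columns only

print(with_metadata)
print(without_metadata)
```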

The Provider attempts to keep the CDM and MDM synchronized when an ALTER TABLE statement is used. For example, when a column is added, the Provider adds a variable to the MDM Document, and when a column is dropped, the Provider deletes the corresponding variable from the MDM Document. However, this requires the MDM Document to be opened in read/write mode and the latest version of the document to be unlocked.
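
The synchronization conditions can be modeled with a short sketch. Everything here is hypothetical (the `MdmDocument` class, its attributes, and the `alter_table` function), and the behavior when the conditions are not met, leaving the metadata unchanged and returning False, is an assumption rather than documented behavior.

```python
class MdmDocument:
    """Hypothetical stand-in for an MDM Document."""

    def __init__(self, variables, read_write, latest_version_locked):
        self.variables = list(variables)
        self.read_write = read_write
        self.latest_version_locked = latest_version_locked

    def can_sync(self):
        # Both conditions named in the text must hold.
        return self.read_write and not self.latest_version_locked


def alter_table(doc, action, column):
    """Mirror an ADD or DROP column in the metadata.

    Returns True if the document could be updated, False otherwise.
    """
    if action not in ("ADD", "DROP"):
        raise ValueError(f"unsupported action: {action}")
    if not doc.can_sync():
        return False  # assumption: the metadata is left unchanged
    if action == "ADD" and column not in doc.variables:
        doc.variables.append(column)
    elif action == "DROP" and column in doc.variables:
        doc.variables.remove(column)
    return True


editable = MdmDocument(["age"], read_write=True, latest_version_locked=False)
locked = MdmDocument(["age"], read_write=True, latest_version_locked=True)

print(alter_table(editable, "ADD", "region"), editable.variables)
print(alter_table(locked, "ADD", "region"), locked.variables)
```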

Although this is not currently implemented, the plan is for the MDM to maintain the permissions for access to each variable. The CDM will then need to check access permissions before making an SQL operation available to a user connection. Possible user privileges on variables will be SELECT, INSERT, UPDATE, and DELETE. Privileges at a table level will include CREATE and ALTER.

When a variable is present in the MDM Document but not in the case data, the CDM creates a NULL column. Conversely, a variable that is present in the case data but not in the MDM Document will not be visible when you connect using an MDM Document. These scenarios typically arise when variables are added and deleted in successive versions of the metadata.
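
The two mismatch cases above can be sketched in a few lines. This is a conceptual model only: the `virtual_row` function and the sample names are hypothetical, and Python's None stands in for SQL NULL.

```python
def virtual_row(mdm_variables, case_row):
    """Build one virtual-table row whose columns follow the MDM Document.

    Variables missing from the case data become None (NULL). Case data
    columns that are not in the MDM Document are left out.
    """
    return {name: case_row.get(name) for name in mdm_variables}


# Hypothetical metadata and case data that have drifted apart.
mdm_variables = ["age", "region"]      # "region" has no storage yet
case_row = {"age": 34, "legacy_q": 1}  # "legacy_q" was removed from the metadata

print(virtual_row(mdm_variables, case_row))  # {'age': 34, 'region': None}
```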

If an MDM Document has not been loaded for the CDM connection, variable name resolution occurs in the CDSC, version information is not available, and the superset of variables across all versions is returned if all columns are requested.