Reading from an SPSS Statistics .sav file
The SPSS Statistics SAV DSC can open an existing .sav file and read its data. It can also read the data in a new .sav file that it has just created and populated, provided that the connection to the new file is still open.
When you use the SPSS Statistics SAV DSC to read a .sav file, you can access the case data in three different ways:
With an MDM document. This can be either the MDM document in an existing .mdd file or an MDM document that has been created "on the fly" using the MDM Object Model.
By generating an MDM document from the .sav file. That is, you use the MDSC capability of the SPSS Statistics SAV DSC to read the metadata in the .sav file and present the metadata as an MDM document.
With no metadata.
Which method you use fundamentally affects the way that the data is presented.
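As a rough illustration, the three methods might correspond to three different connection strings. The following Python sketch only assembles the strings; it does not open a connection. The property names Provider, Data Source, Location, Initial Catalog, and MR Init MDSC, and the provider name mrOleDB.Provider.2, are assumptions that should be checked against the Connection properties topic. The function name and file paths are hypothetical.

```python
def sav_connection_string(sav_path, mode, mdd_path=None):
    """Assemble a connection string for one of the three access methods.

    mode is "mdd" (existing MDM document), "generated" (MDM document
    generated from the .sav file), or "none" (no metadata).
    All property names below are assumptions, not confirmed by this topic.
    """
    parts = [
        ("Provider", "mrOleDB.Provider.2"),  # assumed provider name
        ("Data Source", "mrSavDsc"),         # the SPSS Statistics SAV DSC
        ("Location", sav_path),              # the .sav file to read
    ]
    if mode == "mdd":
        if not mdd_path:
            raise ValueError("mode 'mdd' requires mdd_path")
        # Metadata comes from an existing .mdd file.
        parts.append(("Initial Catalog", mdd_path))
    elif mode == "generated":
        # Metadata is generated from the .sav file by the MDSC capability.
        parts.append(("Initial Catalog", sav_path))
        parts.append(("MR Init MDSC", "mrSavDsc"))
    elif mode != "none":
        raise ValueError(f"unknown mode: {mode!r}")
    # mode "none": no metadata source is specified at all.
    return ";".join(f"{name}={value}" for name, value in parts)


if __name__ == "__main__":
    for mode in ("mdd", "generated", "none"):
        print(sav_connection_string(r"C:\data\survey.sav", mode,
                                    mdd_path=r"C:\data\survey.mdd"))
```

Keeping the choice of method in one place makes it easier to see that only the metadata-related properties change between the three cases.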
Accessing the case data with an MDM document
The SPSS Statistics SAV DSC attempts to present the data as it is defined in the MDM document. This means that users generally do not require knowledge of the structure, names, and category values of the variables in the IBM SPSS Statistics .sav file.
The SPSS Statistics SAV DSC expects the name and location of the .sav file to be stored in the mrSavDsc DataSource object in the MDM document, and problems can occur if this information is missing or incorrect. The MDM document contains the .sav file variable names (aliases and subaliases) and category value mapping information for the mrSavDsc DataSource object. The SPSS Statistics SAV DSC uses this information to display the variables as they are defined in the MDM document.
By default, category values are presented as the MDM unique category values. However, they can optionally be displayed as native values, which are category indexes in the range of 1 to the number of categories in the variable. Native values are not necessarily the actual values stored in the .sav file. For multiple dichotomy sets, the native value identifies the particular member variable of the set. For other IBM SPSS Statistics variables that the SPSS Statistics SAV DSC maps to MDM categorical variables, the native value identifies the value label that represents the category. If you want to use the native values, set the MR Init Category Values connection property (see Connection properties) to 1.
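The idea of native values as 1-based category indexes can be sketched in Python. This is an illustration only, not the DSC's implementation: the category names are hypothetical, and the sketch assumes that the index follows the order in which the categories are listed. The helper that appends MR Init Category Values=1 assumes a semicolon-separated connection string.

```python
def native_values(categories):
    """Map each category to an index from 1 to the number of categories.

    Assumption: the index follows the order of the categories as given.
    """
    return {name: index for index, name in enumerate(categories, start=1)}


def with_native_values(connection_string):
    """Append the property that requests native values.

    Assumes a semicolon-separated "name=value" connection string.
    """
    return connection_string.rstrip(";") + ";MR Init Category Values=1"


if __name__ == "__main__":
    # Hypothetical categories of a single-response variable.
    print(native_values(["Yes", "No", "DontKnow"]))
    print(with_native_values("Data Source=mrSavDsc"))
```

Note that the indexes produced here are positions, which is why they need not match the values actually stored in the .sav file.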
When there is a mismatch between the IBM SPSS Statistics variable type and the MDM variable type, the SPSS Statistics SAV DSC attempts to convert the IBM SPSS Statistics variable value to the MDM variable type. If this is not possible, the action taken depends on the setting of the MR Init Allow Dirty connection property (see Connection properties). When MR Init Allow Dirty is True, the SPSS Statistics SAV DSC displays null values rather than generating an error message.
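The effect of MR Init Allow Dirty on a failed conversion can be pictured with a small Python sketch. It models only the rule described above (null instead of an error); the function name is hypothetical, Python's built-in type constructors stand in for the DSC's real conversion logic, and the assumption is that an error is raised when the property is not True.

```python
def present_value(raw_value, target_type, allow_dirty):
    """Convert raw_value to target_type, mimicking the Allow Dirty rule.

    If the conversion fails, return None (null) when allow_dirty is True;
    otherwise let the error propagate (assumed behavior when it is False).
    """
    try:
        return target_type(raw_value)
    except (TypeError, ValueError):
        if allow_dirty:
            return None
        raise


if __name__ == "__main__":
    print(present_value("12", int, allow_dirty=False))   # converts cleanly
    print(present_value("abc", int, allow_dirty=True))   # becomes null
```

With allow_dirty set to False, the second call would raise a ValueError instead of returning None.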
Generating an MDM document from the .sav file
The SPSS Statistics SAV DSC displays the variables as they are defined in the .sav file. Category values are always presented as native values.
This method is similar to accessing the case data with an MDM document, because the MDSC capability of the SPSS Statistics SAV DSC generates an MDM document in your computer's memory. The difference is that the generated MDM document is synchronized with the metadata in the .sav file and cannot contain any incorrect or redundant information.
However, if you have a valid .mdd file that was created from the .sav file using the SPSS Statistics SAV DSC, it is faster to connect using the .mdd file, and it also allows you to use properties and settings (see Properties and settings used by the SPSS Statistics SAV DSC) that can be set only on MDM variables.
Accessing the case data with no metadata
Reading the case data in a .sav file without specifying a metadata source is significantly faster than accessing the case data with an MDM document or generating an MDM document from the .sav file. Variable names are read directly from the .sav file and are not converted to valid MDM variable names. Any modification of the variable names that occurs is related to the conversion from multibyte to Unicode, the success of which depends on the user's locale. (This conversion also happens when you use an existing MDM document or generate an MDM document from the .sav file.) In addition, it is not possible to use the MDM mapped category values and some of the value mappings can differ from when a metadata source is used.
Connecting without a metadata source is not recommended for use with tools that expect valid MDM variable names. It is primarily useful for third-party programs that need low-level access to the CDSC interface.
See also
Variable definitions when reading from a .sav file
Missing values when reading from a .sav file
Tabulating IBM SPSS Statistics variables in UNICOM Intelligence Reporter
Troubleshooting when reading from a .sav file
SPSS Statistics SAV DSC