SPSS SAV DSC
Introduction
The SPSS SAV DSC reads data from, and writes data to, an SPSS *.sav file. It has these features:
SPSS SAV DSC features
Feature                        SPSS SAV DSC
Storage type                   Single file
Read                           Yes
Write/update                   Yes
Unbounded loops (levels)       No
Native WHERE clause support    No
Compressed format              No
Multiple user read             Yes
Multiple user write            No
Data storage
The SPSS SAV DSC stores data in an SPSS *.sav file by using the SPSS I/O DLL. For more information about how UNICOM Intelligence data is written to a *.sav file, see Writing to an SPSS Statistics .sav file.
The SPSS SAV DSC overall storage file size (in MB) is as follows:
Storage file size (MB)
skidemo             UNICOM Intelligence database  UNICOM Intelligence file  Quantum  Quanvert  SPSS
Total size          19                            2.18                      0.9      3.25      1.7
Backup              18.1                          0.65                      *        2         *
Like the Quantum CDSC, the SAV format is not suited to large, sparsely populated data sets. However, the SPSS SAV DSC has an acceptable storage size for small to medium-sized data sets (1.7 MB for skidemo, compared with 19 MB for the UNICOM Intelligence database).
Read performance
The following table provides the SPSS SAV DSC read performance (measured in seconds):
skidemo             UNICOM Intelligence database  UNICOM Intelligence file  Quantum  Quanvert  SPSS
1 variable          0.062                         0.062                     0.109    0.031     0.062
5 variables         0.11                          0.063                     0.204    0.078     0.078
All variables       1.078                         0.562                     0.843    0.672     0.61
The SPSS SAV DSC has acceptable read performance for the entire data set, which makes it a suitable input data format for data management activities.
The SPSS SAV DSC also has acceptable single-variable read performance for ad hoc tabulation scenarios.
Write performance
The following table provides SPSS SAV DSC write performance (measured in records per second):
skidemo             UNICOM Intelligence database  UNICOM Intelligence file  Quantum  Quanvert  SPSS
Records per second  80                            1897                      1523     *         1971
The following table provides update performance (measured in seconds) for every value in a single weight column:
skidemo             UNICOM Intelligence database  UNICOM Intelligence file  Quantum  Quanvert  SPSS
1 variable          0.719                         0.219                     0.282    *         0.578
The SPSS SAV DSC has excellent write performance when exporting to the *.sav file format: at 1971 records per second, it is the fastest of the DSCs measured on skidemo.
The SPSS SAV DSC has acceptable update performance when used with small to medium-sized data sets. Update performance is poor against larger data sets.
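As a rough illustration of what the write rates above mean in practice, the following Python sketch converts a record count into an estimated export time. The rates are copied from the skidemo table. The function name and the assumption that throughput stays constant as the data set grows are illustrative only, so treat the results as order-of-magnitude estimates rather than measurements.

```python
# Write rates (records per second) copied from the skidemo write-performance table.
# Quanvert is omitted because the table reports no figure for it ("*").
WRITE_RATE_RECORDS_PER_SEC = {
    "UNICOM Intelligence database": 80,
    "UNICOM Intelligence file": 1897,
    "Quantum": 1523,
    "SPSS": 1971,
}


def estimated_write_seconds(record_count: int, dsc: str) -> float:
    """Estimate export time in seconds.

    ASSUMPTION: throughput is constant regardless of data set size, which
    the source does not claim. The name of this helper is hypothetical.
    """
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    return record_count / WRITE_RATE_RECORDS_PER_SEC[dsc]


if __name__ == "__main__":
    # Compare the estimated time to write 100,000 records with each DSC.
    for name in WRITE_RATE_RECORDS_PER_SEC:
        print(f"{name}: {estimated_write_seconds(100_000, name):.1f} s")
```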
Usage recommendations
The recommended SPSS SAV DSC usage is as follows:
Data export to other formats. The SPSS SAV DSC has acceptable read performance for exporting to other DSCs.
Tabulation. The SPSS SAV DSC has acceptable performance when used for most tabulation scenarios.
Portable data format. The SPSS *.sav format is supported by UNICOM Intelligence, IBM SPSS, and some other third-party products. Support across the different product suites, and the relatively small storage size, make *.sav files a good portable data format.
Low-volume, single-user data collection. While not suitable for higher-volume data collection (more than 1000 cases), the SPSS SAV DSC can be used for disconnected UNICOM Intelligence applications, such as CAPI or Data Entry.
The SPSS SAV DSC is not suited to the following scenarios:
Data management. Although copying data to the SPSS SAV DSC is fast, the DSC exhibits poor performance when updating variables. This means that the creation of weights, and the addition of coded values, can be slow when using the SPSS SAV DSC. The poor update performance is due to the SPSS SAV DSC having to rewrite the entire *.sav file each time weights are created, or each time an UPDATE statement is run.
Unbounded levels data. The SPSS SAV DSC does not support unbounded loops and cannot be used to store levels data.
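The cost of the whole-file rewrite described under Data management can be sketched with a toy model in Python. It assumes that every UPDATE statement triggers exactly one full rewrite of the file, as the text describes, and it ignores any fixed per-statement overhead. The function names are hypothetical and are not part of any UNICOM Intelligence API.

```python
def records_written_by_updates(record_count: int, update_statements: int) -> int:
    """Total records written when each UPDATE rewrites the whole file.

    ASSUMPTION: one full rewrite per statement, no other overhead.
    """
    if record_count < 0 or update_statements < 0:
        raise ValueError("arguments must be non-negative")
    return record_count * update_statements


def rewrite_overhead_factor(record_count: int, records_changed: int) -> float:
    """Records written per record actually changed by a single statement."""
    if records_changed <= 0:
        raise ValueError("records_changed must be positive")
    return record_count / records_changed


if __name__ == "__main__":
    # Five UPDATE statements against a 10,000-record file.
    print(records_written_by_updates(10_000, 5))
    # One statement that changes only 100 of those 10,000 records.
    print(rewrite_overhead_factor(10_000, 100))
```

Under this model the work grows with the product of file size and statement count, which is consistent with update performance being acceptable on small data sets and poor on larger ones.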
Issues known to impact performance
The following issues are known to impact SPSS SAV DSC performance:
Disabling scan for categories. By default, the SPSS SAV DSC scans the entire data set to find extra values that are not defined as categories. The scan process can be time-consuming for large data sets and can be disabled by creating an *.ini file for the *.sav file that includes the following line:
ScanForCategories = 0
Note: Setting ScanForCategories = 0 forces the SPSS SAV DSC to silently ignore any unlabeled values in the case data.
Generating metadata in advance. When you do not use an .mdd file, the SPSS SAV DSC generates an MDM document for the .sav file. When the file contains many variables, the MDM generation process can take a long time. To improve load times in UNICOM Intelligence Survey Tabulation and UNICOM Intelligence Reporter, create an .mdd file for the .sav file in advance.
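The category-scan setting described above can be applied with a short script. The following Python sketch writes an .ini file that contains the ScanForCategories line next to a given .sav file. The file naming used here (the .sav file name with .ini appended) and the absence of a section header are assumptions, not statements from this guideline. Confirm the expected name, location, and layout in the product documentation before relying on it. The function name is hypothetical.

```python
import tempfile
from pathlib import Path


def write_scan_ini(sav_path: str, ini_suffix: str = ".ini") -> Path:
    """Create an .ini file beside the .sav file that disables the category scan.

    ASSUMPTION: the .ini file name is the .sav file name plus `ini_suffix`
    (for example "survey.sav.ini"). The name and location that the
    SPSS SAV DSC actually expects may differ.
    """
    sav = Path(sav_path)
    ini = sav.with_name(sav.name + ini_suffix)
    # The setting line is taken verbatim from the guideline text.
    ini.write_text("ScanForCategories = 0\n", encoding="utf-8")
    return ini


if __name__ == "__main__":
    # Demonstrate in a temporary directory so that no real data is touched.
    with tempfile.TemporaryDirectory() as tmp:
        created = write_scan_ini(str(Path(tmp) / "survey.sav"))
        print(created.name)
        print(created.read_text(encoding="utf-8").strip())
```

Because the setting causes unlabeled values to be ignored silently, it is safest to apply it only to files whose values are known to be fully defined as categories.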
See also: Data Source Components (DSCs)