Case data filter
In a large study, you may want to export the case data for each region separately, so that it can be analyzed separately. You can achieve this by defining a case data filter that selects the cases for which the region variable has a specified value.
In the DMS file, you define a case data filter of this type using a WHERE clause in the select query defined for the input data source. For example, this InputDataSource section defines a case data filter (shown in red) that selects cases for which the South category is selected for the region variable:
InputDataSource(myInputDataSource, "My input data source")
ConnectionString = ...
SelectQuery = "SELECT * FROM vdata WHERE region = {South}"
End InputDataSource
For more information on defining case data filters using the WHERE clause, see
Basic SQL queries.
Tips
▪You can use the UNICOM Intelligence Professional Metadata Viewer to view the variable and category names. For more information, see
Using the metadata viewer.
▪If you are working with UNICOM Intelligence Interviewer - Server data and want to filter out test, active, and timed out records, or select records based on the date they were collected, you may find it easier to set up the case data filter in WinDMSRun. For more information, see
WinDMSRun window: Case Selection.
When cleaning data in an ongoing study, you may want to set up a case data filter to include only cases that have not been cleaned before. One way of doing this is to use a global SQL variable. For more information, see
GlobalSQLVariables section. In your cleaning script, you may also want to specify that some cases are to be excluded; for example, because they contain questionable responses. In the DMS file, you can define this type of case data filter using the
dmgrJob.DropCurrentCase method in the OnNextCase Event section. Here is an example:
Event(OnNextCase)
If age < 0 Then dmgrJob.DropCurrentCase() .
.
.
End Event
You can also use an update query in the OutputDataSource section to remove case data records from the output data source. For example, the following snippet shows how you could use update queries in two OutputDataSource sections to split the case data between two output data sources: one of which stores the clean data and the other the dirty data.
OutputDataSource(Clean, "My clean data")
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrDataFileDsc; _
Location=C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\CleanData.ddf"
MetaDataOutputName = "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\CleanData.mdd"
UpdateQuery="DELETE FROM vdata _
WHERE DataCleaning.Status = {NeedsReview}"
End OutputDataSource
OutputDataSource(Dirty, "My dirty data")
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrDataFileDsc; _
Location=C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\DirtyData.ddf"
MetaDataOutputName = "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\DirtyData.mdd"
UpdateQuery="DELETE FROM vdata _
WHERE NOT DataCleaning.Status.ContainsAny({NeedsReview})"
End OutputDataSource
This example is in the
SplitIntoDirtyAndClean.dms sample. For more information, see
Sample DMS files.
See also