Creating a categorical derived variable
We've now seen how to create numeric and Boolean dynamically derived variables. But how do you create a categorical derived variable? A categorical variable has more than one category, so how can you describe a derived categorical variable with just a single expression? The answer, quite simply, is that you can't. However, you can set up an expression for each category in a derived categorical variable.
The next example creates a dynamically derived categorical variable called YoungPeople. It has two categories, YoungMen and YoungWomen. It uses the age and gender variables so that YoungMen is the response for respondents whose gender is male and age is between 11 and 20, and YoungWomen is the response for respondents whose gender is female and age is between 11 and 20.
Here's the mrScriptBasic code:
' The output metadata document (.mdd) file
#define OUTPUTMDM "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\museum_dd_expression2.mdd"
' Copy the museum.mdd sample file so that we
' do not update the original file...
Dim fso, f
Set fso = CreateObject("Scripting.FileSystemObject")
fso.CopyFile("C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Data\Data Collection File\museum.mdd", _
    OUTPUTMDM, True)
' Make sure that the read-only attribute is not set
Set f = fso.GetFile(OUTPUTMDM)
If f.Attributes.BitAnd(1) Then
    f.Attributes = f.Attributes - 1
End If
Dim MyDocument, MyDynamicallyDerivedVariable, MyExpression
' Create the MDM object and open the Museum .mdd file
' in read-write mode
Set MyDocument = CreateObject("MDM.Document")
MyDocument.Open(OUTPUTMDM, , MDMLib.openConstants.oREADWRITE)
' Add the derived variable to the MDM Document
Set MyDynamicallyDerivedVariable = _
    MyDocument.CreateVariable("YoungPeople", "Young People")
MyDocument.Fields.Add(MyDynamicallyDerivedVariable)
MyDynamicallyDerivedVariable.DataType = mr.Categorical
' For categorical variables there's usually more than one category, so
' use Expressions and not Expression
MyDynamicallyDerivedVariable.SourceType = _
    MDMLib.SourceTypeConstants.sExpressions
' Using a subroutine, set up the derived categories
MyExpression = "(age = {E1116_years} AND gender = {Male}) OR " + _
    "(age = {E1720_years} AND gender = {Male})"
CreateDerivedCategory(MyDocument, MyDynamicallyDerivedVariable, _
    "YoungMen", "Young Men", MyExpression)
MyExpression = "(age = {E1116_years} AND gender = {Female}) OR " + _
    "(age = {E1720_years} AND gender = {Female})"
CreateDerivedCategory(MyDocument, MyDynamicallyDerivedVariable, _
    "YoungWomen", "Young Women", MyExpression)
' Save and close the MDM Document
MyDocument.Save()
MyDocument.Close()
Exit
Sub CreateDerivedCategory(Document, _
        DynamicallyDerivedVariable, _
        CategoryName, CategoryLabel, CategoryExpression)
    Dim MyCategory
    ' Create a new category
    Set MyCategory = Document.CreateElement(CategoryName, CategoryLabel)
    MyCategory.Expression = CategoryExpression
    DynamicallyDerivedVariable.Elements.Add(MyCategory)
End Sub
Notice that the CreateDerivedCategory subroutine is called twice: first to create the YoungMen category, and then to create the YoungWomen category.
After running this example code, you can use DM Query to see this new dynamically derived variable in action. For step-by-step instructions on setting up DM Query to run the queries, see How to run the example queries in DM Query using the museum sample, but make sure that you select museum_dd_expression2.mdd instead of museum.mdd.
Now that you are running DM Query and have opened the modified .mdd file, you are ready to start. Enter this query in the SQL box:
select serial, age, gender, YoungPeople from vdata
DM Query returns four columns of data. Here are the results for the first 12 cases:
serial  age            gender    YoungPeople
1       {e3544_years}  {male}    {}
2       {e2534_years}  {female}  {}
3       {e2534_years}  {female}  {}
4       {e2534_years}  {male}    {}
5       {e3544_years}  {female}  {}
6       {e1720_years}  {male}    {youngmen}
7       {e2534_years}  {female}  {}
8       {e3544_years}  {male}    {}
9       {e4554_years}  {female}  {}
10      {e2534_years}  {female}  {}
11      {e2124_years}  {male}    {}
12      {e1116_years}  {female}  {youngwomen}
Note that case 6 is a young man and case 12 is a young woman. YoungPeople is empty ({}) for the other ten cases because neither category expression is True for them. You could also use SQL queries to determine the SUM of YoungPeople, and so on, because from a Case Data Model perspective YoungPeople is a variable like any other variable.
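To make the per-category evaluation concrete, here is a rough Python sketch that reproduces the YoungPeople column for the twelve cases shown above. It is not mrScriptBasic and does not use the MDM API: it only models the idea that each category has its own Boolean expression and that a case receives every category whose expression is True. The names CASES, CATEGORY_EXPRESSIONS, and derive are invented for this illustration.

```python
# Illustrative model only; the real evaluation is done by the Data Model
# when the query runs. All names here are hypothetical.
YOUNG_AGES = {"e1116_years", "e1720_years"}

# One Boolean expression per derived category, mirroring the two
# expressions passed to CreateDerivedCategory in the script.
CATEGORY_EXPRESSIONS = {
    "youngmen": lambda age, gender: age in YOUNG_AGES and gender == "male",
    "youngwomen": lambda age, gender: age in YOUNG_AGES and gender == "female",
}

# (serial, age, gender) for the first 12 cases in the results above.
CASES = [
    (1, "e3544_years", "male"),
    (2, "e2534_years", "female"),
    (3, "e2534_years", "female"),
    (4, "e2534_years", "male"),
    (5, "e3544_years", "female"),
    (6, "e1720_years", "male"),
    (7, "e2534_years", "female"),
    (8, "e3544_years", "male"),
    (9, "e4554_years", "female"),
    (10, "e2534_years", "female"),
    (11, "e2124_years", "male"),
    (12, "e1116_years", "female"),
]


def derive(age, gender):
    """Return the set of categories whose expression is True for one case."""
    return {
        name
        for name, expression in CATEGORY_EXPRESSIONS.items()
        if expression(age, gender)
    }


RESULTS = {serial: derive(age, gender) for serial, age, gender in CASES}

for serial, categories in RESULTS.items():
    print(serial, "{" + ",".join(sorted(categories)) + "}")
```

Because the two expressions differ only in the gender test, they can never both be True for the same case, which is why this sketch never returns more than one category per case.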
You can use the same technique to create dynamically derived multiple response variables. For example, if we had defined the expressions so that the expression for the first category and the expression for the second category could both evaluate to True for the same case, then we would have defined a dynamically derived multiple response variable. DM Query would return the results of a query on a dynamically derived multiple response variable in the same way as it would for any other multiple response variable.
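As a rough illustration of the multiple response case, the following Python sketch uses two overlapping expressions. It is a model of the idea only, not mrScriptBasic or the MDM API, and the category names young and men are hypothetical; they do not appear in the museum sample.

```python
# Illustrative model only; category names and helper names are hypothetical.
YOUNG_AGES = {"e1116_years", "e1720_years"}

# These expressions overlap: a young male respondent satisfies both.
OVERLAPPING_EXPRESSIONS = {
    "young": lambda age, gender: age in YOUNG_AGES,
    "men": lambda age, gender: gender == "male",
}


def derive_multiple(age, gender):
    """Return a sorted list of every category whose expression is True."""
    return sorted(
        name
        for name, expression in OVERLAPPING_EXPRESSIONS.items()
        if expression(age, gender)
    )


case_6 = derive_multiple("e1720_years", "male")     # young and male
case_12 = derive_multiple("e1116_years", "female")  # young only
case_1 = derive_multiple("e3544_years", "male")     # male only

print(case_6, case_12, case_1)
```

In this model the case with age e1720_years and gender male receives both categories, which is what makes the derived variable behave as a multiple response variable rather than a single response one.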