Example 3: More on cleaning single response data
Example 1: More than one response to a single response question shows how to deal with case data that contains multiple responses to single response questions. That example cleaned the data for each question individually by name. In practice, however, you often want to apply the same cleaning rules to many of the questions in a survey. This topic shows two ways of doing this, followed by an example of alternately selecting the highest and lowest response when more than one response has been chosen as the answer to a rating question.
Using a loop
The following example uses a For Each...Next loop to iterate through all of the single response questions included in the job, testing whether they have more than one response, and if they do, writing the details to a report file and setting the DataCleaning.Status system variable to Needs review. (This example assumes that the report file is set up in the OnJobStart Event section and closed in the OnJobEnd Event section as shown in the previous example.)
Event(OnNextCase, "Clean the data")
    Dim myQuestion, strDetails, strError
    On Error Goto ErrorHandler

    strDetails = CText(Respondent.Serial)

    For Each myQuestion in dmgrJob.Questions
        If myQuestion.QuestionType = QuestionTypes.qtSimple Then
            If myQuestion.Response.DataType = mr.Categorical Then
                If myQuestion.Validation.MaxValue = 1 Then
                    If myQuestion.AnswerCount() > 1 Then
                        strDetails = strDetails + myQuestion.QuestionFullName + " needs checking "
                        DataCleaning.Status = {NeedsReview}
                    End If
                End If
            End If
        End If
    Next

    dmgrGlobal.mytextfile.WriteLine(strDetails)

    Exit ' Success

ErrorHandler:
    strError = CText(Err.LineNumber) + ": " + Err.Description
    dmgrLog.LogScript_2(strError)
    Debug.Log("ERROR: " + strError)
End Event

Logging(myLog)
    Group = "DMGR"
    Path = "c:\temp"
    Alias = "Tester"
    FileSize = 500
End Logging
The first line in the For Each loop tests the Question.QuestionType property to check that the question is a simple question and not a complex question like a loop or a compound. When using a For Each with the Questions object, you must always include a test for complex questions or you risk getting an error. For more information, see Example 6: Advanced cleaning example.
The next line tests the Response.DataType property to check that the response is Categorical. For more information, see the Data Management Object model in DMOM scripting reference.
This example includes error handling that writes the line number and description of any error to the log file. For more information, see Logging section.
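The chain of tests in the loop can be easier to follow when read as a plain filter. The following is a minimal Python sketch of that logic only. It is not mrScriptBasic and does not use the Data Management Object Model: the Question class, its fields, and the sample questions are hypothetical stand-ins, with max_responses playing the role of Validation.MaxValue.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """Hypothetical stand-in for a survey question (not the product's Question object)."""
    name: str
    is_simple: bool        # False for complex questions such as loops and compounds
    is_categorical: bool
    max_responses: int     # plays the role of Validation.MaxValue
    answers: list = field(default_factory=list)


def find_questions_to_review(questions):
    """Return the names of single response questions that hold more than one answer."""
    flagged = []
    for q in questions:
        # Same order of tests as the script: simple, categorical, single response.
        if not q.is_simple:
            continue
        if not q.is_categorical:
            continue
        if q.max_responses != 1:
            continue
        if len(q.answers) > 1:
            flagged.append(q.name)
    return flagged


# Invented sample case: only "expect" is a simple, categorical, single response
# question with more than one answer.
case = [
    Question("gender", True, True, 1, ["male"]),
    Question("visits", True, False, 1, [3]),
    Question("expect", True, True, 1, ["general", "dinosaurs"]),
    Question("museums", True, True, 5, ["a", "b"]),
    Question("rating", False, True, 1, ["x", "y"]),
]
print(find_questions_to_review(case))  # ['expect']
```

Skipping complex questions first matters in the sketch for the same reason as in the script: the later tests only make sense for a question that holds a response of its own.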
Note This example is provided as a sample DMS file (called Cleaning2.dms) that is installed with the UNICOM Intelligence Developer Documentation Library. For more information, see Sample DMS files.
Using object collection iteration and a subroutine
The Museum example data set contains some categorical grid questions. For example, the rating grid asks respondents to say how interested they were in the various galleries in the museum. The following table shows the possible responses.
Category label                 Category name
Not at all interested          Not_at_all_interested_1
Not particularly interested    Not_particularly_interested_2
No opinion                     No_opinion_3
Slightly interested            Slightly_interested_4
Very interested                Very_interested_5
The grid has a multiple response categorical question (called Column) that is asked once for each gallery. Each gallery is a category in the controlling category list, so the grid can be considered a loop that is iterated once for each category in that list.
This example uses the mrScriptBasic Object collection iteration feature to set the default response for the column question to No opinion and validate all of the iterations of the grid.
The Validate method checks the maximum value set for each iteration of the grid and replaces the selected responses with the default response only if the number of responses is greater than the maximum value. However, the question is a multiple response question with a maximum value of 5, so the maximum value is changed by setting the Validation.MaxValue property to 1. This changes the maximum value in the script only and does not change the maximum value in the input or output metadata.
Event(OnJobStart, "Do the set up")
    rating[..].Column.Response.Default = {No_opinion_3}
    rating[..].Column.Validation.MaxValue = 1
End Event

Event(OnNextCase, "Clean the data")
    On Error Goto ErrorHandler
    Dim strError

    rating[..].Column.AssignDefaultValidation()

    Sub AssignDefaultValidation(Iteration)
        If Iteration.AnswerCount() > 1 Then
            Iteration.Validation.Validate(ValidateActions.vaAssignDefault)
        End If
    End Sub

    Exit ' Success

ErrorHandler:
    strError = CText(Err.LineNumber) + ": " + Err.Description
    dmgrLog.LogScript_2(strError)

End Event
Note that this example contains a Sub procedure and that, unlike in Visual Basic, the Question object must be passed to the subroutine as a parameter. This is because, in mrScriptBasic, variables (including global variables) that are available to the main script block are not visible to the functions and subroutines that the script block calls.
This example assumes that the grid is available as a Question object called rating. This is true if SELECT * FROM vdata is used (for example, because the InputDataSource section does not include a select query). However, if a select query is included, it must include two or more of the variable instances that belong to the grid. (You cannot specify the grid itself in the select query, only the variable instances that relate to the grid. In this grid there is one variable instance for each iteration of the grid.) For example:
SelectQuery = SELECT rating[{Fossils}].Column, rating[{Birds}].Column FROM vdata
Note This example and the next one are in the Cleaning3.dms sample DMS file that is installed with the UNICOM Intelligence Developer Documentation Library. For more information, see Sample DMS files.
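The effect of the assign-default validation described in this section can be modeled in a few lines. This is a hedged Python sketch of the rule as the text states it (replace the answers with the default only when there are more answers than the maximum), not of the Validate method itself. The function name and the sample data are hypothetical. The category names come from the table and the select query shown earlier.

```python
def assign_default_if_over_max(answers, max_responses, default):
    """Return the answers unchanged if they fit the limit, otherwise only the default."""
    if len(answers) > max_responses:
        return [default]
    return list(answers)


# Hypothetical case data: one list of answers per grid iteration (gallery).
rating = {
    "Fossils": ["Very_interested_5"],
    "Birds": ["Slightly_interested_4", "Very_interested_5"],
}

# A maximum of 1 mirrors Validation.MaxValue = 1 in the script.
cleaned = {
    gallery: assign_default_if_over_max(answers, 1, "No_opinion_3")
    for gallery, answers in rating.items()
}
print(cleaned)
```

With the original maximum of 5, neither iteration would be changed, which is why the script lowers the maximum before validating.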
Selecting the highest and lowest response alternately
When two or more responses have been selected in answer to a rating question, you may sometimes want to use an alternating algorithm: the first time this scenario is encountered, you select the response that is higher in the scale and delete any lower ones, and the next time you do the reverse. The next example does this for one of the iterations of the rating_ent categorical grid, which has similar responses to the rating grid described above. The example uses the SortAsc and SortDesc functions to sort the responses into ascending and descending order and select the first response in the list. A global variable is used to keep track of which one was used last time.
Event(OnJobStart, "Do the set up")
    Dim Switch
    ' Start with True so that the value matches the True/False tests in OnNextCase
    Switch = True
    dmgrGlobal.Add("Switch")
    Set dmgrGlobal.Switch = Switch
End Event
 
Event(OnNextCase, "Clean the data")
    If rating_ent[{Fossils}].Column.AnswerCount() > 1 Then
        If dmgrGlobal.Switch = True Then
            rating_ent[{Fossils}].Column = rating_ent[{Fossils}].Column.SortAsc(1)
            dmgrGlobal.Switch = False
        Else
            rating_ent[{Fossils}].Column = rating_ent[{Fossils}].Column.SortDesc(1)
            dmgrGlobal.Switch = True
        End If
    End If
End Event
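The alternating rule is independent of the scripting language, so it can be modeled outside the product. The following Python sketch rests on stated assumptions: the SCALE list reuses the category names from the rating grid (the text says only that rating_ent has similar responses), the AlternatingPicker class is hypothetical, and the sketch takes the highest response first, as the description above says. It does not model how SortAsc and SortDesc order categories.

```python
# Assumed scale, lowest to highest, borrowed from the rating grid table.
SCALE = [
    "Not_at_all_interested_1",
    "Not_particularly_interested_2",
    "No_opinion_3",
    "Slightly_interested_4",
    "Very_interested_5",
]


class AlternatingPicker:
    """Keeps one answer per multi-answer case, alternating between highest and lowest."""

    def __init__(self, take_highest_first=True):
        # Plays the role of the global switch that persists across cases.
        self.take_highest_next = take_highest_first

    def clean(self, answers):
        if len(answers) <= 1:
            # Nothing to resolve, so the switch is left untouched.
            return list(answers)
        ranked = sorted(answers, key=SCALE.index)
        chosen = ranked[-1] if self.take_highest_next else ranked[0]
        self.take_highest_next = not self.take_highest_next
        return [chosen]


picker = AlternatingPicker()
first = picker.clean(["No_opinion_3", "Very_interested_5"])
second = picker.clean(["Slightly_interested_4"])
third = picker.clean(["Slightly_interested_4", "Not_at_all_interested_1"])
print(first, second, third)
```

As in the script, the switch changes only when a case actually has more than one response, so single-answer cases do not disturb the alternation.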
Requirements
UNICOM Intelligence Professional
See also
Data cleaning examples