Common data management task guidelines
The following sections describe common data management tasks that are run in the
OnNextCase event section. The tasks are based on the review results from several different scripts. Many of the tasks can be simplified by using the
Data Management functions. The data management functions offer improved performance over the following scripted examples.
Note UNICOM Intelligence tabulation features reduce the need for many of the following data management tasks.
Copying grids
For reporting purposes, it is sometimes necessary to copy a subset of a grid to a new grid, or to copy an iteration of a nested grid to a top-level grid. In most cases, the destination grid defines the data subset that needs to be copied. To demonstrate a typical example where grid copying is required, the following nested grid asks attribute questions for each brand in the brand list:
BrandAttributes "Brand Attributes" loop
{
use Brands -
} fields -
(
Attributes "Attributes" loop
{
AttributeList use \\.AttributeList -
} fields -
(
Agree "Agree"
categorical [1..1]
{
AgreeList use \\.AgreeList -
};
) expand;
) expand;
For reporting purposes, the only area of concern is tabulating brand 'a' and only for a subset of the attributes. For example:
Brand_a_Attributes "Brand A Attributes" loop
{
AttributeList use \\.AttributeSubList -
} fields -
(
Agree "Agree"
categorical [1..1]
{
AgreeList use \\.AgreeList -
};
) expand;
To copy the grid, a function similar to the following example can be created in the OnNextCase event:
CopyGrid(BrandAttributes[{a}].Attributes, Brand_a_Attributes)
Sub CopyGrid(Source, Destination)
' Based on the metadata defined for the destination grid, copy the
' data from the source grid to the destination grid.
' NOTE: This function assumes that the destination grid has the same
' definition as, or is a subset of, the source grid.
Dim Category, IterationName, SourceIteration, DestIteration, _
SourceQuestion, DestQuestion
' Get the source and destination iterations based on the categories
' defined for the destination grid
For Each Category in Destination.Categories
IterationName = Category.Name
Set SourceIteration = Source[IterationName]
Set DestIteration = Destination[IterationName]
' Get the nested questions for each iteration based on the
' nested questions defined for the destination grid
For Each DestQuestion in DestIteration
Set SourceQuestion = SourceIteration[DestQuestion.QuestionName]
If DestQuestion.QuestionType = _
QuestionTypes.qtLoopCategorical Then
' Allow nested categorical grids to be copied
CopyGrid(SourceQuestion, DestQuestion)
ElseIf DestQuestion.QuestionType <> QuestionTypes.qtSimple Then
Err.Raise(1, "CopyGrid", _
"Only nested categorical loops and simple " + _
"questions are supported by CopyGrid")
ElseIf DestQuestion.QuestionDataType = mr.Categorical Then
' Copy just the categories defined in the destination grid
DestQuestion = SourceQuestion * _
DestQuestion.DefinedCategories()
Else
' Copy the question values
DestQuestion = SourceQuestion
End If
Next
Next
End Sub
Note The
CopyGrid data management function can be used in scenarios where copying grids is required. Similar to the previous script function, the
CopyGrid function uses the destination grid definition to control what is copied.
Grid creation
When a survey contains a number of different questions that share a response list (for example, separate satisfaction questions across a range of different services), the results are often reported as a single grid. In this scenario, data is collected as separate questions instead of in a loop or grid, but it must be tabulated as if it is a grid. This issue is especially relevant when the data to be tabulated originates outside of UNICOM Intelligence, where loop constructs are not available.
In the following example, 4 brand rating questions are tabulated as a grid:
BrandA "Rating A" categorical [1..1] { use Rating - };
BrandB "Rating B" categorical [1..1] { use Rating - };
BrandC "Rating C" categorical [1..1] { use Rating - };
BrandD "Rating D" categorical [1..1] { use Rating - };
NewGrid "Brands grid" loop
{
BrandA "BrandA",
BrandB "BrandB",
BrandC "BrandC",
BrandD "BrandD"
} fields -
(
Rating "Rating" categorical [1..1]
{
Rating use \\.Rating -
};
) expand;
A simple script can be used to copy the data from the top level questions to the grid. For example:
NewGrid[{BrandA}].Rating = BrandA
NewGrid[{BrandB}].Rating = BrandB
NewGrid[{BrandC}].Rating = BrandC
NewGrid[{BrandD}].Rating = BrandD
Writing similar script becomes repetitive and error prone when many different grids need to be created. The task can be simplified if it is assumed that the grid iteration's category name matches the question name at the top level. For example, the following script copies the question responses into a grid:
CopyGridData(NewGrid, Questions)
Sub CopyGridData(Grid, Questions)
' Copies flattened data to a grid
' NOTE: This function assumes that each grid category
' name matches the question name to be copied
Dim Category, SliceName, Question, Slice
For Each Category in Grid.Categories
SliceName = Category.Name
' Get the question for the slice
Set Question = Questions[SliceName]
' Assign the first nested variable in the grid
Set Slice = Grid[SliceName].Item[0]
Slice = Question * Slice.DefinedCategories()
Next
End Sub
The UNICOM Intelligence tabulation products provide options for creating derived grids for a selection of variables. Derived grid creation avoids the need to create grids for related questions in the data management script, which improves data management performance. Although a derived grid is slower to tabulate than a non-derived grid, there is still an overall saving of more than 25%. Otherwise, a derived grid is no different than a non-derived grid. Both grid types share tabulation options (such as column proportion statistics).
Note The categories that are defined for the grid's nested question are used to filter data from the top level questions (* Slice.DefinedCategories()). The grid can be defined with the required category subset when only a subset of the categories is required in the new grid.
Flattening grids
In some instances, grid slices must be flattened as top level variables. Grid flattening is most often done to simplify reporting on key brands or attributes. The easiest way to flatten grid iterations is to define new variables in the metadata section of the data management script.
For example, the following variable definition can be used to create a top-level variable for brand 'a':
NewBrandA "Brand A"
categorical [1..1]
expression("BrandGrid[{a}].Rating");
In this example, the category list is derived from the expression. In this scenario, categories for the grid slice are defined for the variable. If only a subset of the categories is required in the top-level variable, the variable can be defined with the required category subset:
NewBrandA "Brand A"
categorical [1..1]
{
use KeyRatings -
}
expression("BrandGrid[{a}].Rating * NewBrandA.DefinedCategories()",
NoDeriveElements);
Notes
▪In the example, categories that are defined for the top-level question are used to filter data from the grid slice (* NewBrandA.DefinedCategories()). The NoDeriveElements keyword is also specified, which prevents categories from being derived from the expression.
▪Derived variables, which are defined in the Metadata section of a data management script, are converted to data variables as part of the data management process.
▪The
FlattenGrid data management function can be used to simplify the metadata creation task for the new top-level variables.
▪It is possible to reference grid slices in expressions without the need to use the up-lev syntax.
▪It is possible to reference the grid slice in the filter expression. For example:
BrandGrid[{BrandA}].Rating.ContainsAny({Great,Good})
Support for referencing grid slices directly in filter and new variable expressions simplifies the creation of complex filters that are based on grids. The need to flatten grids is also reduced.
Flipping grids
When reporting on grids, it is often necessary to modify the category list that defines the grid iterations. For example, when the grid iterations are a brand list, the tables output might require certain brands to be combined or removed. You might also want to add extra elements, such as means. Flipping a grid results in grid iterations that become answers, and answers that become iterations.
To demonstrate what is meant by flipping grids, consider the following grid:
BrandGrid "Please rate the following brands" loop
{
use Brands -
} fields -
(
Rating "Rating"
categorical [1..1]
{
Rating use \\.Rating -
};
) expand grid;
The grid variable instances might be answered as follows:
BrandGrid[{a}].Rating = {Great}
BrandGrid[{b}].Rating = {Good}
BrandGrid[{c}].Rating = {Great}
BrandGrid[{d}].Rating = {Poor}
BrandGrid[{e}].Rating = {Average}
BrandGrid[{f}].Rating = {Good}
When the grid is flipped, the flipped grid metadata resembles the following example:
RatingGrid "Rating" loop
{
use Rating -
} fields -
(
Brands "Brand"
categorical
{
BrandList use \\.Brands -
};
) expand;
Based on the provided answers, the variable instances for the flipped grid are as follows:
RatingGrid[{Great}].Brand = {a,c}
RatingGrid[{Good}].Brand = {b,f}
RatingGrid[{Average}].Brand = {e}
RatingGrid[{Poor}].Brand = {d}
Flipping a grid in this manner allows the brands list to be modified when tables are created in UNICOM Intelligence Reporter. To flip the grid, a routine can be created in the OnNextCase event. For example:
FlipGrid(BrandGrid, RatingGrid)
Sub FlipGrid(Source, Destination)
' Based on the categories defined for the destination grid, flip the
' iterations of the source grid.
' NOTE: This function assumes that the grids only contain 1 nested question
Dim Category, Answer, SourceSlice, DestSlice
For Each Category in Destination[0].Item[0].Categories
' Get the source slice
Set SourceSlice = Source[Category.Name].Item[0]
If SourceSlice Is Not Null Then
' Update the iterations for each answer in the source slice
For Each Answer in SourceSlice
Set DestSlice = Destination[CCategorical(Answer)].Item[0]
DestSlice = DestSlice + CCategorical(Category)
Next
End If
Next
End Sub
Notes
▪The categories that are defined for the nested question in the destination grid are used when the grid is flipped. When only a subset of the iterations is required in the flipped grid, the destination grid can be defined with the required category subset.
▪Because It is possible to modify the element list that defines the grid or loop iterations, in most cases it is no longer necessary to flip grids for reporting purposes.
▪The
FlipGrid data management function can be used in scenarios where flipping grids is still required. The
FlipGrid function is based on the previous script example.
Grid summarization
Grid summarization refers to the process of summarizing a grid at a higher level based on a certain criteria. Summarization can be the creation of a summary variable from a simple grid, or the creation of a summary grid from a nested grid.
You might want to create a summary variable for a grid that contains the brands that have a top 2 rating. For example:
Brands "Brand List" define
{
a "a",
b "b",
c "c",
d "d",
e "e",
f "f"
};
Rating "Rating Scale" define
{
Great "Great",
Good "Good",
Average "Average",
Poor "Poor"
};
BrandGrid "Please rate the following brands" loop
{
use Brands -
} fields -
(
Rating "Rating"
categorical [1..1]
{
Rating use \\.Rating -
};
) expand grid;
BrandTop2 "Top 2 summary variable for the brands grid"
categorical
{
use brands -
};
When the grid is answered as follows, the value of BrandTop2 is {a, b, c, f}.
Grid answers
|
Great
|
Good
|
Average
|
Poor
|
a
|
✓
|
|
|
|
b
|
|
✓
|
|
|
c
|
✓
|
|
|
|
d
|
|
|
|
✓
|
e
|
|
|
✓
|
|
f
|
|
✓
|
|
|
Notes
▪The summary value is Null (not in the base) when none of the grid iterations are answered. The summary value is empty {} (included in the base) when at least one iteration is answered (no iterations are included in the summary variable).
▪The categories that are defined for the output summary variable are used when the grid is summarized. The summary variable can be defined with the required category subset when only a subset of the grid iterations is required in the summary variable.
▪The
CreateGridSummary data management function is based on the previous script example.
There are occasions where a nested grid needs to be summarized as a grid. For example, you might want to summarize the following nested brand attributes rating grid as a grid that lists the top 2 attributes for each brand:
BrandAttributesAgree "Brand attributes - agree" loop
{
use Brands -
} fields -
(
Attributes "Attributes" loop
{
AttributeList use \\.AttributeList -
} fields -
(
Agree "Agree"
categorical [1..1]
{
AgreeList use \\.AgreeList -
};
) expand;
) expand;
Creating a grid summary for a nested grid is just an extension of the previous example. Each grid within the nested grid is summarized to the corresponding slice in the summary grid by using CreateGridSummaryFunction. For example:
CreateNestedGridSummary(BrandAttributesAgree, BrandAttributes, _
{StronglyAgree, Agree})
Sub CreateNestedGridSummary(NestedGrid, SummaryGrid, SummaryCategories)
' Based on the categories defined for the summary grid, create a summary
' of the iterations in the nested grid which include a specified answer.
Dim Category, NestedGridSlice, SummaryGridSlice, SliceName
For Each Category in SummaryGrid.Categories
' Use the first nested variable in the nested grid and grid
SliceName = Category.Name
Set NestedGridSlice = NestedGrid[SliceName].Item[0]
Set SummaryGridSlice = SummaryGrid[SliceName].Item[0]
' Create a summary of the slice
CreateGridSummary(NestedGridSlice, SummaryGridSlice, _
SummaryCategories)
Next
End Sub
Data transformation
A common task in DMS is to map or transform values from one variable definition to another. The most basic example of data mapping is to transform distinct values to other values. For example, the following the Designated Market Area (DMA) code is mapped to a category:
Select Case DMA
Case 524
Geography = {Atlanta}
Case 501
Geography = {NYC}
Case 807
Geography = {SF}
Case 528
Geography = {Miami}
Case 539
Geography = {Tampa}
Case Else
Geography = {Rest}
End Select
Note The
Decode function is used for simple value mapping. For example, the same mapping script can be written by using the Decode function:
Geography = DMA.Decode( 524, {Atlanta}, _
501, {NYC}, _
807, {SF}, _
528, {Miami}, _
539, {Tampa}, _
{Rest})
The disadvantage of the Decode function is that it relies heavily on layout. However, it does result in less script that is also easier to read (when good layout is used). The use of the Decode function also results in close to a 1% performance improvement in the bench-mark script.
An alternative approach to the mapping of values to categories is to store the value that is associated with each category as part of the category metadata. For example, a custom property or the factor can be used to store the value. For example:
Geography "Geography"
categorical [1..1]
{
NYC "NYC" factor(501),
SF "SF" factor(807),
Miami "Miami" factor(528),
Tampa "Tampa" factor(539),
Rest "Rest"
};
In the OnNextCase event, the variable is assigned by searching for the specified factor. For example:
Dim Category
For Each Category in Geography.Categories
If DMA = Category.Factor Then
Geography = Category
Exit For
End If
Next
If Geography Is Null Then
Geography = {Rest}
End If
The advantage of this approach is that the categories, and their associated values, are defined in one place. New categories and values can be added without needing to update the OnNextCase script. The script that finds an object with a specified property can also be generalized. For example:
DMA.Ask()
Geography = Geography.Categories._
FindObject("Factor", DMA, {Rest})._
CCategorical()
Function FindObject(Collection, PropertyName, Value, ElseValue)
Dim Item
For Each Item in Collection
If Eval("Item." + PropertyName) = Value Then
Set FindObject = Item
Exit Function
End If
Next
Set FindObject = ElseValue
End Function
See