Base calculation
Sample script file: BaseElements.mrs
When calculating the base, UNICOM Intelligence Professional includes every case for which the case data stored in the variable is not Null. A value of Null is a special marker that indicates that the value is not known and generally indicates that the question on which the variable is based was not asked. A value of Null is different from an empty or zero value.
When a respondent is asked a categorical or open-ended question but for some reason does not answer, the case data generally stores an empty categorical value ({}) or an empty string ("") respectively (although some questions have one or more special categories to indicate that the respondent did not provide an answer). Consequently, for categorical and text data, it is possible to distinguish between a question that was asked but not answered and one that was not asked at all. However, in numeric data it is not possible to distinguish questions that were asked but not answered from those that were not asked at all, because the Data Model currently stores a Null value for both.
In a simple survey where a case corresponds to a respondent, the base generally includes every respondent who was asked the question on which the variable is based, regardless of whether he or she actually answered it or not.
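The rules above can be sketched in plain Python. This is only an illustrative model, not UNICOM Intelligence code: `None` stands in for Null, `frozenset()` for an empty categorical value (`{}`), and `""` for an empty string. The helper names `in_base` and `describe` are hypothetical.

```python
# Plain-Python model (not the product API): None stands in for Null,
# frozenset() for an empty categorical value {}, and "" for an empty string.

def in_base(value):
    """A case counts toward the base when its stored value is not Null."""
    return value is not None

def describe(value):
    """Classify a stored value the way the text above describes it."""
    if value is None:
        # For numeric data this also covers "asked but not answered".
        return "not asked (or unanswered, for numeric data)"
    if value == frozenset() or value == "":
        return "asked but not answered"
    return "answered"

# A categorical answer, an empty categorical, an empty string, Null, and zero.
samples = [frozenset({"Yes"}), frozenset(), "", None, 0]
for v in samples:
    print(repr(v), in_base(v), describe(v))
```

In this model, empty and zero values still count toward the base; only the Null stand-in is excluded.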
When you create an axis based on a subset of the elements in a variable, the base is the same as when you select all of the elements. To illustrate this, we will use the signs variable in the Museum XML data set to create an unfiltered one-dimensional table:
TableDoc.Tables.AddNew("Signs", "signs", "Unmodified Signs variable")
Here is the table:
Notice that the base is 298, which is the sum of the counts in the three categories. This is a single response variable and all of the respondents who were asked the question answered it. Note that if any of the respondents had not answered the question, they would be included in the base too. Now let's modify the table to include only the first two categories:
TableDoc.Tables.AddNew("SignsNew", "signs{Yes, No}", _
"Two elements of the Signs variable")
Here is the table:
[Table: one-dimensional table showing signs on the side axis, including only the Yes and No responses; base unfiltered]
Notice that the base is still 298, but it no longer represents the sum of the counts in the categories on the table. This is because the base represents the number of respondents who were asked the question and is not based on the counts in the categories that have been selected for inclusion on the table. If you want the base to reflect only the respondents who selected the categories that are shown on the table, you would need to use a filter. For example, you could use the following filter to exclude respondents from the table who did not choose either of the two categories that are shown:
TableDoc.SignsNew.Filters.AddNew("ExcludeCases", _
"signs.ContainsAny({Yes, No})")
Here is the table after applying the filter:
Notice that the base is now 271, which is the sum of the counts in the two categories that are shown on the table.
An alternative would be to create a derived variable based on the Signs variable, but containing only the Yes and No categories, and use the variable to create the table. The autobase would then include the Yes and No categories only and there would be no need to filter the table. For more information, see Creating built-in base elements.
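The difference between the unfiltered and the filtered base can be modelled with a few made-up cases. This is a sketch in plain Python, not mrScriptBasic: the five cases are hypothetical (they are not the Museum data), `None` stands in for Null, and `contains_any` is a rough stand-in for the `ContainsAny` function used in the filter expression.

```python
# Hypothetical cases; None models Null (question not asked).
cases = [
    {"signs": {"Yes"}},
    {"signs": {"No"}},
    {"signs": {"Dont_Know"}},
    {"signs": set()},   # asked but not answered
    {"signs": None},    # not asked
]
shown = {"Yes", "No"}   # the categories kept on the axis

def base(cases, keep=lambda c: True):
    """Count cases that pass the filter and hold a non-Null signs value."""
    return sum(1 for c in cases if keep(c) and c["signs"] is not None)

def contains_any(value, categories):
    """Rough stand-in for ContainsAny: true when any listed category is present."""
    return value is not None and bool(value & categories)

unfiltered = base(cases)
filtered = base(cases, keep=lambda c: contains_any(c["signs"], shown))
print(unfiltered, filtered)  # 4 2
```

Restricting the axis to `shown` does not change `unfiltered`; only the filter reduces the base to the cases that chose one of the displayed categories.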
Now suppose we want to add a mean element based on the visits numeric variable to the axis in the unfiltered table:
TableDoc.Tables.AddNew("SignsAndVisits", "signs{Yes, No, " + _
"meanvisits 'Average number of visits' mean(visits) }", _
"Modified Signs variable with mean of Visits")
Here is the table:
The visits variable is a numeric variable, which means that it stores a Null value for respondents who were not asked or did not answer the question on which it is based. In the Museum sample, the visits variable stores a Null value for some of the respondents who are included in the table. When UNICOM Intelligence Professional calculates the base used by the mean value calculation, it includes only respondents who are included in the table base and for whom the numeric variable does not store a Null value.
The following table lists some hypothetical responses and shows whether the case is included in the base for the axis and the base for the mean element.
| Case | Value in signs variable | Value in visits variable | Included in axis base in unfiltered table | Included in base for mean element |
|---|---|---|---|---|
| 1 | {Yes} | 4 | Yes | Yes |
| 2 | {No} | Null | Yes | No |
| 3 | {Dont_Know} | 5 | Yes | Yes |
| 4 | Null | 2 | No | No |
| 5 | {} | Null | Yes | No |
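The five hypothetical cases in the table can be worked through in a short plain-Python model. This is not product code: `None` stands in for Null, `set()` for an empty categorical value, and the helper names are assumptions made for illustration. The mean is computed over the cases that the stated rule places in the mean element's base.

```python
# The five hypothetical cases from the table; None models Null.
cases = [
    {"case": 1, "signs": {"Yes"}, "visits": 4},
    {"case": 2, "signs": {"No"}, "visits": None},
    {"case": 3, "signs": {"Dont_Know"}, "visits": 5},
    {"case": 4, "signs": None, "visits": 2},
    {"case": 5, "signs": set(), "visits": None},
]

def in_axis_base(c):
    """Axis base: the signs value is not Null (an empty value still counts)."""
    return c["signs"] is not None

def in_mean_base(c):
    """Mean base: in the axis base and the visits value is not Null."""
    return in_axis_base(c) and c["visits"] is not None

axis_base = [c["case"] for c in cases if in_axis_base(c)]
mean_base = [c["case"] for c in cases if in_mean_base(c)]
mean_visits = sum(c["visits"] for c in cases if in_mean_base(c)) / len(mean_base)

print(axis_base)    # [1, 2, 3, 5]
print(mean_base)    # [1, 3]
print(mean_visits)  # 4.5
```

Case 4 has a visits value but is left out of both bases because its signs value is Null, which matches the fourth row of the table.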
Note: When working with the hierarchical view of the data, empty levels are considered to be Null and are not counted in the base. For more information, see the seventh example table in Understanding population levels.
See also