Categorical variables
A categorical variable stores one or a limited number of distinct values for each case (respondent). Categorical variables are generally based on questions that have a predefined list of possible responses, known as categories. For example, the age variable in the Museum sample data set stores the responses to the following question, which has eight categories that represent the possible responses:
[Figure: Single response question]
A category is a type of element. Some categorical variables contain other types of elements, such as a Base element, which is used during analysis to show the total number of respondents who were asked the question on which the variable is based.
The age variable is called a single response variable because a respondent who answers the question must choose exactly one response from the list of categories.
In some categorical questions, the respondent can choose more than one category from the list. A variable that stores the responses to this type of question is called a multiple response variable, and it can store more than one response for each case (respondent). Here is an example of a multiple response question:
[Figure: Multiple response question]
This question has a category with the text Other. If a respondent has visited museums that are not in the list of categories, they can select this category, and then write the museum names in the space provided. This type of category is called an Other Specify category. The open-ended responses to this question are stored in a text variable called an Other Specify variable that is associated with the main categorical variable. Variables that are associated with a main variable and hold additional information are called helper variables.
In a categorical variable, there is one element for each category in the question on which it is based. Sometimes categorical variables have additional elements, for example elements that represent the base or the mean value. These are called special elements. In UNICOM Intelligence Reporter, you can add special elements to variables for use in your tables. However, in some data sets (particularly Quanvert databases), the special elements are built into the structure of the variable. These are called built-in special elements.
How do categorical variables store the responses?
UNICOM Intelligence Reporter accesses the data through the UNICOM Intelligence Data Model, which presents data in a consistent way regardless of the underlying data format. You do not need to understand how the Data Model represents the responses stored in a categorical variable to build tables or define simple filters. However, understanding it is helpful when you use advanced features, such as defining filters with advanced expressions.
The UNICOM Intelligence Data Model assigns a unique numeric (integer) value to each unique category full name in the data set. These values are called mapped category values. Category full names must be unique within a question, but the same full name can be used in different questions. For example, categories called Yes and No can be used in several questions, and they have the same mapped value in each one.
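The idea of one mapped value per unique category full name can be illustrated with a small Python sketch. This is not the Data Model's implementation: the `CategoryMap` class, the question names, and the numbering that starts at 1 are all hypothetical and serve only to show that a name shared by several questions resolves to a single integer.

```python
class CategoryMap:
    """Hypothetical registry: one integer per distinct category full name."""

    def __init__(self):
        self._values = {}

    def mapped_value(self, full_name: str) -> int:
        # Assign the next integer the first time a name is seen
        # (illustrative numbering, not the values the Data Model assigns).
        if full_name not in self._values:
            self._values[full_name] = len(self._values) + 1
        return self._values[full_name]


category_map = CategoryMap()

# Hypothetical questions that reuse the category names Yes and No.
questions = {
    "visited_before": ["Yes", "No"],
    "recommend": ["Yes", "No", "Not sure"],
}

mapped = {
    question: {name: category_map.mapped_value(name) for name in names}
    for question, names in questions.items()
}

# "Yes" resolves to the same mapped value in both questions.
print(mapped["visited_before"]["Yes"] == mapped["recommend"]["Yes"])
```

Because the registry is keyed on the name alone, the two questions share values for Yes and No, and only Not sure receives a new value.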
By default, the UNICOM Intelligence Data Model presents the responses to a categorical question as a string, in which the mapped values are formatted in { } (braces) and separated by , (commas). For example, the response to a single response question might be {24} and the response to a multiple response question might be {31,36,43}, where 24, 31, 36, and 43 are the mapped values of the chosen categories. However, if a metadata source is available, the UNICOM Intelligence Data Model can also present the responses using the category names rather than the mapped values. The example responses might then appear as {female} and {dinosaurs,insects,human_biology}.
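As a rough illustration of the two string forms described above, the following Python sketch parses a brace-delimited response and rewrites mapped values as category names. The function names and the `NAMES_BY_VALUE` lookup are hypothetical stand-ins for a metadata source, and the pairing of each value with a name assumes that the example values and names in the text are listed in the same order.

```python
def parse_response(text: str) -> list[str]:
    """Split a response such as '{31,36,43}' into its individual items."""
    stripped = text.strip()
    if not (stripped.startswith("{") and stripped.endswith("}")):
        raise ValueError(f"not a categorical response: {text!r}")
    body = stripped[1:-1].strip()
    return [item.strip() for item in body.split(",")] if body else []


def format_response(items) -> str:
    """Join items back into the brace-and-comma form."""
    return "{" + ",".join(str(item) for item in items) + "}"


# Hypothetical metadata lookup: mapped value -> category name
# (pairing assumed from the order of the examples in the text).
NAMES_BY_VALUE = {
    24: "female",
    31: "dinosaurs",
    36: "insects",
    43: "human_biology",
}


def to_names(text: str, names_by_value: dict) -> str:
    """Rewrite a response of mapped values as a response of category names."""
    return format_response(names_by_value[int(v)] for v in parse_response(text))


print(to_names("{24}", NAMES_BY_VALUE))
print(to_names("{31,36,43}", NAMES_BY_VALUE))
```

A single response and a multiple response share the same string form, so one parser handles both; only the number of items differs.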
When you refer to specific responses in, for example, a filter expression, use the category names and not the mapped values:
Use the category names | Rather than the mapped values |
---|---|
gender = {female} | gender = {24} |
remember = {dinosaurs,insects,human_biology} | remember = {31,36,43} |