Categorical variables
[Figure: Single response categorical variable]
[Figure: Multiple response categorical variable]
A categorical variable stores one or a limited number of distinct values for each case (respondent). Categorical variables are generally based on questions that have a predefined list of possible responses, known as categories. For example, the age variable in the Museum sample data set stores the responses to the following question. This question has eight categories, which represent the possible responses:
[Figure: Single response question]
A category is a type of element. Some categorical variables contain other types of elements, such as a Base element, which is used during analysis to show the total number of respondents who were asked the question on which the variable is based.
The age variable is called a single response variable because a respondent who answers the question must choose exactly one response from the list of categories.
In some categorical questions, the respondent can choose more than one category from the list of categories. A variable that stores the responses to this type of question is called a multiple response variable, and it can store more than one response for each case (respondent). Here is an example of a multiple response question:
[Figure: Multiple response question]
This question has a category with the text Other. If a respondent has visited museums that are not in the list of categories, they can select this category and write the museum names in the space provided. This type of category is called an Other Specify category. The open-ended responses are stored in a text variable, called an Other Specify variable, that is associated with the main categorical variable. Variables that are associated with a main variable and hold additional information are called helper variables.
In a categorical variable, there is one element for each category in the question on which it is based. Sometimes categorical variables have additional elements, for example, elements that represent the base or the mean value. These are called special elements. In UNICOM Intelligence Reporter - Survey Tabulation, you can add special elements to variables for use in your tables. However, in some data sets (particularly Quanvert databases), the special elements are built into the structure of the variable. These are called built-in special elements.
How do categorical variables store the responses?
UNICOM Intelligence Reporter - Survey Tabulation accesses the data through the UNICOM Intelligence Data Model, which presents data in a consistent way regardless of the underlying data format. It is not necessary to understand how the Data Model represents the responses stored in a categorical variable when you are building tables or defining simple filters. However, it is helpful to understand it if you use some of the advanced features, such as using advanced expressions to define filters.
The UNICOM Intelligence Data Model assigns a unique numeric (integer) value to each unique category full name in the data set. These unique values are called mapped category values. Category full names must be unique within a question, but the same full name can be used in different questions. For example, categories called Yes and No can be used in several questions, and will have the same mapped value in each one.
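The mapping idea can be sketched in a few lines of Python. This is an illustration only, not how the Data Model is implemented: the function name, the question names, and the numbering scheme (consecutive integers starting at 1) are assumptions made for the example, and real mapped values will differ.

```python
def build_category_map(questions):
    """Assign one integer to each distinct category full name in the data set.

    `questions` maps a question name to its list of category full names.
    The numbering (1, 2, 3, ...) is hypothetical; only the principle matters:
    the same full name always gets the same value, whichever question uses it.
    """
    category_map = {}
    for categories in questions.values():
        for full_name in categories:
            if full_name not in category_map:
                category_map[full_name] = len(category_map) + 1
    return category_map


# Hypothetical questions that share the categories "yes" and "no".
questions = {
    "visited_before": ["yes", "no"],
    "would_recommend": ["yes", "no", "not_sure"],
}

category_map = build_category_map(questions)
for full_name, value in category_map.items():
    print(full_name, value)
```

Although "yes" and "no" appear in both questions, each receives a single mapped value, so the map contains three entries rather than five.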
By default, the UNICOM Intelligence Data Model presents the responses to a categorical question as a string, in which the mapped values are enclosed in braces ({}) and separated by commas (,). For example, the response to a single response question might be {24} and the response to a multiple response question might be {31,36,43}, where 24, 31, 36, and 43 are the mapped values of the chosen categories. However, provided a metadata source is available, the UNICOM Intelligence Data Model can also present the responses using the category names rather than the mapped values. The example responses might then appear as {female} and {dinosaurs,insects,human_biology}.
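To make the string format concrete, the following Python sketch parses a response string and rewrites it with category names. It is not part of the product: the function names are hypothetical, and the lookup table stands in for the metadata source, pairing the example values and names from the text above in the order in which they appear.

```python
def parse_response(text):
    """Split a response string such as "{31,36,43}" into its items."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError(f"not a categorical response string: {text!r}")
    inner = text[1:-1].strip()
    if not inner:
        return []  # "{}" means that no category was chosen
    return [item.strip() for item in inner.split(",")]


def to_names(text, names_by_value):
    """Rewrite a response string of mapped values using category names."""
    names = [names_by_value[int(item)] for item in parse_response(text)]
    return "{" + ",".join(names) + "}"


# Assumed pairing of mapped values and category names, for illustration only.
names_by_value = {
    24: "female",
    31: "dinosaurs",
    36: "insects",
    43: "human_biology",
}

print(to_names("{24}", names_by_value))
print(to_names("{31,36,43}", names_by_value))
```

Without the lookup table (that is, without metadata), only the numeric form is available, which is why the names can be shown only when a metadata source is present.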
When you refer to specific responses in, for example, a filter expression, you should normally use the category names and not the mapped values. The following table provides examples of doing this.
Use the category names                          Rather than the mapped values
gender = {female}                               gender = {24}
remember = {dinosaurs,insects,human_biology}    remember = {31,36,43}
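As a rough illustration of what such a filter selects, the Python sketch below applies the two name-based expressions above to a pair of made-up cases. It assumes that = compares the complete set of chosen categories, so a case matches only when it contains exactly the listed categories; the helper name and the sample cases are hypothetical, and the product's own expression evaluation is not shown here.

```python
def equals(case_value, wanted):
    """Return True when a case holds exactly the wanted categories.

    Assumption: "=" on a categorical variable is an exact set comparison,
    so the order of the categories does not matter but extra ones do.
    """
    return set(case_value) == set(wanted)


# Made-up cases; each variable holds the names of the chosen categories.
cases = [
    {"gender": ["female"], "remember": ["dinosaurs", "insects", "human_biology"]},
    {"gender": ["female"], "remember": ["dinosaurs"]},
]

# gender = {female}
by_gender = [case for case in cases if equals(case["gender"], ["female"])]

# remember = {dinosaurs,insects,human_biology}
by_remember = [
    case
    for case in cases
    if equals(case["remember"], ["dinosaurs", "insects", "human_biology"])
]

print(len(by_gender), len(by_remember))
```

Writing the filter with category names keeps it readable and avoids depending on the particular mapped values assigned in a data set.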
See also
Understanding variables