Simple categorization
Use the Simple Categorization dialog box to perform simple, non-linguistic categorization of variables. It converts Text, Date, or Numeric variables, which cannot be used directly in tables, to Categorical variables so that the data can be analyzed for reporting purposes.
To open the Simple Categorization dialog box
1 Select the variables, and then choose Variables > Categorize > Simple from the menu.
The Simple Categorization Filter dialog box displays, allowing you to select which variables will be categorized.
Categorize all text variables. This is the default setting. When selected, all text variables are selected for categorization, regardless of which variables are currently selected.
Categorize selected variables. When selected, only selected variables are categorized. Variables that are not text, date, or numeric are automatically filtered out.
Categorize all (text/date/numeric) variables. When selected, all text, date, and numeric variables are selected for categorization, regardless of which variables are currently selected.
Update existing variables. When selected, the Simple Categorization dialog box opens and lists all existing, categorized variables.
Create new variables. When selected, the Simple Categorization dialog box opens with no defined categorized variables. You are required to create new variables.
2 Select a categorization filter, and then click OK. The Simple Categorization dialog box displays.
3 Set the properties that are described below, and then click OK. The new categorical variables are created and displayed in the variable list based on the defined New variable name.
Fields on the Simple Categorization dialog box
Selected variables
Displays the name of original variables coupled with the new variable names, such as OldVarName > NewVarName. By default the first variable in the list is selected and the name of its generated variable is shown in the New variable name field. If the new variable name is already in use, you are warned that the existing variable will be overwritten. The list supports multiple selections using CTRL + left click. When multiple variables are selected:
The New variable name field is not available.
The Category description format field is available.
The full value, Maximum number of categories, and Generate the “Other” category for any uncategorized data options are not available when more than one variable type is selected.
The Custom categorization expression field is not available.
New variable name
Displays the derived variable name for the selected variable. You can enter an appropriate variable name, or choose to accept the default name. This option is only enabled when you select Create new variables on the Simple Categorization Filter dialog box.
Category description format (optional)
Allows you to format the coded category description, where {Value} is the placeholder for the coded category. For example, if you type:
This is text before - <b>{Value}</b> - and this is text after
then each coded category description would follow this format.
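The placeholder substitution can be pictured with a short Python sketch. This is only an illustration of the behavior described above, not the product's implementation: the function name format_description and the sample category values are hypothetical.

```python
def format_description(template: str, value: str) -> str:
    """Insert a coded category into the description format.

    Hypothetical helper: it mimics the documented behavior of the
    {Value} placeholder with a plain string replacement.
    """
    return template.replace("{Value}", value)


template = "This is text before - <b>{Value}</b> - and this is text after"

# Sample coded categories (invented for illustration).
descriptions = [format_description(template, v) for v in ["Red", "Green"]]

for text in descriptions:
    print(text)
```

Under these assumptions, every generated category receives the same surrounding text, and only the part that replaces {Value} differs.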
Maximum number of categories
When selected, this user-defined setting limits the number of categories generated for each variable.
Generate the "Other" category for any other uncategorized data
When selected, an "Other" category is created to include any data not included in the generated categories. This option is not limited by the value defined for Maximum number of categories.
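The interaction between the two options can be sketched in Python. This is a rough illustration under stated assumptions, not the product's algorithm: the function build_categories is hypothetical, categories are assumed to be kept in first-seen order, and "Other" is assumed to be appended whenever the option is selected. The one point taken from the description above is that "Other" does not count toward the maximum.

```python
def build_categories(values, max_categories=None, generate_other=False):
    """Return the category list for a sequence of case data values.

    Assumptions (not confirmed by the product documentation):
    - categories are retained in the order they are first seen;
    - "Other" is added whenever generate_other is True.
    """
    categories = []
    for value in values:
        if value in categories:
            continue  # already a category
        if max_categories is None or len(categories) < max_categories:
            categories.append(value)
        # values beyond the limit are left uncategorized here

    if generate_other:
        # "Other" is added on top of the limit, not counted against it.
        categories.append("Other")
    return categories


print(build_categories(["a", "b", "c", "a"], max_categories=2, generate_other=True))
```

With a limit of 2 and the "Other" option selected, the sketch yields three entries: the two retained categories plus "Other".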
Treat empty values as
The following options are enabled by default.
Not asked (NULL). When selected, empty and null values are excluded from the base.
User-missing category. When selected, a category is created for empty and null values, allowing users to set a category label.
Categorization based on
The full value. When selected, each unique, full value of case data will be a category. This is the default setting.
The first characters. When selected, the user-defined number of characters is taken from the left side of each text value, and each unique substring becomes a category.
The last characters. When selected, the user-defined number of characters is taken from the right side of each text value, and each unique substring becomes a category.
Ignore case. When selected, data case is not considered. For example, the two data entries “ABC” and “abc” become one category. This option is enabled by default.
Custom categorization expression. Enables you to edit the variable expression manually. This field is not available when multiple variables are selected.
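The Categorization based on options can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the expression that the product generates: the function category_key and its parameters are hypothetical, and upper-casing is just one way to make "ABC" and "abc" fall into the same category.

```python
def category_key(value: str, mode: str = "full", n: int = 0,
                 ignore_case: bool = True) -> str:
    """Derive the category key for one text value.

    mode: "full"  -> the full value
          "first" -> the first n characters
          "last"  -> the last n characters
    Hypothetical helper for illustration only.
    """
    if mode == "full":
        key = value
    elif mode == "first":
        key = value[:n]
    elif mode == "last":
        # value[-0:] would return the whole string, so handle n == 0.
        key = value[-n:] if n > 0 else ""
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: case is ignored by normalizing to upper case.
    return key.upper() if ignore_case else key


values = ["ABC123", "abc456", "XYZ123"]

full = sorted({category_key(v) for v in ["ABC", "abc"]})
first3 = sorted({category_key(v, "first", 3) for v in values})
last3 = sorted({category_key(v, "last", 3) for v in values})

print(full, first3, last3)
```

In this sketch, each unique key becomes one category, so the three sample values produce two categories when grouped by their first three characters and two when grouped by their last three.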
More
Displays the advanced Categorization based on options.
Less
Hides the advanced Categorization based on options.
Reset
Resets all variables in the Selected variables list back to their default settings.
Preview
Previews the generated categorical variables, one-by-one, in the UNICOM Intelligence Reporter preview dialog.
See also
Using variables