Professional > Interview scripting > Writing interview scripts > Keywords for data analysis > Categorical bands for numeric, text, and date responses
 
Categorical bands for numeric, text, and date responses
Numeric, text, and date questions store the exact value that each respondent gives as their answer. Although this makes for very accurate data, it is not always so useful in analysis where you may want to use the question as the rows or columns of a table.
When you use a numeric variable as the rows or columns of a table, the analysis application displays five elements: base, mean, minimum, maximum, and standard deviation, but what may be more useful is to group the numeric values into ranges or bands so that the data can be tabulated in the same way as categorical data. For example, in an age question you might have age ranges 18 to 24, 25 to 34, 35 to 44, and so on. The situation is worse for text and date questions which cannot be tabulated at all in their raw form.
The easiest way to satisfy the requirements of accurate data and easy analysis is to create all numeric, text, and date questions with an axis block defining the elements you want to see in analyses. Here’s the age example we talked about.
Age "How old are you?" long [18..99]
axis(
"{
A24 '18 to 24' expression('Age <= 24'),
A34 '25 to 34' expression('Age >= 25 And Age <= 34'),
A44 '35 to 44' expression('Age >= 35 And Age <= 44'),
A54 '45 to 54' expression('Age >= 45 And Age <= 54'),
A64 '55 to 64' expression('Age >= 55 And Age <= 64'),
A65 '65 plus' expression('age >= 65'),
AgeMean 'Mean' mean(Age),
AgeStddev 'Std.dev' stddev(Age)
}"
);
Even though the age ranges look like categorical specifications, the analysis program still has access to respondents' exact ages and is able to calculate the mean and standard deviation using the raw data. There is no need to define factors as would be necessary if the ranges were truly categorical.
Example for a text question
Postcode "What is your full postcode?" text [1..9]
axis (
  "{
  TN8 'TN8' expression('Postcode.Find(""tn8"") <> -1'),
  TN13 'TN13' expression('Postcode.Find(""tn13"") <> -1'),
  TN14 'TN15' expression('Postcode.Find(""tn14"") <> -1'),
  TN16 'TN16' expression('Postcode.Find(""tn16"") <> -1'),
  Othpc 'Invalid/other' expression('Postcode.Find(""tn8"") = -1
                                 And Postcode.Find(""tn13"") = -1
                                 And Postcode.Find(""tn14"") = -1
                                 And Postcode.Find(""tn16"") = -1
                                  Or Postcode.IsEmpty()')
  }"
);
In this example, respondents supply their full postcodes, which are then grouped by area to form a summary table showing how respondents are distributed across the region in which the survey ran. The expression for each area code scans the Postcode data looking for a given string. If the string is not found, the Find function returns -1 (if the string is found, the function returns its position in the text, but that is not important here). The last element counts anyone whose postcode did not fall into the preceding groups and also anyone who did not answer the question.
Example for a date question
RenewDate "What is the renewal date for your travel insurance?"
date [1-Jan-07..31-Dec-07]
axis (
"{
Q12007 'Quarter 1 2007' expression('RenewDate.DatePart(""q"") = 1'),
Q22007 'Quarter 2 2007' expression('RenewDate.DatePart(""q"") = 2'),
Q32007 'Quarter 3 2007' expression('RenewDate.DatePart(""q"") = 3'),
Q42007 'Quarter 4 2007' expression('RenewDate.DatePart(""q"") = 4')
}"
);
This example groups responses into quarters by checking the DatePart function to convert the months into quarters. Therefore, someone whose travel insurance is due for renewal in July, August, or September will be placed in the Quarter 3 2007 element. See DatePart for more information.
See also
Keywords for data analysis