Creating statistics using analysis elements and raw data
When you have a derived variable whose responses are based on numeric data, you can use the raw data to produce accurate statistics. You specify the statistical elements in the same way as for statistics based on factors. However, because the data used to calculate the statistics comes from a different question or variable, you must also define some hidden elements that hold the intermediate data the statistics require.
For example, means are calculated by taking the sum of the numeric values (known as the sum-of-x) and dividing it by the number of cases or respondents (known as the sum-of-n), while standard deviations require the same information as well as the sum of the squared numeric values (sum-of-x-squared). These are all additional elements that you must define as part of the question.
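The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of how the three sums combine, not the product's implementation: the function names are hypothetical, the cases are unweighted, and the n - 1 denominator used for the standard deviation is an assumption.

```python
import math

def summary_sums(values):
    """Return (sum_n, sum_x, sum_x_squared) for a list of raw numeric values."""
    sum_n = len(values)                         # number of cases
    sum_x = sum(values)                         # sum of the numeric values
    sum_x_squared = sum(v * v for v in values)  # sum of the squared values
    return sum_n, sum_x, sum_x_squared

def mean_from_sums(sum_n, sum_x):
    """Mean = sum-of-x divided by sum-of-n."""
    return sum_x / sum_n

def stddev_from_sums(sum_n, sum_x, sum_x_squared):
    """Standard deviation from the three sums (n - 1 denominator is an assumption)."""
    variance = (sum_x_squared - sum_x * sum_x / sum_n) / (sum_n - 1)
    return math.sqrt(variance)

# Hypothetical raw ages, used only to exercise the functions.
ages = [22, 30, 41, 58, 67]
sum_n, sum_x, sum_x_squared = summary_sums(ages)
print(mean_from_sums(sum_n, sum_x))
print(stddev_from_sums(sum_n, sum_x, sum_x_squared))
```

The mean needs only two of the sums, while the standard deviation needs all three, which is why a standard deviation requires one more additional element than a mean.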
Syntax
Put one of the following statements in the response list for each additional element required. In the case of a mean you would type two statements, and for a standard deviation you would type three. The statement has been written over three lines and shows the points at which line breaks are allowed.
Name
[CalculationType=Type, Hidden=True, ExcludedFromSummaries=True, DataElement=""]elementType(AnalysisSummaryData)
[multiplier(use QName)]
Parameters
Name
The element’s name. This must be one of SumN, SumX, or SumXSquared.
Type
The type of calculation to be stored in this element. Refer to the table for details.
QName
The name of the numeric variable whose raw data is to be used in the calculation. The multiplier parameter is not required for means.
Additional elements
The following table shows which calculation types are required for each of the common statistics:
Statistic            Additional elements required, and in what order
Mean                 SumX, SumN
Standard deviation   SumXSquared, SumX, SumN
Standard error       SumXSquared, SumX, SumN
Sample variance      SumXSquared, SumX, SumN, SumUnweightedN
Rules for defining the additional elements are as follows:
You must define these elements in the order they are listed in the table.
Do not insert other elements between SumXSquared, SumX, and SumN.
SumUnweightedN must come immediately before the Standard error or Sample variance element it belongs to.
If a question has multiple statistics and some of the additional elements are common to several of those statistics, you can define the additional elements once, subject to the rules, and they are applied to all of the statistics.
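The first two ordering rules can be expressed as a small check. The following Python sketch is a hypothetical helper, not a feature of the product: it takes the element names of a response list in the order they are defined and reports any violation. It assumes the names match exactly as written here, and it deliberately leaves out the SumUnweightedN rule.

```python
# Hypothetical checker for the ordering rules; not part of the product.
HELPERS = ["SumXSquared", "SumX", "SumN"]  # required order

def check_helper_order(element_names):
    """Return a list of problems found in the order of the additional elements."""
    problems = []
    positions = [i for i, name in enumerate(element_names) if name in HELPERS]
    if not positions:
        return problems
    found = [element_names[i] for i in positions]
    # Rule: the elements must appear in the order SumXSquared, SumX, SumN.
    expected = [name for name in HELPERS if name in found]
    if found != expected:
        problems.append("additional elements out of order or duplicated: " + ", ".join(found))
    # Rule: no other element may sit between them.
    if positions[-1] - positions[0] + 1 != len(positions):
        problems.append("another element sits between the additional elements")
    return problems

print(check_helper_order(["HowOld24", "SumXSquared", "SumX", "SumN", "HowOldMean"]))
print(check_helper_order(["SumN", "SumX", "HowOldMean"]))
```

The first call returns an empty list because the three elements are adjacent and in the required order; the second reports the ordering problem.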
Example
Here is a derived variable that generates age ranges based on the exact ages stored in the Age question. For more information about these types of derived variables, see Categorical bands for numeric, text, and date responses.
HowOld "Respondent's age" categorical [1..1]
{
    HowOld24 "18 to 24" expression("Age <= 24"),
    HowOld34 "25 to 34" expression("Age >= 25 And Age <= 34"),
    HowOld44 "35 to 44" expression("Age >= 35 And Age <= 44"),
    HowOld54 "45 to 54" expression("Age >= 45 And Age <= 54"),
    HowOld64 "55 to 64" expression("Age >= 55 And Age <= 64"),
    HowOld65 "65 plus" expression("Age >= 65"),
    SumXSquared [CalculationType="SumXSquared", Hidden=True, DataElement="",
        ExcludedFromSummaries=True]
        elementType(AnalysisSummaryData) multiplier(use Age),
    SumX [CalculationType="SumX", Hidden=True, DataElement="",
        ExcludedFromSummaries=True]
        elementType(AnalysisSummaryData) multiplier(use Age),
    SumN [CalculationType="SumN", Hidden=True, DataElement="",
        ExcludedFromSummaries=True]
        elementType(AnalysisSummaryData),
    HowOldMean "Mean"
        [CalculationType=Mean, HasNoData=True, ExcludedFromSummaries=True]
        elementType(AnalysisMean),
    HowOldSD "Std. dev"
        [CalculationType=Stddev, HasNoData=True, ExcludedFromSummaries=True]
        elementType(AnalysisStddev)
};
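To see why raw data can give a more accurate mean than factors, the following Python sketch bands a handful of ages using the category labels from the example and compares the two results. The sample ages and the per-band factor values (rough midpoints) are hypothetical and chosen purely for illustration; they do not come from the example above.

```python
# Each band: (label, inclusive upper bound or None for open-ended, hypothetical factor).
BANDS = [
    ("18 to 24", 24, 21.0),
    ("25 to 34", 34, 29.5),
    ("35 to 44", 44, 39.5),
    ("45 to 54", 54, 49.5),
    ("55 to 64", 64, 59.5),
    ("65 plus", None, 70.0),
]

def band_for(age):
    """Return (label, factor) for the first band whose upper bound covers the age."""
    for label, upper, factor in BANDS:
        if upper is None or age <= upper:
            return label, factor
    raise ValueError(f"no band found for age {age}")

# Hypothetical respondents.
ages = [19, 24, 25, 33, 47, 64, 80]

# Mean from the raw data: sum-of-x divided by sum-of-n.
raw_mean = sum(ages) / len(ages)

# Mean from factors: each respondent contributes the factor of their band.
factor_mean = sum(band_for(age)[1] for age in ages) / len(ages)

print(raw_mean, factor_mean)
```

The two means differ because a factor replaces every age in a band with a single representative value, whereas the hidden SumX and SumN elements carry the exact ages through to the calculation.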
See also
Statistics using analysis elements