Two-sample Z-test on proportions
Quick reference
To request a two-sample Z-test, type:
stat=z2
on the tab statement.
More information
The two-sample Z-test is a table-level statistic. It is used to test differences between column percentages in a single row of a table. For example, you might want to test whether younger women are as likely as older women to have full-time jobs; that is, to compare the proportion of women with full-time jobs across different age groups.
This test is produced by the option stat=z2 on the tab statement. The table must consist of a base row and one row of basic counts only. The test calculates a Z-value comparing each column percentage with each of the other column percentages, and produces a triangular table showing all the Z-values and their associated significance levels. The triangular table is labeled with the text ‘Z TEST - TYPE 2’.
Points to remember are:
▪The row of basic counts defines an attribute which respondents in that row have, for example, the attribute of having a full-time job.
▪The percentages (proportions) which are compared are always calculated for the test by dividing the count in each cell of the row to be tested by the corresponding cell in the base row. It is not necessary for the column percentages to be printed using the option op=2 (though you might find it confusing to use a two-sample Z-test and print the row or total percents instead).
▪The columns of the table should define groups of people that are mutually exclusive, for example, age groups or sex. If the column axis defines more than one set of mutually exclusive elements the test will still be printed, but comparisons between elements that are not mutually exclusive are meaningless and should be ignored. For example, if the column axis contains both sex and age breakdowns, the comparison between, say, ‘Female’ and ‘Age 18–25’ must be ignored, since some respondents may be both female and aged 18–25.
▪The Z-tests may give misleading results when the bases from which proportions are calculated are small. Tests involving a column whose base is less than 10 should be treated as approximate only. Such columns should preferably be combined with the nearest logical equivalent.
▪The calculation for Z subtracts the first proportion from the second, rather than the more usual method of subtracting the second proportion from the first.
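The points above about how the proportions are formed, the small-base warning, and the direction of subtraction can be sketched in a few lines of Python. This is an illustration only, not Quantum code: the function names are hypothetical, the counts are those of the example table in this section, and it assumes that "first" means the column further to the left. It stops at the signed difference of proportions, because the standard-error formula used for Z is not given in this section.

```python
# Counts copied from the example table in this section.
base = {"18-24": 96, "25-34": 194, "35-44": 91, "45-54": 126, "55+": 98}
full_time = {"18-24": 29, "25-34": 107, "35-44": 66, "45-54": 75, "55+": 20}

SMALL_BASE = 10  # threshold named in the points above


def proportion(col):
    """Count in the tested row divided by the count in the base row."""
    return full_time[col] / base[col]


def signed_difference(first, second):
    """Second proportion minus first (assumes 'first' is the left-hand column)."""
    return proportion(second) - proportion(first)


def small_base_columns():
    """Columns whose tests should be treated as approximate only."""
    return [col for col, n in base.items() if n < SMALL_BASE]


for first, second in [("18-24", "25-34"), ("18-24", "55+")]:
    diff = signed_difference(first, second)
    print(f"{second} minus {first}: {diff:+.3f}")

print("Small-base columns:", small_base_columns())
```

Under this reading, a positive difference means the right-hand group has the higher proportion, so the sign of each difference should match the sign of the corresponding Z-value in the printed table.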
Example
The Quantum program below compares the proportions of women in full-time employment between different age-groups:
tab ftjob age;stat=z2
ttlJob Status
ttlBase: All Women
l ftjob
col 145;Base;Full-Time
l age
col 108;Base;18-24;25-34;35-44;45-54;55+
produces:
Job Status
Base: All Women

              Base   18-24   25-34   35-44   45-54     55+
Base           605      96     194      91     126      98
Full-Time      297      29     107      66      75      20

Z TEST - TYPE 2
             18-24   25-34   35-44   45-54
25-34        2.388
             0.017
35-44        3.800   2.273
             0.000   0.023
45-54        2.687   0.586  -1.615
             0.007   0.558   0.106
55+         -0.776  -2.877  -4.100  -3.145
             0.438   0.004   0.000   0.002
This example shows that there is no significant difference between the 18-24 and 55+ age groups in the proportion of respondents in full-time employment (significance 0.438). However, each of these two age groups differs significantly, at the 5% level or better, from each of the other age groups. The 25-34 and 35-44 age groups also differ significantly from each other (significance 0.023).
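As a reading aid, the short Python sketch below takes the Z-values printed in the triangular table and recomputes a significance level for each pair. It assumes the printed significance is the two-sided tail probability of a standard normal distribution; this section does not state that. The recomputed values agree with the printed ones to within rounding, which is consistent with that assumption but does not confirm how Quantum derives them. The names `printed`, `two_sided_p` and `ALPHA` are illustrative only.

```python
import math

# (row group, column group) -> (Z-value, significance) as printed in the table.
printed = {
    ("25-34", "18-24"): (2.388, 0.017),
    ("35-44", "18-24"): (3.800, 0.000),
    ("35-44", "25-34"): (2.273, 0.023),
    ("45-54", "18-24"): (2.687, 0.007),
    ("45-54", "25-34"): (0.586, 0.558),
    ("45-54", "35-44"): (-1.615, 0.106),
    ("55+", "18-24"): (-0.776, 0.438),
    ("55+", "25-34"): (-2.877, 0.004),
    ("55+", "35-44"): (-4.100, 0.000),
    ("55+", "45-54"): (-3.145, 0.002),
}

ALPHA = 0.05  # the 5% level used in the discussion of the example


def two_sided_p(z):
    """Two-sided standard normal tail probability (an assumption, see above)."""
    return math.erfc(abs(z) / math.sqrt(2))


# Pairs that do not reach the 5% level.
not_significant = [
    pair for pair, (z, _) in printed.items() if two_sided_p(z) >= ALPHA
]

for (row, col), (z, p_printed) in printed.items():
    p = two_sided_p(z)
    flag = "significant" if p < ALPHA else "not significant"
    print(f"{row:>5} vs {col:<5}  Z={z:+.3f}  p={p:.3f}  ({flag})")
```

Three of the ten pairs fall short of the 5% level in this sketch: 45-54 against 25-34, 45-54 against 35-44, and 55+ against 18-24.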
See also