The Household sample
The Household sample provides an example of hierarchical data. Unlike the Museum and Short Drinks samples, which are based on real surveys, the Household sample has been created artificially to demonstrate hierarchical data. The Household sample is a small data set that includes a very small number of cases at each level. This is to make it easy to understand what happens when you aggregate the data at the different levels.
The Household sample comes in the following formats:
▪UNICOM Intelligence Data File. This consists of the UNICOM Intelligence metadata model (MDM) file called household.mdd and case data in the form of a UNICOM Intelligence Data File called household.ddf. As in the XML version of the Household sample, the .Ex variables from the original Quanvert sample have been removed from the metadata file, but are present in the case data file. By default, this data set is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Data Collection File\ folder.
▪Quanvert database. This is a Quanvert levels project. By default, this is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Quanvert\Household folder.
▪Surveycraft database. This consists of a HOUSHLD.QDT file and a HOUSHLD.VQ file, which are installed by default into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Surveycraft folder.
▪XML data set. This consists of an .mdd file (household.mdd) and case data in the form of an XML file, in which the data is stored as HDATA hierarchical tables. This data set is based on the Quanvert database. However, the .mdd file has been modified slightly. For example, the person loop has been expanded and some of the Quanvert-specific information, such as the .Ex variables, has been removed. These variables are present in the case data, but are invisible when you connect to the case data using the .mdd file. By default, this data set is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\XML folder.
The Household sample represents the data collected using the following fictitious survey:
▪Household questions. Respondents are first asked a number of questions about their household as a whole, such as the address, age of the building, and number of rooms.
▪Person questions. Respondents are then asked a number of questions about each person in the household, such as the person's name, age, gender, and occupation, and a grid question that asks the number of days he or she watches various TV channels.
▪Overseas trip questions. Respondents are also asked a number of questions about each overseas trip that each person in their household has taken in the previous year (if any), such as the purpose of the trip, number of days he or she was away from home, and countries that were visited.
▪Vehicle questions. Finally, respondents are asked a number of questions about each vehicle that belongs to their household, such as the vehicle's type, color, and annual mileage, and a grid question that asks the respondent to rate the vehicle's features.
Flowchart showing the questionnaire logic
Loops called person, trip, and vehicle are used to ask the person, overseas trip, and vehicle questions, respectively. The loops are iterated (and therefore the questions are asked) as many times as necessary. For example, in a household of three people, the person loop will be iterated three times, whereas in a single-person household it will be iterated once. In a household that has no cars, bikes, or other vehicles, the vehicle questions will not be asked at all and the vehicle loop will have no iterations. Click below to see a diagram that provides a representation of the loops and variables in the metadata.
Diagram showing the loops and variables in the metadata
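The loop behavior described above can be sketched in plain Python. This is only an illustration of the nesting and iteration counts, not the UNICOM Intelligence API: the class names, the snake_case field names, and the example values are invented for this sketch, and only the commented variable names (NumRooms, Age, DaysAway, VehicleType) come from the sample.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Trip:
    days_away: int  # corresponds to DaysAway


@dataclass
class Person:
    age: int  # corresponds to Age
    # The trip loop is nested within the person loop.
    trips: List[Trip] = field(default_factory=list)


@dataclass
class Household:
    num_rooms: int  # corresponds to NumRooms
    persons: List[Person] = field(default_factory=list)  # person loop
    # Vehicle loop, reduced to VehicleType only to keep the sketch short.
    vehicle_types: List[str] = field(default_factory=list)


def loop_iterations(household: Household) -> dict:
    """Count how many times each loop was iterated for one household."""
    return {
        "person": len(household.persons),
        "trip": sum(len(p.trips) for p in household.persons),
        "vehicle": len(household.vehicle_types),
    }


# Invented example: three people, one of whom took two trips, and no vehicles.
example = Household(
    num_rooms=5,
    persons=[Person(age=41, trips=[Trip(7), Trip(3)]), Person(age=39), Person(age=8)],
)
counts = loop_iterations(example)
print(counts)  # {'person': 3, 'trip': 2, 'vehicle': 0}
```

The empty vehicle list mirrors a household with no vehicles: the vehicle loop simply has no iterations.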
The structure of the tables in the case data in the hierarchical HDATA view of the data corresponds to the structure of the loops in the metadata. This means that because the trip loop is nested within the person loop, the trip table is a child of the person table. The two grids are also represented in the case data by hierarchical tables, each nested within its parent table. Click below to see a diagram that shows the case data for two households (the first and third) represented in the HDATA hierarchical tables. (For simplicity, not all of the variables are shown.)
Diagram showing the case data represented hierarchically
Notice that the second of the two households shown has no trip or vehicle records. Click below to see a diagram that shows the relationship of the hierarchical tables.
Diagram showing the structure of the hierarchical tables
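To make the parent and child relationship concrete, the following minimal sketch represents the household, person, and trip levels as flat lists of rows in which each child row carries the keys of its parent, and then aggregates DaysAway at the person level and at the household level. The key column names (`household`, `person`, `trip`), the helper `sum_days_away`, and all row values are assumptions made for illustration; they are not taken from the actual HDATA tables.

```python
from collections import defaultdict

# Invented rows; each child row carries the key(s) of its parent row.
households = [{"household": 1}, {"household": 2}]
persons = [
    {"household": 1, "person": 1},
    {"household": 1, "person": 2},
    {"household": 2, "person": 1},
]
trips = [
    {"household": 1, "person": 1, "trip": 1, "DaysAway": 7},
    {"household": 1, "person": 1, "trip": 2, "DaysAway": 3},
    {"household": 1, "person": 2, "trip": 1, "DaysAway": 10},
]


def sum_days_away(trip_rows, level_keys):
    """Sum DaysAway over trip rows, grouped by the keys of the chosen level."""
    totals = defaultdict(int)
    for row in trip_rows:
        totals[tuple(row[k] for k in level_keys)] += row["DaysAway"]
    return dict(totals)


# Aggregate trip data up to the person level, then up to the household level.
per_person = sum_days_away(trips, ("household", "person"))
per_household = sum_days_away(trips, ("household",))

# A household with no trip rows gets a total of 0 at the household level.
household_totals = {
    h["household"]: per_household.get((h["household"],), 0) for h in households
}

print(per_person)        # {(1, 1): 10, (1, 2): 10}
print(household_totals)  # {1: 20, 2: 0}
```

The same three trip rows produce different totals depending on the level chosen, which is the effect the small Household data set is designed to make easy to follow.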
Household questions
Name | Description | Type |
--- | --- | --- |
Household | Household serial number | Long |
Region | Region of main residence | Single response |
Address | Address | Text |
Tenure | Do you rent or own your home? | Single response |
HouseType | Type of accommodation | Single response |
AgeOfBuilding | Approximate age of the building | Single response |
NumRooms | Number of rooms | Long |
FloorArea | Approximate floor area in square meters | Double |
Pets | Number of pets in household | Long |
NumPersons | Number of persons in household | Long |
NumVehicles | Number of vehicles owned by the people in the household | Long |
Person questions
Name | Description | Type |
--- | --- | --- |
Person | Person serial number | Long |
Age | Age last birthday | Long |
Gender | Gender | Single response |
Occupation | Occupational status | Single response |
Name | Name | Text |
Weight | Weight in pounds | Double |
Languages | Languages spoken | Multiple response |
Newspapers | Number of newspapers and magazines bought last week | Long |
TVDays | Number of days on which TV channels were watched last week | Grid |
NumTrips | Number of overseas trips | Long |
Overseas trip questions
Name | Description | Type |
--- | --- | --- |
Trip | Trip serial number | Long |
Country | Countries visited | Multiple response |
DaysAway | The number of days spent away from home | Long |
Satisfaction | Overall satisfaction with the trip on a scale of 1 to 10 (where 10 represents the highest satisfaction level) | Double |
Modes | Number of different modes of transport used | Long |
Purpose | Purpose of the trip | Multiple response |
Vehicle questions
Name | Description | Type |
--- | --- | --- |
Vehicle | Vehicle serial number | Long |
VehicleType | Type of vehicle | Single response |
VehicleAge | Age of vehicle | Single response |
Maintenance | Approximate annual maintenance costs | Double |
DaysUsed | Average number of days the vehicle is used in a week | Single response |
Mileage | Approximate annual mileage | Long |
YearsOwned | Number of years of ownership | Double |
Rating | Rating of features | Grid |