The Household sample
The Household sample provides an example of hierarchical data. Unlike the Museum and Short Drinks samples, which are based on real surveys, the Household sample has been created artificially to demonstrate hierarchical data. The data set is deliberately small, with only a few cases at each level, so that it is easy to see what happens when you aggregate the data at the different levels.
The Household sample comes in the following formats:
UNICOM Intelligence Data File. This consists of the UNICOM Intelligence metadata model (MDM) file called household.mdd and case data in the form of a UNICOM Intelligence Data File called household.ddf. As in the XML version of the Household sample, the .Ex variables from the original Quanvert sample have been removed from the metadata file but are still present in the case data file. By default, this data set is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Data Collection File\ folder.
Quanvert database. This is a Quanvert levels project. By default, this is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Quanvert\Household folder.
Surveycraft database. This consists of a HOUSHLD.QDT file and a HOUSHLD.VQ file, which are installed by default into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\Surveycraft folder.
XML data set. This consists of an .mdd file (household.mdd) and case data in the form of an XML file, in which the data is stored as HDATA hierarchical tables. This data set is based on the Quanvert database. However, the .mdd file has been modified slightly. For example, the person loop has been expanded and some of the Quanvert-specific information, such as the .Ex variables, has been removed. These variables are present in the case data, but are invisible when you connect to the case data using the .mdd file. By default, this data set is installed into the [INSTALL_FOLDER]\IBM\SPSS\DataCollection\7\DDL\Data\XML folder.
The Household sample represents the data collected using the following fictitious survey:
Household questions. Respondents are first asked a number of questions about their household as a whole, such as the address, age of the building, and number of rooms.
Person questions. Respondents are then asked a number of questions about each person in the household, such as the person's name, age, gender, and occupation, as well as a grid question that asks on how many days the person watches various TV channels.
Overseas trip questions. Respondents are also asked a number of questions about each overseas trip that each person in their household has taken in the previous year (if any), such as the purpose of the trip, number of days he or she was away from home, and countries that were visited.
Vehicle questions. Finally, respondents are asked a number of questions about each vehicle that belongs to their household, such as the vehicle's type, color, and annual mileage, and a grid question that asks the respondent to rate the vehicle's features.
Flowchart showing the questionnaire logic
Loops called person, trip, and vehicle are used to ask the person, overseas trip, and vehicle questions. The loops are iterated (and the questions are asked) as many times as necessary. For example, in a household of three people, the person loop is iterated three times; in a single-person household, it is iterated once. In a household that has no cars, bikes, or other vehicles, the vehicle questions are not asked at all and the vehicle loop has no iterations.
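The way the loops iterate can be sketched in Python as nested lists, one list per loop. This is only an illustration of the idea: the class names, field names, and data below are hypothetical and are not part of the UNICOM Intelligence Data Model or of the sample itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Trip:
    days_away: int


@dataclass
class Person:
    name: str
    # The trip loop is nested inside the person loop.
    trips: List[Trip] = field(default_factory=list)


@dataclass
class Vehicle:
    vehicle_type: str


@dataclass
class Household:
    serial: int
    persons: List[Person] = field(default_factory=list)    # person loop
    vehicles: List[Vehicle] = field(default_factory=list)  # vehicle loop


def loop_iterations(household: Household) -> Dict[str, int]:
    """Count how many times each loop was iterated for one household."""
    return {
        "person": len(household.persons),
        "trip": sum(len(p.trips) for p in household.persons),
        "vehicle": len(household.vehicles),
    }


# Hypothetical data: a three-person household with one car,
# and a single-person household with no vehicles.
family = Household(
    serial=1,
    persons=[
        Person("Ann", trips=[Trip(days_away=7), Trip(days_away=3)]),
        Person("Bob"),
        Person("Cal", trips=[Trip(days_away=14)]),
    ],
    vehicles=[Vehicle("Car")],
)
single = Household(serial=2, persons=[Person("Dee")])

print(loop_iterations(family))  # {'person': 3, 'trip': 3, 'vehicle': 1}
print(loop_iterations(single))  # {'person': 1, 'trip': 0, 'vehicle': 0}
```

An empty list stands for a loop with no iterations, which is the case for the vehicle loop in a household without vehicles.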
Diagram showing the loops and variables in the metadata
The structure of the tables in the case data in the hierarchical HDATA view of the data corresponds to the structure of the loops in the metadata. This means that because the trip loop is nested within the person loop, the trip table is a child of the person table. The two grids are also represented in the case data by hierarchical tables, each nested within its parent table.
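The parent-child relationship between the tables can be illustrated with a short Python sketch that splits nested records into one list of rows per level. It is a simplification under stated assumptions: the dictionary layout and key names are hypothetical, the two grid tables are omitted for brevity, and this is not the actual HDATA storage format.

```python
def flatten(households):
    """Split nested household records into one list of rows per level.

    Each child row carries the keys of its ancestors, so a trip row
    points to its person row, which points to its household row.
    """
    tables = {"household": [], "person": [], "trip": [], "vehicle": []}
    for hh in households:
        tables["household"].append({"household": hh["household"]})
        for person in hh.get("person", []):
            tables["person"].append(
                {"household": hh["household"], "person": person["person"]}
            )
            # Trips are nested in persons, so trip rows hang off person rows.
            for trip in person.get("trip", []):
                tables["trip"].append(
                    {
                        "household": hh["household"],
                        "person": person["person"],
                        "trip": trip["trip"],
                    }
                )
        # Vehicles belong directly to the household, not to a person.
        for vehicle in hh.get("vehicle", []):
            tables["vehicle"].append(
                {"household": hh["household"], "vehicle": vehicle["vehicle"]}
            )
    return tables


# Hypothetical nested input, not the actual sample data.
data = [
    {
        "household": 1,
        "person": [
            {"person": 1, "trip": [{"trip": 1}, {"trip": 2}]},
            {"person": 2},
        ],
        "vehicle": [{"vehicle": 1}],
    },
    {"household": 2, "person": [{"person": 1}]},
]

tables = flatten(data)
for name, rows in tables.items():
    print(name, len(rows))
```

A household or person with no iterations of a child loop simply contributes no rows to the corresponding child table.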
Diagram showing the case data represented hierarchically
This diagram shows the case data for two households (the first and third) represented in the HDATA hierarchical tables. (For simplicity, not all of the variables are shown.)
The second of the two households shown has no trip or vehicle records.
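Because each record sits at a particular level of the hierarchy, the same variable gives different results depending on the level at which it is aggregated. The following Python sketch shows the idea with made-up trip rows. The values, the tuple layout, and the function names are assumptions for illustration only and do not come from the sample data.

```python
# Hypothetical trip rows: (household, person, days_away).
trips = [
    (1, 1, 7),
    (1, 1, 3),
    (1, 3, 14),
    (2, 1, 10),
]


def total_days_by_level(rows, level):
    """Sum days_away with one result per trip, person, or household."""
    totals = {}
    for index, (household, person, days) in enumerate(rows):
        if level == "trip":
            key = index                # one result per trip row
        elif level == "person":
            key = (household, person)  # one result per person
        elif level == "household":
            key = household            # one result per household
        else:
            raise ValueError(f"unknown level: {level}")
        totals[key] = totals.get(key, 0) + days
    return list(totals.values())


def mean(values):
    return sum(values) / len(values)


for level in ("trip", "person", "household"):
    totals = total_days_by_level(trips, level)
    print(level, totals, round(mean(totals), 2))
```

With these rows the mean number of days away is 8.5 per trip, about 11.33 per person, and 17.0 per household. Persons and households without any trips do not appear in the trip rows, so this sketch leaves them out of the averages.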
Diagram showing the structure of the hierarchical tables
Household questions
| Name | Description | Type |
| --- | --- | --- |
| Household | Household serial number | Long |
| Region | Region of main residence | Single response |
| Address | Address | Text |
| Tenure | Do you rent or own your home? | Single response |
| HouseType | Type of accommodation | Single response |
| AgeOfBuilding | Approximate age of the building | Single response |
| NumRooms | Number of rooms | Long |
| FloorArea | Approximate floor area in square meters | Double |
| Pets | Number of pets in household | Long |
| NumPersons | Number of persons in household | Long |
| NumVehicles | Number of vehicles owned by the people in the household | Long |
Person questions
| Name | Description | Type |
| --- | --- | --- |
| Person | Person serial number | Long |
| Age | Age last birthday | Long |
| Gender | Gender | Single response |
| Occupation | Occupational status | Single response |
| Name | Name | Text |
| Weight | Weight in pounds | Double |
| Languages | Languages spoken | Multiple response |
| Newspapers | Number of newspapers and magazines bought last week | Long |
| TVDays | Number of days on which TV channels were watched last week | Grid |
| NumTrips | Number of overseas trips | Long |
Overseas trip questions
| Name | Description | Type |
| --- | --- | --- |
| Trip | Trip serial number | Long |
| Country | Countries visited | Multiple response |
| DaysAway | The number of days spent away from home | Long |
| Satisfaction | Overall satisfaction with the trip on a scale of 1 to 10 (where 10 represents the highest satisfaction level) | Double |
| Modes | Number of different modes of transport used | Long |
| Purpose | Purpose of the trip | Multiple response |
Vehicle questions
| Name | Description | Type |
| --- | --- | --- |
| Vehicle | Vehicle serial number | Long |
| VehicleType | Type of vehicle | Single response |
| VehicleAge | Age of vehicle | Single response |
| Maintenance | Approximate annual maintenance costs | Double |
| DaysUsed | Average number of days the vehicle is used in a week | Single response |
| Mileage | Approximate annual mileage | Long |
| YearsOwned | Number of years ownership | Double |
| Rating | Rating of features | Grid |
See also
Understanding hierarchical data
Understanding population levels
Sample data