characteristics
defines three categories: persons (3 groups), income (2 groups) and age of
head (3 groups). Combining those groups divides the space of household
characteristics into 3 x 2 x 3 = 18 groups in total, as shown in
Figure 8.2.
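As a quick check of the count above, the following minimal sketch enumerates every combination of group indices. The dictionary name `n_groups` and the use of zero-based indices are assumptions made for illustration only; they are not part of the model's code.

```python
from itertools import product

# Number of groups per characteristic, as stated in the text.
# The name n_groups is hypothetical, chosen for this illustration.
n_groups = {"persons": 3, "income": 2, "age_of_head": 3}

# Each combination of group indices is one cell of the characteristics space.
cells = list(product(*(range(n) for n in n_groups.values())))

print(len(cells))
```

Each tuple in `cells` holds one group index per characteristic, in the order persons, income, age of head.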
The table for the control totals dataset can be defined as follows:

year  persons  income  total_number_of_households
2006  0        0       100100
2006  1        0       230000
2006  2        0       10000
2006  0        1       150000
2006  1        1       250000
2006  3        1       5000
2007  0        0       110000
. . .

The characteristics table defines the groups of each characteristic:

characteristic  min    max
persons         0      2
persons         2      4
persons         4      -1
income          0      49999
income          50000  -1
age_of_head     0      29
age_of_head     30     49
age_of_head     50     -1

Note that
The model iterates over bins defined by the marginal characteristics. In this
case, it would iterate over the 6 groups marked A, B, C, D, E and F in
Figure 8.2. For each group, it would determine the number of households
that belong to it in terms of their characteristics and compare that number
with the control total for that group. If for example in bin F (shaded area in
the figure) there are 10 households to be created, the model would randomly
sample (with replacement) 10 bins from the 3 bins within F (formed by groups
on the axis ``age_of_head''), weighted by the number of existing households in
those bins. These are the categories of the 10 new households. Then, for each of
the 10 households, it would randomly sample the actual value of each
characteristic. For example, for the very front cube of F, it would sample
income between 0 and
(rounded to the nearest 10), number of persons
between 4 and 8
and age of head between 15 and 29. If the difference between the control total
and the number of households in F calls for removing households, the
model would randomly sample households belonging to F, regardless of which bin
within F they belong to.
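The two sampling steps described above can be sketched as follows. This is a minimal illustration using Python's standard `random` module, not the model's actual implementation. The function names, the existing-household counts and the household ids are all hypothetical. Drawing the actual value uniformly within explicit bounds is also an assumption, since the text does not state the distribution.

```python
import random

# Hypothetical counts of existing households in the 3 age_of_head bins
# inside one marginal bin (made-up illustration values).
existing_counts = [40, 35, 25]

def sample_new_household_bins(n_new, counts, rng):
    """Draw n_new sub-bin indices with replacement, weighted by existing counts."""
    return rng.choices(range(len(counts)), weights=counts, k=n_new)

def sample_value(low, high, rng):
    """Draw one integer value in [low, high]; uniform sampling is an assumption."""
    return rng.randint(low, high)

def sample_removals(household_ids, n_remove, rng):
    """Draw households to remove from the whole marginal bin, ignoring sub-bins."""
    return rng.sample(household_ids, n_remove)

rng = random.Random(0)

# Creating 10 households: pick a sub-bin for each, then an actual value.
new_bins = sample_new_household_bins(10, existing_counts, rng)
ages = [sample_value(15, 29, rng) for _ in range(10)]  # bounds from the text's example

# Removing 5 households: sample without replacement from all ids in the bin.
household_ids = list(range(100))  # hypothetical ids of households in the bin
removed = sample_removals(household_ids, 5, rng)
```

Creation samples with replacement because several new households may fall into the same sub-bin, whereas removal samples without replacement because each existing household can be removed only once.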