

Variable Concept

As mentioned in Section 6.1, a dataset has a set of attributes, such as income or persons, that are stored in a file or database. We call such characteristics primary attributes. In addition, one is usually interested in attributes that are computed, for example by some transformation of primary attributes. We call those attributes variables, or computed attributes. They are handled simply as additional ``columns'' of the dataset to which they belong, here called the ``parent dataset''.

In Opus, a variable is a class derived from the opus_core class Variable. (Section 7.4 gives additional details about this class.) The class name is identical to the name of the module in which it is implemented. The module is stored in a directory whose name corresponds to the name of the parent dataset. Note that the variable name must be all lower case.
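The naming convention above can be sketched as a small helper. This is only an illustration: expected_variable_location is a hypothetical function, not part of opus_core, and it returns a path relative to wherever the dataset directories are kept.

```python
def expected_variable_location(dataset_name, variable_name):
    """Hypothetical helper (not part of Opus).

    Returns the relative module path and the class name that the naming
    convention implies for a variable of the given parent dataset.
    """
    if variable_name != variable_name.lower():
        raise ValueError("variable names must be all lower case")
    # The directory is named after the parent dataset,
    # and the module is named after the variable.
    module_path = "%s/%s.py" % (dataset_name, variable_name)
    # The class carries the same name as its module.
    class_name = variable_name
    return module_path, class_name


module_path, class_name = expected_variable_location("gridcell", "is_in_wetland")
print(module_path)  # gridcell/is_in_wetland.py
print(class_name)   # is_in_wetland
```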

The variable class must have a method compute() that returns a numpy array of variable values. The size of that array must correspond to the number of entries in the parent dataset. The compute() method takes an argument called dataset_pool containing references to the appropriate set of datasets to use for computing this variable. The parent dataset can be accessed from the compute() method by self.get_dataset().

If the variable depends on other attributes, they must be listed in the method dependencies(), which returns a list of all variables and attributes that this variable depends on, given by their fully-qualified names (see Section 7.2.4 for details on attribute specification).
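The contract described in the last two paragraphs can be illustrated without Opus installed. The sketch below is an assumption-laden stand-in: StubDataset and StubVariable are hypothetical classes that mimic only the methods named in the text, the household dataset and the income_in_thousands variable are invented for illustration, and plain Python lists take the place of numpy arrays.

```python
class StubDataset(object):
    """Stand-in for an Opus dataset (hypothetical, NOT the opus_core class)."""

    def __init__(self, name, attributes):
        self._name = name
        self._attributes = attributes  # maps attribute name -> list of values

    def get_dataset_name(self):
        return self._name

    def get_attribute(self, name):
        return self._attributes[name]

    def size(self):
        # Number of entries, taken from any one attribute column.
        return len(next(iter(self._attributes.values())))


class StubVariable(object):
    """Stand-in for the opus_core Variable base class (hypothetical)."""

    def __init__(self, dataset):
        self._dataset = dataset

    def get_dataset(self):
        return self._dataset


class income_in_thousands(StubVariable):
    """Invented example variable for a 'household' parent dataset."""

    def dependencies(self):
        # Every attribute used in compute() is listed here.
        return ["household.income"]

    def compute(self, dataset_pool):
        income = self.get_dataset().get_attribute("income")
        # One value per entry of the parent dataset.
        return [value / 1000.0 for value in income]


households = StubDataset("household", {"income": [52000, 31000, 78000]})
variable = income_in_thousands(households)
values = variable.compute(dataset_pool=None)  # no other datasets are needed here

# The result must have one value per entry of the parent dataset.
assert len(values) == households.size()
print(values)  # [52.0, 31.0, 78.0]
```

In real Opus code the base class, the dataset and the dataset pool are supplied by the framework; only dependencies() and compute() are written by the variable's author.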

As an example, consider a variable ``is_in_wetland'' for the gridcell dataset locations from Sections 6.2.2 and 6.2.3. The variable returns True for entries whose percentage of wetland exceeds a certain threshold, and False otherwise. The module is_in_wetland.py, containing a class is_in_wetland, is stored in the directory gridcell, because that is the name of the dataset:

>>> locations.get_dataset_name()
'gridcell'

The class is defined as follows:

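from opus_core.variables.variable import Variable

class is_in_wetland(Variable):

    def dependencies(self):
        # Attributes that compute() reads from other sources.
        return ["gridcell.percent_wetland"]

    def compute(self, dataset_pool):
        # The threshold is looked up in the urbansim_constant dataset.
        threshold = dataset_pool.get_dataset('urbansim_constant')["percent_coverage_threshold"]
        # Element-wise comparison: True where the wetland share exceeds the threshold.
        return self.get_dataset().get_attribute("percent_wetland") > threshold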

The attribute that the variable depends on, percent_wetland, is a primary attribute and is therefore specified by its dataset-qualified name. For our example, we populate the primary attribute:

>>> locations.add_primary_attribute(name="percent_wetland",
                                    data=[85,20,0,90,35,51,0,10,5])
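To make the comparison in compute() concrete, the sketch below applies it by hand to the nine values above. The threshold of 50 is an assumed value chosen only for illustration (the actual value of percent_coverage_threshold is not given here), and a plain list comprehension stands in for the element-wise numpy comparison.

```python
# Values of the primary attribute, as populated above.
percent_wetland = [85, 20, 0, 90, 35, 51, 0, 10, 5]

# ASSUMPTION: illustrative threshold, not the real urbansim_constant value.
percent_coverage_threshold = 50

# Same test as in compute(): strictly greater than the threshold.
is_in_wetland = [p > percent_coverage_threshold for p in percent_wetland]

print(is_in_wetland)
# [True, False, False, True, False, True, False, False, False]
```

The result has one entry per gridcell, which matches the requirement that the size of the returned array equal the number of entries in the parent dataset.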


