

Variable Concept

As mentioned in Section 21.1, a dataset has a set of attributes, such as income or persons, that are stored in a file or database. We call these primary attributes. In addition, one is usually interested in attributes that are computed, for example by transforming primary attributes. We call those attributes variables, or computed attributes. They are handled simply as additional "columns" of the dataset to which they belong, referred to here as the "parent dataset".

In Opus, a variable is a class derived from the opus_core class Variable (Section 22.3 gives additional details about this class). The class name is identical to the name of the module in which it is implemented, and the module is stored in a directory named after the parent dataset. The variable name must be all lower case.
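The naming convention can be illustrated with a small sketch. The helper variable_module_path below is hypothetical and not part of opus_core; it only shows how a dataset name and a variable name would map to a module location relative to the enclosing package, and it rejects names that are not all lower case.

```python
import posixpath


def variable_module_path(dataset_name, variable_name):
    """Return the relative path of the module implementing a variable.

    Hypothetical helper for illustration only; not an opus_core function.
    """
    # The variable name must be all lower case.
    if variable_name != variable_name.lower():
        raise ValueError("variable names must be all lower case: %r" % variable_name)
    # The module is named after the variable and lives in a directory
    # named after the parent dataset.
    return posixpath.join(dataset_name, variable_name + ".py")


print(variable_module_path("gridcell", "is_in_wetland"))
```

Under these assumptions, a variable is_in_wetland of a dataset named gridcell is expected in gridcell/is_in_wetland.py.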

The variable class must have a method compute() that returns a numpy array of variable values. The size of that array must correspond to the number of entries in the parent dataset. The compute() method takes an argument called dataset_pool containing references to the appropriate set of datasets to use for computing this variable. From the compute() method, the parent dataset can be accessed by self.get_dataset().

If the variable depends on other attributes, they must be listed in the method dependencies(), which returns a list of the fully-qualified names of all variables and attributes on which this variable depends (see Section 22.1.3 for details on attribute specification).
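As a rough illustration of what a dataset-qualified name carries, the sketch below splits a name of the form dataset.attribute into its two parts. The function split_dataset_qualified is a hypothetical helper, not an opus_core call, and it handles only this simple two-part form; the other forms of attribute specification described in Section 22.1.3 are outside its scope.

```python
def split_dataset_qualified(name):
    """Split 'dataset.attribute' into (dataset, attribute).

    Hypothetical helper for illustration only; handles just the
    two-part, dataset-qualified form.
    """
    dataset, sep, attribute = name.partition(".")
    # Reject names without exactly one dot or with an empty part.
    if not sep or not dataset or not attribute or "." in attribute:
        raise ValueError("expected 'dataset.attribute', got %r" % name)
    return dataset, attribute


print(split_dataset_qualified("gridcell.percent_wetland"))
```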

As an example, consider a variable "is_in_wetland" for the gridcell dataset locations from Sections 21.2.2 and 21.2.3. The variable returns True for entries whose percentage of wetland exceeds a certain threshold, and False otherwise. The module is_in_wetland.py, containing a class is_in_wetland, is stored in the directory gridcell because

>>> locations.get_dataset_name()
'gridcell'

The class is defined as follows:

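from opus_core.variables.variable import Variable

class is_in_wetland(Variable):
    """True for gridcells whose percentage of wetland exceeds the threshold."""

    def dependencies(self):
        # Fully-qualified names of the attributes this variable depends on.
        return ["gridcell.percent_wetland"]

    def compute(self, dataset_pool):
        # The threshold is read from the urbansim_constant dataset in the pool.
        threshold = dataset_pool.get_dataset('urbansim_constant')["percent_coverage_threshold"]
        # Element-wise comparison: one boolean per entry of the parent dataset.
        return self.get_dataset().get_attribute("percent_wetland") > threshold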

The attribute on which the variable depends is a primary attribute and is therefore specified by its dataset-qualified name. For our example, we populate the primary attribute:

>>> locations.add_primary_attribute(name="percent_wetland",
                                    data=[85,20,0,90,35,51,0,10,5])
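
To make the expected result concrete, the following stand-alone sketch applies the same comparison to these nine values without using opus_core. The threshold of 50 is an assumption chosen for illustration; in Opus the value comes from the urbansim_constant dataset. Plain Python lists stand in for the numpy arrays that compute() returns.

```python
# The percent_wetland values assigned to the nine gridcells.
percent_wetland = [85, 20, 0, 90, 35, 51, 0, 10, 5]

# Assumed threshold for illustration; the real value is read from
# the urbansim_constant dataset.
percent_coverage_threshold = 50


def is_in_wetland(values, threshold):
    """Stand-in for compute(): one boolean per entry of the dataset."""
    return [value > threshold for value in values]


result = is_in_wetland(percent_wetland, percent_coverage_threshold)

# The result must have as many entries as the parent dataset.
assert len(result) == len(percent_wetland)
print(result)
# [True, False, False, True, False, True, False, False, False]
```

With this assumed threshold, only the cells with 85, 90, and 51 percent wetland are flagged.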


