Opus expressions are written in a domain-specific programming language called Tekoa (Chapter 13). Each newly-encountered expression is compiled into an automatically generated subclass of Variable, with similar functionality to that described above. This section describes how this compilation is done. For the most part, the implementation shouldn't be relevant to the Opus user. But for the curious, for programmers who want to modify or extend the system, or (heaven forfend) in case something goes wrong, here is a description of the implementation. The code to implement expressions is in opus_core.variables.
One straightforward way to extend Tekoa is to add functions to the language. The currently available unary functions are listed in Section 13.4. These all operate on numpy arrays. Further functions can be supported in the domain-specific language by adding their definitions to opus_core.variables.functions. (If you believe such an extension would be of general interest, please coordinate with the Opus/UrbanSim implementors so that it can find its way into the code base.)
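As a rough illustration of what such an added function might look like, here is a minimal sketch of a unary function that operates on a numpy array. The function name clip_to_unit_interval is hypothetical and is not part of Opus. How opus_core.variables.functions makes its definitions visible to the expression language is not described here, so the sketch shows only the function itself.

```python
import numpy as np


def clip_to_unit_interval(values):
    """Hypothetical unary function: limit every element to the range [0, 1]."""
    # np.clip works element-wise, so the result has the same shape as the input.
    return np.clip(values, 0.0, 1.0)


# Stand-alone check with a small array; no Opus dataset is involved.
shares = np.array([-0.5, 0.25, 2.0])
clipped = clip_to_unit_interval(shares)
```

A function of this shape takes one array and returns one array of the same length, which is the property the listed unary functions share.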
When a new expression is encountered, the system first checks whether it is a simple attribute or a fully-qualified variable. If so, evaluating the expression reduces to getting the value of the attribute or computing the value of the existing variable. Otherwise, the expression system generates and compiles a new subclass of Variable that implements the computation defined by that expression. It keeps a cache of expressions that have already been processed, so that autogenerated variables can be reused when possible. These autogenerated variables have names like autogenvar034. They are compiled and live only in the current process -- they aren't stored on disk, so the user never needs to see them and different processes running on the same machine don't interfere with each other.
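The combination of in-process compilation and caching can be sketched as follows. This is not the Opus implementation: the Variable class is an empty stand-in, the function get_autogen_class and the module-level cache are hypothetical, and the three-digit zero-padded numbering is only inferred from the example name autogenvar034. The generated class is reduced to a name() method to keep the sketch short.

```python
class Variable:
    """Empty stand-in for the Opus Variable base class (assumption)."""


_cache = {}      # expression text -> generated class
_counter = 0     # used to build unique class names


def get_autogen_class(expression):
    """Return a class for `expression`, generating it only on first use."""
    global _counter
    if expression in _cache:
        return _cache[expression]

    class_name = "autogenvar%03d" % _counter
    _counter += 1

    # Build the class as source text, then compile it in memory with exec().
    # Nothing is written to disk.
    source = (
        "class %s(Variable):\n"
        "    def name(self):\n"
        "        return %r\n"
    ) % (class_name, expression)
    namespace = {"Variable": Variable}
    exec(source, namespace)

    generated = namespace[class_name]
    _cache[expression] = generated
    return generated
```

Calling get_autogen_class twice with the same expression text returns the same class object, while a different expression gets a new class with the next number.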
Since expressions use standard Python syntax, they can be parsed using the
standard Python parser module, so no custom parser is needed. The parse
tree for the expression is analyzed and its dependencies are extracted to
generate the dependencies() method -- the user doesn't need to declare the
dependencies for an expression. The compute() method for the
autogenerated variable includes the user's expression directly as part of
the method. For this to work correctly, the method first executes
statements that set up its local environment so that all of the names in
the expression are properly bound. Here is an example. Suppose the input
expression is ln_bounded(urbansim.gridcell.population). Then the
automatically generated class will be:
class autogenvar034(Variable):
    def dependencies(self):
        return ['urbansim.gridcell.population']
    def name(self):
        return 'ln_bounded(urbansim.gridcell.population)'
    def compute(self, dataset_pool):
        urbansim = DummyName()
        urbansim.gridcell = DummyDataset(self, 'gridcell', dataset_pool)
        urbansim.gridcell.population = \
            self.get_dataset().get_attribute('population')
        return ln_bounded(urbansim.gridcell.population)
The name of the class is generated (there is a class variable autogen_number in the class AutogenVariableFactory that starts at 0 and gets incremented each time it's used in a new name).
The dependencies method is constructed by parsing the expression and finding all of the other variables that it references, and putting those into the returned list.
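The idea of finding referenced variables by walking the parse tree can be illustrated with a small sketch. Note that it uses the ast module rather than the parser module named above (parser was removed from the standard library in Python 3.10), so it is a substitute technique, not the Opus code. The names dotted_name, DependencyCollector and extract_dependencies are hypothetical. As a simplification, the sketch reports every dotted name it finds and does not treat unqualified attributes or method calls specially.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' if node is a pure chain of names, otherwise None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class DependencyCollector(ast.NodeVisitor):
    """Collect each outermost dotted name once, in order of appearance."""

    def __init__(self):
        self.found = []

    def visit_Attribute(self, node):
        name = dotted_name(node)
        if name is None:
            # Not a plain chain (for example f(x).y): look inside it instead.
            self.generic_visit(node)
        elif name not in self.found:
            # Do not descend further: inner nodes are only prefixes of name.
            self.found.append(name)


def extract_dependencies(expression):
    """Parse an expression string and list the dotted names it references."""
    collector = DependencyCollector()
    collector.visit(ast.parse(expression))
    return collector.found
```

For the example expression, extract_dependencies('ln_bounded(urbansim.gridcell.population)') yields a one-element list containing 'urbansim.gridcell.population', which matches the list returned by the generated dependencies method.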
The compute method ends with a return statement that simply returns the value of the expression. To make this work, we need to provide local bindings for names such as urbansim.gridcell.population. We bind a local variable (named urbansim in the example) to an instance of DummyName, whose sole purpose in life is to have an attribute gridcell (and possibly other attributes if there are multiple dependencies). Then urbansim.gridcell is bound to an instance of DummyDataset, which is used in place of a real dataset in the autogenerated code. We then add a population attribute to urbansim.gridcell, bound to the value of the appropriate dataset attribute. (We use the dummy dataset rather than adding attributes to the real dataset, since added attributes might interfere with other attributes or not be garbage-collected as soon as they might otherwise be.) For the get_attribute call that fetches the value of the population attribute, we use the short version of the name; its value should already have been computed by virtue of being listed in the dependencies() method.
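The binding trick can be tried in isolation with a few stand-in classes. This is a simplified sketch under stated assumptions: DummyName is shown as an empty class (its real definition is not given here), FakeDataset and compute_total are hypothetical, a plain DummyName takes the place of DummyDataset, and the built-in sum replaces ln_bounded so that the example runs without Opus or numpy.

```python
class DummyName:
    """Empty holder; attributes are attached to instances after creation."""


class FakeDataset:
    """Hypothetical stand-in for a real dataset, keyed by short attribute name."""

    def __init__(self, attributes):
        self._attributes = attributes

    def get_attribute(self, name):
        return self._attributes[name]


def compute_total(dataset):
    # Set up local names so the dotted expression below resolves
    # through ordinary Python attribute lookup.
    urbansim = DummyName()
    urbansim.gridcell = DummyName()  # the generated code uses DummyDataset here
    urbansim.gridcell.population = dataset.get_attribute('population')
    # The user's expression appears verbatim in the return statement.
    return sum(urbansim.gridcell.population)


gridcells = FakeDataset({'population': [10, 20, 30]})
total = compute_total(gridcells)
```

The dataset object itself gains no new attributes; the bindings exist only on the temporary holder objects and disappear when compute_total returns.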
If the expression includes an alias, for example
pop = ln_bounded(urbansim.gridcell.population), then the code is all
the same as above, except that the final return statement is replaced with
        pop = ln_bounded(urbansim.gridcell.population)
        return pop
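One way a code generator could tell the two cases apart is to check whether the input parses as an assignment to a single name. The sketch below is an illustration only: it uses the ast module, the function compute_tail is hypothetical, and the real generator may detect aliases differently.

```python
import ast


def compute_tail(expression):
    """Return the final source lines of compute() for the given input."""
    statement = ast.parse(expression).body[0]
    is_alias = (
        isinstance(statement, ast.Assign)
        and len(statement.targets) == 1
        and isinstance(statement.targets[0], ast.Name)
    )
    if is_alias:
        # Keep the assignment as written, then return the alias.
        alias = statement.targets[0].id
        return [expression, "return %s" % alias]
    # No alias: return the expression directly.
    return ["return %s" % expression]
```

With an alias the function returns two lines (the assignment and the return of the alias), and without one it returns a single return line.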
The aggregate, disaggregate, and number_of_agents methods are defined on DummyDataset, so that they can be used in expressions.