UrbanSim can be started via the ``Run Manager'' (see Section 4.1.2), which is controlled by a user-defined configuration. The following code shows a fully specified configuration that controls the behaviour of the run manager. Mandatory entries and default values for optional entries are marked in the comments. The values shown for the entries are examples only.
Note: we are in the process of splitting the following configuration information into separate parts. Once we are done with that refactoring, we will update the documentation to the new arrangement.
import os

from opus_core.configurations.database_configuration import DatabaseConfiguration
from opus_core.configurations.baseyear_cache_configuration \
    import BaseyearCacheConfiguration
from urbansim.configurations.creating_baseyear_cache_configuration \
    import CreatingBaseyearCacheConfiguration

run_configuration = {
    'model_system':'urbansim.model_coordinators.model_system', # mandatory
    'description':'baseline with travel model', # default: 'No description'
    'cache_directory':'d:/urbansim_cache/', # mandatory
    'creating_baseyear_cache_configuration': CreatingBaseyearCacheConfiguration(
        # default: 'urbansim_tmp'+random string
        cache_directory_root = 'd:/urbansim_cache',
        # mandatory
        cache_mysql_data = 'urbansim.model_coordinators.cache_mysql_data',
        cache_from_mysql = False, # default: True
        # mandatory if 'cache_from_mysql' is False
        baseyear_cache = BaseyearCacheConfiguration(
            # mandatory for this block
            existing_cache_to_copy = 'd:/urbansim_cache/run_397.2006_05_23_18_21',
            # default: all years in 'existing_cache_to_copy'
            years_to_cache = range(1996,2001)
            ),
        tables_to_cache = [ # default: []
            'gridcells',
            'households',
            'jobs',
            'zones'
            ],
        tables_to_cache_nchunks = { # default: each table defaults to 1
            'gridcells':2,
            },
        tables_to_copy_to_previous_years = { # default: no copied tables
            'development_type_groups':1996, # table name and year to put it in
            'development_types':1996,
            'development_type_group_definitions':1996,
            'urbansim_constants': 1996,
            },
        ),
    'input_configuration': DatabaseConfiguration( # mandatory
        host_name = os.environ['MYSQLHOSTNAME'], # mandatory
        user_name = 'urbansim', # mandatory
        password = os.environ['MYSQLPASSWORD'], # mandatory
        database_name = 'PSRC_2000_baseyear', # mandatory
        ),
    'output_configuration': DatabaseConfiguration( # default: No output
                                                   # configuration
        host_name = os.environ['MYSQLHOSTNAME'], # mandatory for this block
        user_name = 'urbansim', # mandatory for this block
        password = os.environ['MYSQLPASSWORD'], # mandatory for this block
        database_name = 'PSRC_2000_output', # mandatory for this block
        ),
    'base_year': 2000, # default: read from table
                       # 'base_year' in
                       # 'db_input_database'
    'years': (2001, 2030), # mandatory
    }
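Because several top-level entries are marked as mandatory in the comments, it can be useful to check a configuration before handing it to the run manager. The following is a minimal sketch of such a check. The helper `missing_mandatory_entries` and the tuple `MANDATORY_ENTRIES` are hypothetical and not part of Opus; the list of keys is taken only from the "mandatory" comments in the listing above.

```python
# Hypothetical helper, not part of Opus: lists the top-level entries that
# the comments in the run configuration mark as mandatory.
MANDATORY_ENTRIES = ('model_system', 'cache_directory',
                     'input_configuration', 'years')


def missing_mandatory_entries(configuration):
    """Return the mandatory top-level keys that are absent from the dict."""
    return [key for key in MANDATORY_ENTRIES if key not in configuration]


# A deliberately incomplete configuration, for illustration only.
partial_configuration = {
    'model_system': 'urbansim.model_coordinators.model_system',
    'years': (2001, 2030),
}

missing = missing_mandatory_entries(partial_configuration)
print(missing)
```

The check covers only the top level of the dictionary; mandatory arguments of the nested configuration objects would need their own checks.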
The 'model_system' entry is the full Opus path to the model system that
will be used by the run manager to run/estimate a set of models.
The 'cache_mysql_data' entry is the full Opus path to the class used to
create a baseyear cache from the baseyear data in the MySQL database. The
'urbansim.model_coordinators.cache_mysql_data' version both creates the
baseyear cache and unrolls the gridcell data to populate prior years with the
gridcell dataset (see Section 4.1.3).
Entry 'creating_baseyear_cache_configuration' contains the configuration
for creating the baseyear cache.
Entry 'cache_directory_root' is the root directory where data should be
cached during processing. The actual cache directory is created by adding
the run number and date-time string to this directory.
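As an illustration of how such a directory name could be composed, here is a minimal sketch. The function `make_cache_directory` is hypothetical, and the pattern `run_<number>.<YYYY_MM_DD_HH_MM>` is an assumption inferred from the example path 'run_397.2006_05_23_18_21' in the configuration; the run manager's actual naming code may differ.

```python
import os
import time


def make_cache_directory(cache_directory_root, run_number, when=None):
    """Build a cache directory path from the root, run number and time.

    Assumed pattern (inferred from the example path in the configuration):
    <root>/run_<number>.<YYYY_MM_DD_HH_MM>
    """
    if when is None:
        when = time.localtime()
    stamp = time.strftime('%Y_%m_%d_%H_%M', when)
    return os.path.join(cache_directory_root,
                        'run_%d.%s' % (run_number, stamp))


# Reproduce the directory name used in the example configuration.
example_time = time.strptime('2006-05-23 18:21', '%Y-%m-%d %H:%M')
example = make_cache_directory('d:/urbansim_cache', 397, example_time)
print(os.path.basename(example))
```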
The 'input_configuration' entry is a DatabaseConfiguration object
that specifies the MySQL database containing the base year
data.
Entry 'output_configuration' is a DatabaseConfiguration
object that specifies the MySQL database into which
to write any database tables related to the results of the
simulation run. Now that indicators are computed from the attribute
cache, the output database is only needed if you wish to use the SQL
indicators. Before starting the simulation, the run manager will
remove any tables in the output database, so be sure it doesn't
contain information you want to keep.
Entry 'years' determines the years for which the simulation should run,
given as a tuple with the first and last year to run.
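To make the meaning of the tuple concrete, the sketch below expands it into the individual simulation years. It assumes that both the first and the last year are included in the run, which is how the description above reads; the helper `years_to_run` is hypothetical and not part of Opus.

```python
def years_to_run(years):
    """Expand a (first_year, last_year) tuple into a list of years.

    Assumption: both ends of the tuple are included in the run.
    """
    first_year, last_year = years
    return list(range(first_year, last_year + 1))


simulated_years = years_to_run((2001, 2030))
print(simulated_years[0], simulated_years[-1], len(simulated_years))
```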
By default, the run manager caches all tables from the input database into the
binary baseyear cache, on which the simulation then runs. If
only selected tables should be cached, they can be listed in
'tables_to_cache'. Note that the simulation itself then no longer uses the
database: all data are retrieved from the baseyear cache
and written to the simulation cache. This means that if the
entry 'tables_to_cache' is used, the user must ensure that it contains
all tables that are used by the simulation.
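One way to guard against a missing table is to compare 'tables_to_cache' with the tables you know your models read. The following is a minimal sketch of that comparison. The helper `uncached_tables` and the list `tables_needed` are hypothetical; Opus does not provide the list of required tables in this form, so it has to be compiled by the user.

```python
def uncached_tables(tables_to_cache, tables_needed):
    """Return the needed tables that are missing from tables_to_cache."""
    return sorted(set(tables_needed) - set(tables_to_cache))


# Entry from the example configuration.
tables_to_cache = ['gridcells', 'households', 'jobs', 'zones']

# Hypothetical list of tables the simulation reads, for illustration only.
tables_needed = ['gridcells', 'households', 'jobs', 'zones',
                 'urbansim_constants']

missing_tables = uncached_tables(tables_to_cache, tables_needed)
print(missing_tables)
```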
If a database table is so large that Python runs out of memory when copying it
to cache, you can reduce memory usage (at the cost of a longer caching time) by
increasing the number of ``chunks'' in which the dataset's
attributes are read from the table. By
default, all attributes of a table are read in a single chunk. Setting an
entry for a table in 'tables_to_cache_nchunks' tells the caching code
to use that many chunks for that table. For instance, if a dataset has 11
attributes, setting its entry in 'tables_to_cache_nchunks' to 3 will use three
chunks that load 4, 4, and 3 attributes, respectively.
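The arithmetic of the example can be sketched as follows. This is only an illustration of splitting attributes into nearly equal chunks; the function `split_into_chunks` and the attribute names are hypothetical, and the actual Opus caching code may assign attributes to chunks differently.

```python
def split_into_chunks(attributes, nchunks):
    """Split a list into nchunks consecutive, nearly equal parts.

    The first (len(attributes) % nchunks) chunks get one extra item.
    """
    base, extra = divmod(len(attributes), nchunks)
    chunks = []
    start = 0
    for i in range(nchunks):
        size = base + (1 if i < extra else 0)
        chunks.append(attributes[start:start + size])
        start += size
    return chunks


# Eleven placeholder attribute names, read in three chunks.
attributes = ['attribute_%d' % i for i in range(11)]
chunk_sizes = [len(chunk) for chunk in split_into_chunks(attributes, 3)]
print(chunk_sizes)
```

Only one chunk of attributes needs to be held in memory at a time, which is why more chunks lower peak memory use while adding more reads from the table.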
For big tables, the caching process can be very time-consuming. Often
the baseyear cache is already available from a previous run. In that case,
one can set the entry 'cache_from_mysql' to False and define the
'baseyear_cache' block. The directory with the already cached data
should be given in the entry 'existing_cache_to_copy'. The run manager then
copies data from that directory into the baseyear cache for this run. If you
want to copy only selected years, they can be specified in the entry
'years_to_cache' as a list of those years; by default all years are
copied. Note that this behaviour can alternatively be controlled directly from
the command line (see the description of start_run in Section 4.1.2),
which has priority over entries in this configuration.
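The copying step can be pictured with the sketch below. It assumes that the existing cache holds one subdirectory per year, named by the year, and it is not the run manager's actual implementation; the function `copy_cached_years` and its parameter `new_cache_directory` are hypothetical.

```python
import os
import shutil


def copy_cached_years(existing_cache_to_copy, new_cache_directory,
                      years_to_cache=None):
    """Copy year directories from an existing cache into a new one.

    Assumption: the cache contains one subdirectory per year. If
    years_to_cache is None, every year directory found is copied.
    Returns the list of years that were actually copied.
    """
    if years_to_cache is None:
        years_to_cache = sorted(
            int(name) for name in os.listdir(existing_cache_to_copy)
            if name.isdigit()
            and os.path.isdir(os.path.join(existing_cache_to_copy, name)))
    copied = []
    for year in years_to_cache:
        source = os.path.join(existing_cache_to_copy, str(year))
        if os.path.isdir(source):
            # copytree creates the destination, including parent directories.
            shutil.copytree(source,
                            os.path.join(new_cache_directory, str(year)))
            copied.append(year)
    return copied
```

For example, `copy_cached_years(old_cache, new_cache, range(1996, 2001))` would copy the directories for 1996 through 2000 that exist in `old_cache`, where both directory names are placeholders.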
The 'tables_to_copy_to_previous_years' entry is used when a
lag variable needs to compute data for years before the base year, and that
computation requires some of the "invariant" data that was copied from the
baseyear database into the baseyear cache. If this is the case, add those
database tables to the configuration's
'tables_to_copy_to_previous_years' entry, and indicate the year to which
to copy each table. In general, it is safe to copy the tables to the earliest
year created by the unroll gridcell process. You can determine what this year
is by examining the year directories created in your baseyear cache.
(Note: we plan to change this to a better design.)
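Finding the earliest year by inspecting the year directories can be automated along these lines. This is a minimal sketch that assumes the year directories are named by the year alone; the helpers `earliest_cached_year` and `copy_to_earliest_year` are hypothetical and not part of Opus.

```python
import os


def earliest_cached_year(baseyear_cache_directory):
    """Return the smallest year that has a directory in the cache.

    Assumption: year directories are named with digits only, e.g. '1996'.
    """
    years = [int(name) for name in os.listdir(baseyear_cache_directory)
             if name.isdigit()
             and os.path.isdir(os.path.join(baseyear_cache_directory, name))]
    if not years:
        raise ValueError('no year directories found in %s'
                         % baseyear_cache_directory)
    return min(years)


def copy_to_earliest_year(table_names, baseyear_cache_directory):
    """Map each table name to the earliest cached year."""
    earliest = earliest_cached_year(baseyear_cache_directory)
    return dict((name, earliest) for name in table_names)
```

The dictionary returned by `copy_to_earliest_year` has the same shape as the 'tables_to_copy_to_previous_years' entry in the example configuration, mapping table names to a year.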
There are several run manager configurations in Opus. See, for example, the directory psrc/configs for the configurations of different PSRC runs.