

Run Manager Configuration

UrbanSim can be started via the "Run Manager" (see Section 4.1.2), which is controlled by a user-defined configuration. The following code contains a fully specified configuration that influences the behaviour of the run manager. Mandatory entries and default values for optional entries are marked in the comments. The actual values for the listed entries are only examples.

Note: we are in the process of splitting the following configuration information into separate parts. Once we are done with that refactoring, we will update the documentation to the new arrangement.

import os

from opus_core.configurations.database_configuration import DatabaseConfiguration
from opus_core.configurations.baseyear_cache_configuration \
    import BaseyearCacheConfiguration
    
from urbansim.configurations.creating_baseyear_cache_configuration \
    import CreatingBaseyearCacheConfiguration


run_configuration = {
    'model_system':'urbansim.model_coordinators.model_system', # mandatory
    'description':'baseline with travel model',      # default: 'No description'
    'cache_directory':'d:/urbansim_cache/',          # mandatory
    'creating_baseyear_cache_configuration': CreatingBaseyearCacheConfiguration(
        # default: 'urbansim_tmp'+random string
        cache_directory_root = 'd:/urbansim_cache',

        # mandatory
        cache_mysql_data = 'urbansim.model_coordinators.cache_mysql_data',

        cache_from_mysql = False, # default: True

        # mandatory if 'cache_from_mysql' is False
        baseyear_cache = BaseyearCacheConfiguration(
            # mandatory for this block
            existing_cache_to_copy = 'd:/urbansim_cache/run_397.2006_05_23_18_21',
            
            # default: all years in 'existing_cache_to_copy'
            years_to_cache = range(1996,2001)
            ),

        tables_to_cache = [ # default: []
            'gridcells',
            'households',
            'jobs',
            'zones'
            ],

        tables_to_cache_nchunks = { # default: each table defaults to 1
            'gridcells':2,
            },

        tables_to_copy_to_previous_years = { # default: no copied tables
            'development_type_groups':1996, # table name and year to put it in
            'development_types':1996,
            'development_type_group_definitions':1996,
            'urbansim_constants': 1996,
            },
        ),

    'input_configuration': DatabaseConfiguration(    # mandatory
        host_name = os.environ['MYSQLHOSTNAME'],     # mandatory
        user_name = 'urbansim',                      # mandatory
        password = os.environ['MYSQLPASSWORD'],      # mandatory
        database_name = 'PSRC_2000_baseyear',        # mandatory
        ),
    'output_configuration': DatabaseConfiguration(   # default: No output
                                                     #     configuration
        host_name = os.environ['MYSQLHOSTNAME'],     # mandatory for this block
        user_name = 'urbansim',                      # mandatory for this block
        password = os.environ['MYSQLPASSWORD'],      # mandatory for this block
        database_name = 'PSRC_2000_output',          # mandatory for this block
        ),
    'base_year': 2000,                               # default: read from table
                                                     #     'base_year' in
                                                     #     'db_input_database'
    'years': (2001, 2030),                           # mandatory
}
The 'model_system' entry is the full Opus path to the model system that will be used by the run manager to run/estimate a set of models.

The 'cache_mysql_data' entry is the full Opus path to the class used to create a baseyear cache from the baseyear data in the MySQL database. The 'urbansim.model_coordinators.cache_mysql_data' version both creates the baseyear cache and unrolls the gridcell data to populate prior years with the gridcell dataset (see Section 4.1.3).

Entry 'creating_baseyear_cache_configuration' contains the configuration for creating the baseyear cache.

Entry 'cache_directory_root' is the root directory where data should be cached during processing. The actual cache directory is created by adding the run number and date-time string to this directory.
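As an illustration of that naming scheme, the sketch below builds a run-specific cache directory from a root directory. The helper name make_cache_directory is hypothetical, and the 'run_<number>.<YYYY_MM_DD_HH_MM>' format is an assumption inferred from the example path 'd:/urbansim_cache/run_397.2006_05_23_18_21' in the listing above; the run manager's own code may differ.

```python
import os
from datetime import datetime


def make_cache_directory(cache_directory_root, run_number, when=None):
    """Hypothetical helper: append 'run_<number>.<date-time>' to the root."""
    when = when or datetime.now()
    # Date-time format assumed from the example path in the listing above.
    stamp = when.strftime('%Y_%m_%d_%H_%M')
    return os.path.join(cache_directory_root,
                        'run_%d.%s' % (run_number, stamp))


# Reproduce the directory name used in the example configuration.
example = make_cache_directory('d:/urbansim_cache', 397,
                               datetime(2006, 5, 23, 18, 21))
```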

The 'input_configuration' is a DatabaseConfiguration object that determines the MySQL database with the base year data.

Entry 'output_configuration' is a DatabaseConfiguration object that determines the MySQL database into which to write any database tables related to the results of the simulation run. Now that indicators are computed from the attribute cache, the output database is only needed if you wish to use the SQL indicators. Before starting the simulation, the run manager will remove any tables in the output database, so be sure it doesn't contain information you want to keep.

Entry 'years' determines the years for which the simulation should run. It is given as a tuple containing the first and the last year to run.
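To make the meaning of the tuple concrete, the following minimal sketch expands it into the individual simulated years. It assumes, as the description of "first and last year to run" suggests, that both endpoints are included; the variable names are illustrative only.

```python
# Same example value as in the listing above.
run_configuration = {'years': (2001, 2030)}

first_year, last_year = run_configuration['years']
# Both endpoints are simulated, so the upper bound is last_year + 1.
simulated_years = list(range(first_year, last_year + 1))
```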

By default, the run manager caches all tables from the input database into the binary baseyear cache, on which the simulation then runs. If only selected tables should be cached, they can be listed in 'tables_to_cache'. Note that the simulation itself then no longer uses the database: all data are retrieved from the baseyear cache and written to the simulation cache. This means that if the entry 'tables_to_cache' is used, the user must ensure that it contains all tables that are used by the simulation.
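A quick check along these lines can catch a forgotten table before a long run starts. The sketch below is not part of Opus: the helper missing_tables and the list required_tables are hypothetical, and you would have to compile the list of tables your models read yourself.

```python
def missing_tables(tables_to_cache, required_tables):
    """Return the required tables that 'tables_to_cache' would leave out."""
    if not tables_to_cache:
        # An empty list is the default, meaning all tables are cached.
        return []
    return sorted(set(required_tables) - set(tables_to_cache))


tables_to_cache = ['gridcells', 'households', 'jobs', 'zones']
# Hypothetical list of tables that the configured models read.
required_tables = ['gridcells', 'households', 'jobs', 'zones',
                   'urbansim_constants']

missing = missing_tables(tables_to_cache, required_tables)
```

Here missing names 'urbansim_constants', which would have to be added to 'tables_to_cache' before running.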

If a database table is so large that Python runs out of memory when copying it to cache, you can reduce memory usage (but increase the time it takes to cache the data) by increasing the number of "chunks" in which the dataset's attributes are read from the table. By default, all attributes of a table are read in a single chunk. Setting the 'tables_to_cache_nchunks' entry for a table tells the caching code to use that many chunks. For instance, if a dataset has 11 attributes, setting its 'tables_to_cache_nchunks' value to 3 will use three chunks, loading 4, 4, and 3 attributes respectively.
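The arithmetic of the example can be reproduced with a short sketch. The function split_into_chunks is a hypothetical stand-in, not the Opus caching code; it only illustrates one way of dividing attributes so that chunk sizes differ by at most one.

```python
def split_into_chunks(attributes, nchunks):
    """Split a list of attribute names into nchunks nearly equal parts."""
    base, extra = divmod(len(attributes), nchunks)
    chunks, start = [], 0
    for i in range(nchunks):
        # The first 'extra' chunks receive one additional attribute.
        size = base + (1 if i < extra else 0)
        chunks.append(attributes[start:start + size])
        start += size
    return chunks


# Eleven placeholder attribute names, read in three chunks.
attributes = ['attr_%d' % i for i in range(11)]
chunk_sizes = [len(chunk) for chunk in split_into_chunks(attributes, 3)]
# chunk_sizes == [4, 4, 3]
```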

For big tables, caching can be very time-consuming. Often the baseyear cache is already available from a previous run. In that case, set the entry 'cache_from_mysql' to False and define the 'baseyear_cache' block. Put the directory with the already cached data into the entry 'existing_cache_to_copy'. The run manager then copies data from that directory into the baseyear cache for this run. To copy only selected years, specify them in the entry 'years_to_cache' as a list of years; by default all years are copied. Note that this behaviour can alternatively be controlled directly from the command line (see the description of start_run in Section 4.1.2), which takes priority over the entries in this configuration.
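The year selection can be pictured with the following sketch. It assumes that an existing cache contains one subdirectory per year, named after that year; the helper years_to_copy is hypothetical and does not reproduce the run manager's implementation.

```python
import os
import tempfile


def years_to_copy(existing_cache, years_to_cache=None):
    """List the year directories to copy; all of them by default."""
    available = sorted(
        int(name) for name in os.listdir(existing_cache)
        if name.isdigit()
        and os.path.isdir(os.path.join(existing_cache, name)))
    if years_to_cache is None:
        return available
    wanted = set(years_to_cache)
    return [year for year in available if year in wanted]


# Throwaway directory standing in for 'existing_cache_to_copy'.
demo_cache = tempfile.mkdtemp()
for year in (1995, 1996, 1997, 2000):
    os.mkdir(os.path.join(demo_cache, str(year)))

selected = years_to_copy(demo_cache, range(1996, 2001))
everything = years_to_copy(demo_cache)
```

With the demonstration directories above, selected holds only the years that are both requested and present, while everything holds all four years.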

The 'tables_to_copy_to_previous_years' entry is used when a lag variable needs to compute data for years before the base year, and that computation requires some of the "invariant" data that was copied from the baseyear database into the baseyear cache. In that case, add those database tables to the configuration's 'tables_to_copy_to_previous_years' entry, and indicate the year to which each table should be copied. In general, it is safe to copy the tables to the earliest year created by the unroll gridcell process. You can determine this year by examining the year directories created in your baseyear cache. (Note: we plan to change this to a better design.)
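Following that rule of thumb, the entry can be derived from the years found in the baseyear cache. In this sketch the helper copy_to_earliest_year and the list cache_years are hypothetical; in practice you would read the years from the year directories of your own cache.

```python
def copy_to_earliest_year(invariant_tables, cache_years):
    """Map every invariant table to the earliest year in the cache."""
    earliest = min(cache_years)
    return dict((table, earliest) for table in invariant_tables)


# Hypothetical years found as directories in the baseyear cache.
cache_years = [1996, 1997, 1998, 1999, 2000]

tables_to_copy_to_previous_years = copy_to_earliest_year(
    ['development_type_groups',
     'development_types',
     'development_type_group_definitions',
     'urbansim_constants'],
    cache_years)
```

The resulting dictionary has the same shape as the 'tables_to_copy_to_previous_years' entry in the listing above, with every table assigned to 1996.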

There are several run manager configurations in Opus. See, for example, the directory psrc/configs for the configurations of different PSRC runs.


info (at) urbansim.org