How to set up Cube Cluster for distributed processing on a network of computers?

Cube Cluster can be used to distribute model steps across multiple processing cores. These cores can be located on the same computer or on a network of computers. This post documents the steps required to set up Cluster across a network of computers. It assumes that your model steps are already set up to be distributed using Cluster. If not, review the model and add DISTRIBUTE statements at the appropriate locations/steps in the model flow. For more information, please refer to http://community.citilabs.com/t/how-to-set-up-distributemultistep-and-distributeintrastep/344

Step 1: Set up a shared drive

Set up a shared drive that can be accessed from all the computers in the network. This is the location from which models should be run. Typically, the folder on the main computer where the model run will be started is set up as the shared drive. The drive letter for the shared drive must be the same on all computers. All file location references in the model script will point to the shared drive path, and the processing cores on the networked computers will read and write data at this location.

e.g. T:\ModelRuns
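
Before starting any nodes, it can help to confirm on each computer that the shared folder is visible and writable under the same path. The snippet below is only a minimal sketch of such a check and is not part of Cube; the function name `check_shared_folder` is hypothetical, and `T:\ModelRuns` is just the example path used above.

```python
import os
import tempfile


def check_shared_folder(path):
    """Return True if `path` is an existing folder in which a file can be created."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating (and automatically deleting) a temporary file proves write access.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


# Example path from this post; replace it with your own shared drive location.
shared = r"T:\ModelRuns"
status = "OK" if check_shared_folder(shared) else "not accessible or not writable"
print(f"{shared}: {status}")
```

Running this on every computer in the network should report the same result everywhere if the drive letter has been mapped consistently.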

Step 2: Start Cluster nodes

Start Cluster nodes manually on every computer using the Cluster node management tool. When multiple machines are used, Cluster nodes cannot be started from Voyager script; they must be started with the Cluster node management tool on each computer. Decide how many Cluster nodes you would like to run on each computer and assign a unique process number list to each computer. In this example, the user has 3 computers with 16, 8 and 6 processing cores, and each computer is given its own process list. The process lists should be sequential and non-overlapping across the networked computers.
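
The process list assignment described above can be sketched in a few lines of Python. This is an illustration only: it assumes one Cluster node per processing core (you may prefer fewer nodes than cores), the function name `assign_process_lists` is hypothetical, and the `first-last` notation in the printout is just a readable way to show each range, not necessarily the format the node management tool expects.

```python
def assign_process_lists(nodes_per_computer):
    """Give each computer a consecutive, non-overlapping block of process numbers.

    Returns a list of (first, last) tuples, one per computer, numbered from 1.
    """
    lists = []
    next_number = 1
    for count in nodes_per_computer:
        if count < 1:
            raise ValueError("each computer needs at least one node")
        lists.append((next_number, next_number + count - 1))
        next_number += count
    return lists


# The example from this post: 3 computers with 16, 8 and 6 processing cores.
for computer, (first, last) in enumerate(assign_process_lists([16, 8, 6]), start=1):
    print(f"Computer {computer}: {first}-{last}")
# Computer 1: 1-16
# Computer 2: 17-24
# Computer 3: 25-30
```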

In the Cluster node management tool,

(i) Navigate to the model application folder in the shared drive location and enter the Cluster process ID used in the model. The model application folder is the location where the main application in your model is saved. This is the working folder for the model run, where all Cluster node communications occur. If your model uses the COMMPATH keyword in DISTRIBUTE statements to set a folder location outside the model application folder, then the Cluster nodes should be started in the COMMPATH folder instead.

(ii) Enter the process list identified for that computer and start nodes.

(iii) Repeat steps (i) and (ii) on all computers with the appropriate process list.

For more information on the Cluster node management tool, please refer to http://community.citilabs.com/t/how-to-start-cube-cluster-slave-nodes/362/1

Step 3: Open model

On the main computer, open the model catalog from the shared drive. This ensures that the model script references the shared drive location.

Step 4: Update Cluster keys

Update any keys related to Cluster settings. Most models have an input key that lets the user set the total number of processing cores to be used for a run. Update this key value to match the process lists identified in Step 2. For example, if the maximum number in the process lists is 30, set the scenario key value to 30.
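
As a cross-check between Step 2 and Step 4, the key value can be derived from the process lists rather than typed by hand. The sketch below is an assumption-laden illustration, not a Cube feature: `cluster_key_value` is a hypothetical helper, each process list is represented as a `(first, last)` tuple, and "sequential" is interpreted here as meaning the lists follow one another without gaps.

```python
def cluster_key_value(process_lists):
    """Return the highest process number after validating the (first, last) lists.

    Raises ValueError if a range is reversed, or if ranges overlap or leave a gap.
    """
    if not process_lists:
        raise ValueError("no process lists given")
    ordered = sorted(process_lists)
    for first, last in ordered:
        if first > last:
            raise ValueError(f"invalid range {first}-{last}")
    # Compare each range with the one that follows it.
    for (_, prev_last), (first, _) in zip(ordered, ordered[1:]):
        if first <= prev_last:
            raise ValueError(f"process number {first} overlaps an earlier list")
        if first != prev_last + 1:
            raise ValueError(f"gap between process numbers {prev_last} and {first}")
    return ordered[-1][1]


# Process lists for three computers running 16, 8 and 6 nodes.
print(cluster_key_value([(1, 16), (17, 24), (25, 30)]))  # prints 30
```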

Step 5: Start model run

Start the model run from the Scenario Manager on the main computer.

Step 6: Close Cluster nodes

After the model run is done, the Cluster nodes can be closed on each individual computer using the Cluster node management tool.
