The Lrun procedure

08-Jan-2009


Specification

The initial requirement is to produce a defined interface to the local job submission procedures, so that the Toric test cases can be run automatically following the build process. (These jobs generally take too long to run from the command prompt.) In more detail, we require a procedure which

Such a procedure would also be useful in other cases, if it could replace the usual job-specific scripts, which are edited by hand for each run.

Job Submission

The prototype system described here was developed to submit jobs to the JET LoadLeveler system. This requires two scripts:

  1. A job description, which is submitted to the job scheduler, and
  2. A (Unix shell) script, which is a wrapper for the binary executable. This script also contains the MPI commands needed to start parallel processes.

The Lrun procedure prepares these two scripts and inserts parameters as needed. The parameters are set by the script or makefile which submits the job, and may refer to (shell) environment variables. The parameters may also request user input, e.g. to define the number of processors to be requested for a parallel job.

Implementation

A prototype Lrun procedure has been written in Perl for test purposes. This is called with two arguments -

An example script, which runs the toric_main code distributed with Transp, is described here.

Evaluation of this and earlier prototypes showed that the job description file submitted to the batch system was almost standard, and could be prepared using symbol/value substitution alone, whereas the shell wrapper scripts for different codes tended to require different features. Consequently the skeleton file for the job description is stored in the Lrun module, while the wrapper file skeleton is provided as an argument.

Variables

More important than the actual implementation is the definition of an agreed set of variables, particularly for the job description file, so that programs using Lrun are portable between sites. The skeleton job description for a JET batch job is
 

__DATA__                    #  ll script template
# @ executable     = &LLSCR&
# @ input          = &STDIN&
# @ output         = &STDOUT&
# @ error          = &STDERR&
# @ initialdir     = &RUNDIR&
# @ notify_user    = &USER&
# @ notification   = complete
--P jobtype        = openmpi
--P max_processors = &NPMAX&
--P min_processors = &NPMIN&
# @ queue
_END_

Variables are delimited by the "&" character in the skeleton scripts. Lines starting with "--P" are converted to script lines for parallel jobs, and skipped for serial jobs.
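The substitution rule can be illustrated with a minimal Perl sketch. This is not the Lrun source: the function name fill_skeleton is hypothetical, and the conversion of a "--P" line into a "# @" directive for parallel jobs is an assumption about what "converted to script lines" means.

```perl
use strict;
use warnings;

# Hypothetical helper (not the actual Lrun code): fill in a skeleton.
#   $skeleton - template text containing &NAME& markers
#   $vars     - hash reference mapping NAME to its value
sub fill_skeleton {
    my ($skeleton, $vars) = @_;
    my $parallel = ($vars->{MPI} // 'N') eq 'Y';
    my @out;
    for my $line (split /\n/, $skeleton) {
        if ($line =~ s/^--P\s+//) {
            next unless $parallel;       # serial job: skip the line
            $line = '# @ ' . $line;      # assumed form of the converted line
        }
        # Replace each &NAME& with its value; unknown names are left as-is.
        $line =~ s/&(\w+)&/exists $vars->{$1} ? $vars->{$1} : "&$1&"/ge;
        push @out, $line;
    }
    return join("\n", @out) . "\n";
}

# Example: a three-line skeleton filled in for a parallel job.
my $demo = join("\n",
    '# @ output         = &STDOUT&',
    '--P min_processors = &NPMIN&',
    '# @ queue') . "\n";
print fill_skeleton($demo, { MPI => 'Y', STDOUT => 'run.out', NPMIN => 4 });
```

With MPI set to "N" the "--P" line would be dropped and only the two "# @" lines would remain.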

Table 1 lists the variables used in Lrun, and additional variables which may be used in the wrapper scripts:

Variable  Description                          Notes
ARGS      Arguments for the executable
EXE       Executable binary
EXEDIR    Directory containing EXE
INIT      Initialisation file                  Defaults to /dev/null
LLSCR     Wrapper script to be run             This script is generated by Lrun
MPI       MPI flag {Y|N}
NPMIN     Minimum # processors required        MPI only
NPMAX     Maximum # processors required        MPI only
PID       (Fairly) unique process ID           This is the ID of the Perl process which calls Lrun
RUNDIR    Directory where the batch job runs   LLSCR file is written to this directory
STDIN     Input file
STDOUT    Output file
STDERR    Error output file
USER      Account to notify on job completion

MPI

Submitting parallel jobs requires additional lines in the job description file (to control processor allocation). The Toric submission script Runone.pl initiates a parallel run with the following hash definitions:

if ($mpi) {
    # Parallel run: the leading "?" makes Lrun prompt for the value
    $hash{MPI}   = "Y";
    $hash{NPMIN} = "?NPMIN";
    $hash{NPMAX} = "?NPMAX";
} else {
    # Serial run
    $hash{MPI}   = "N";
}

The initial "?" in the NPMIN and NPMAX value fields causes Lrun to prompt the user to enter values for these parameters.
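A minimal sketch of how such "?" values might be resolved is given below. The function name resolve_prompts, the prompt wording, and the optional filehandle arguments are assumptions made for illustration; the actual Lrun code may behave differently.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every value that starts with "?" by a
# value read from the user. The text after the "?" is used as the
# prompt label. $in and $out default to STDIN and STDERR.
sub resolve_prompts {
    my ($hash, $in, $out) = @_;
    $in  //= \*STDIN;
    $out //= \*STDERR;
    for my $key (sort keys %$hash) {
        my $value = $hash->{$key};
        next unless defined $value && $value =~ /^\?(.*)$/;
        my $label = length($1) ? $1 : $key;
        print {$out} "Enter value for $label: ";
        my $answer = <$in>;
        die "No input for $key\n" unless defined $answer;
        chomp $answer;
        $hash->{$key} = $answer;
    }
    return $hash;
}
```

Called as resolve_prompts(\%hash) after the definitions above, this would ask for NPMAX and then NPMIN (keys are visited in sorted order) and leave MPI unchanged.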

Features

Lrun provides these facilities, in addition to the basic script editing operation:

Arguments
    Arguments of the form Symbol=Value are added to the Options hash, overriding any existing definitions.
Environment variables
    Strings of the form $VARIABLE in the Value field are interpreted as environment variables. The variable name is terminated by the first non-alphabetic character, or by the end of the line. [ Escapes, quotes etc. are not supported by this version ]
Process ID
    The string $$ is replaced by the process ID of the process which generates the scripts, and is equivalent to the &PID& variable.
User Input
    The user is prompted to define values for variables where the value commences with a "?".
Job Submission
    Lrun::go() returns the path to the job description file; Lrun::submit($file) submits the file for execution.
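The argument and environment variable rules can be sketched in Perl as follows. The names merge_args and expand_value are hypothetical and do not come from the Lrun module; the optional second argument of expand_value is an assumption added so that the sketch can be tried without touching the real environment.

```perl
use strict;
use warnings;

# Hypothetical helper: merge "Symbol=Value" arguments into an options
# hash, overriding any existing definitions.
sub merge_args {
    my ($opts, @args) = @_;
    for my $arg (@args) {
        my ($sym, $val) = split /=/, $arg, 2;   # split on the first "=" only
        next unless defined $val;               # ignore arguments without "="
        $opts->{$sym} = $val;
    }
    return $opts;
}

# Hypothetical helper: expand "$$" to the process ID and "$NAME" to the
# value of an environment variable. As in the description, a name ends
# at the first non-alphabetic character; escapes and quotes are ignored.
sub expand_value {
    my ($value, $env) = @_;
    $env //= \%ENV;
    my $pid = $$;
    $value =~ s{\$(\$|[A-Za-z]+)}{
        $1 eq '$'
            ? $pid
            : ($env->{$1} // die "Environment variable $1 is not defined\n")
    }ge;
    return $value;
}
```

For example, expand_value('$HOME/run.$$') would yield the home directory followed by "/run." and the current process ID, and merge_args(\%hash, @ARGV) would let command-line settings such as RUNDIR=/tmp/test override earlier definitions.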

Toric Example

A script which uses these features (to submit the Toric test cases) is described here.

Future development

Error checking - Lrun checks that any environment variables referenced are defined, and will create RUNDIR if it does not exist (to support creation of a new directory for each run).