POD Job Scheduler


Scheduler Commands

POD MT1 uses the PBS TORQUE scheduler for job submission to the computational cluster. The following commands are available to all users.

qsub  - submit a job to the scheduler
qstat - check the status of queued and running jobs
qdel  - delete (cancel) a job
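
For example, a typical workflow looks like the following (the job script name and job ID are illustrative):

[penguin@podmt1 ~]$ qsub job.sub
123456.pod
[penguin@podmt1 ~]$ qstat 123456.pod
[penguin@podmt1 ~]$ qdel 123456.pod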


Choosing your Queue

You must choose a scheduler queue for each compute job, as different queues provide different compute node types. The following queues are available on POD.

All nodes have QDR Infiniband interconnects for MPI as well as 10GbE data networks.

Queue Name   CPU                                                  Cores Per Node   RAM Per Node   Restrictions
T30          Intel 2.6GHz Haswell                                 20               128GB          None
H30          Intel 2.6GHz Sandy Bridge                            16               64GB           None
H30G         Intel 2.6GHz Sandy Bridge and dual NVIDIA K40 GPUs   16               64GB           None
M40          Intel 2.9GHz Westmere                                12               48GB           None
FREE         Intel 2.9GHz Westmere                                12               48GB           24 Cores for 5 Minutes

Please note that submitting to the FREE queue with a resource request larger than these restrictions will result in a job that never runs.
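
For example, a request that stays within the FREE queue restrictions (up to 24 cores, shown here as two 12-core nodes, for up to 5 minutes) would look like:

#PBS -q FREE
#PBS -l nodes=2:ppn=12
#PBS -l walltime=00:05:00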

For pricing details, please see the pricing page.


Required Options

The following options are required when submitting a job to POD.

Queue Name

The queue name determines your hardware and pricing.

#PBS -q FREE

Core Count

The syntax is nodes=X:ppn=Y, where X is the number of nodes and Y is the number of cores per node. The number of cores per node depends on the queue (see the table above); for example, M40 and FREE nodes have 12 cores each. Unless your application needs more than 4GB of RAM per core, you should request every core on each node (ppn=12 on a 12-core node). For example, a 96-core job on M40 should be specified as...

#PBS -l nodes=8:ppn=12

...where 8 nodes x 12 cores = 96 cores. Each core gets 4GB of dedicated RAM.

Walltime

Specified as Hours:Minutes:Seconds

#PBS -l walltime=00:05:00
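
Putting the three required options together, a minimal job script might look like the sketch below. Here my_mpi_app is a placeholder for your own executable, and the sketch assumes an MPI library that is integrated with the scheduler, so mpirun picks up the allocated nodes and cores automatically; adjust the queue, node count, and walltime to your job.

#!/bin/bash
#PBS -q M40
#PBS -l nodes=2:ppn=12
#PBS -l walltime=04:00:00

# Start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# Launch the application across the allocated cores
mpirun ./my_mpi_app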


Recommended Options

Shell Environment

#PBS -S /bin/bash

Job Name

#PBS -N My_Job_Name

Script Output

You can manage your job's STDOUT and STDERR using -j, -o, and/or -e. For instance, this merges your STDOUT and STDERR:

#PBS -j oe

Alternatively, use -o and -e to name your STDOUT and STDERR files:

#PBS -o mystdout.txt
#PBS -e mystderr.txt
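
Taken together with the required options, a job script header using these recommended options might look like this (My_Job_Name is a placeholder):

#PBS -q M40
#PBS -l nodes=1:ppn=12
#PBS -l walltime=01:00:00
#PBS -S /bin/bash
#PBS -N My_Job_Name
#PBS -j oe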

Project Accounting Options

In order to allow the POD Portal to provide CSV reports per account (or project name), you will need to submit jobs with the -A <account name> flag. The account string tags a job with a user-defined string that identifies a project, department, or account. This flag has no bearing on scheduling and is not associated with UNIX accounts on POD. Strings passed to -A that contain spaces must be enclosed in quotes. Examples are below:

[penguin@podmt1 ~]$ qsub -A Project_X job.sub
[penguin@podmt1 ~]$ qsub -A "Department A" job.sub
[penguin@podmt1 ~]$ qsub -A BillingAccount-001 job.sub
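
The same account string can also be set inside the job script itself:

#PBS -A Project_X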

Scheduler Syntax Details

When submitting a job, scheduler options can be passed as command-line arguments to qsub or as #PBS lines inside the job script. For example, using...

qsub -q M40 -l walltime=24:00:00,nodes=5:ppn=12 <job script>

...is the same as specifying the following inside your job script:

#PBS -q M40
#PBS -l walltime=24:00:00,nodes=5:ppn=12

Scheduler Job Arrays

For jobs where the same executable is going to be run over and over again, with only a change in input parameters, it can be much more efficient to use job arrays. Job arrays allow you to write and submit one submission script, and have that script launch multiple jobs. The arguments for each job can be changed based on the array ID.

Job arrays can be specified either as a CLI option or in the submission script. The following examples specify the array options in the submission script.

The #PBS -t directive tells the scheduler that you are running a job array. Specify a range of job array numbers that corresponds to how many copies of the job you want to run.

#PBS -t 0-3

This would start four copies of the job, numbered 0, 1, 2, and 3. Within the job script, the environment variable $PBS_ARRAYID will be set to an integer representing the array ID. A complete, simple example:

bash$ cat testarrays.sub
#PBS -S /bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe
#PBS -N array
#PBS -t 0-3

# Sleep times, indexed by each sub-job's array ID
times=( 0 10 20 30 )

echo $PBS_ARRAYID
echo Sleeping for ${times[$PBS_ARRAYID]}
sleep ${times[$PBS_ARRAYID]}

In this example, $PBS_ARRAYID is used to index into the times array, allowing the sleep time to be set differently for each job in the array. The output from this job would look like this:

bash$ qsub testarrays.sub 
195610[].pod

bash$ qstat -t
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----         
195610[0].pod             array-0          test            00:00:00 C batch          
195610[1].pod             array-1          test            00:00:00 R batch          
195610[2].pod             array-2          test                   0 R batch          
195610[3].pod             array-3          test                   0 R batch 

bash$ cat array.o195610-*
0
Sleeping for 0
1
Sleeping for 10
2
Sleeping for 20
3
Sleeping for 30

In the example above, the qstat -t command was used to see the individual elements of the array. Without -t, plain qstat condenses the array into a single status line for the entire array.
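
For comparison, the condensed view would look something like this (the exact fields vary with job state):

bash$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
195610[].pod              array            test                   0 R batch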

Similarly, if your input files are named sequentially, like input-1, input-2, input-3, and so on, you could feed them to the same executable in the following manner:

#PBS -t 1-3

<executable> input-$PBS_ARRAYID

Alternatively, you could leave out the #PBS -t directive and pass the range on the command line:

qsub -t 1-3 testarrays.sub