POD 101: Quick Start for POD

Welcome to POD 101, a brief tutorial for running compute jobs on Penguin POD HPC clusters. If you do not have an account on POD, please complete our registration request form. A sales representative can help you get set up, answer any questions, and provide trial usage. 

This walk-through focuses on the command-line syntax for POD MT1, the latest POD cluster, but all examples can also be run from a web browser using the Job Manager in the POD Portal. To follow along on the command line, connect to your login node with SSH on Linux or OS X, or with PuTTY on Windows. Detailed instructions for using SSH or PuTTY can be found in the Manage SSH Keys menu of the POD Portal.

A Simple POD Job

On POD, you do not run your computations inside your free login node VM, but in the HPC environment of high-end compute nodes. You submit Bash job scripts to the cluster's scheduler, requesting the resources your compute job will need. Then, inside your job script, you run your code. The syntax of these scripts is that of any standard Bash script, with the exception of lines that start with #PBS, which carry your requests to the scheduler.

A simple example is a job script that runs the /bin/date and /bin/hostname commands on the compute cluster. Create a job script called test.sub with emacs, vim, or nano.

[penguin@podmt1 ~]$ vim test.sub

Inside this file, add the #PBS lines to request 1 node and 1 core (ppn) in the FREE queue (no charge) for a maximum of 1 minute. Please note that the FREE queue is only present on the MT1 cluster.

#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00
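
The walltime value is written as HH:MM:SS. As a minimal sketch, the helper below converts a number of seconds into that form, which can be handy when job scripts are generated by another script. The function name walltime_from_seconds is hypothetical and is not part of POD or the scheduler.

```shell
# Hypothetical helper: format a duration in seconds as HH:MM:SS
# for use in a "#PBS -l walltime=..." line.
walltime_from_seconds() {
  local total=$1
  printf '%02d:%02d:%02d\n' \
    $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

walltime_from_seconds 60    # prints 00:01:00, the limit used above
walltime_from_seconds 300   # prints 00:05:00
```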

Next, run any commands you would normally use inside a Bash script. For our example, let's run /bin/date and /bin/hostname:

#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/bin/date
/bin/hostname

Save the file and submit the job script to the compute cluster using the qsub command. If you prefer a GUI, you can submit this job using the Job Manager in the POD Portal.

[penguin@podmt1 ~]$ qsub test.sub

qsub prints the job ID of your submitted job, and you can check the job's status in the queue using qstat. The R status indicates that the job is running:

[penguin@podmt1 ~]$ qstat

Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15738.pod                  test.sub         penguin               0 R FREE
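
If you want to read the job state from a script rather than by eye, the qstat listing can be filtered with awk. The sketch below is one possible approach: it assumes the state is always the second-to-last column, as in the listing above, and the function name job_state is hypothetical. It is fed the sample listing so that it can be tried anywhere.

```shell
# Hypothetical helper: read qstat-style output on stdin and print the
# one-letter state (the "S" column) of the given job ID.
# Assumption: the state is the second-to-last whitespace-separated field.
job_state() {
  awk -v id="$1" '$1 == id { print $(NF - 1) }'
}

# Sample input copied from the listing above; on POD you would pipe
# the output of the real qstat command into job_state instead.
job_state 15738.pod <<'EOF'
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15738.pod                  test.sub         penguin               0 R FREE
EOF
```

On the login node, the equivalent call would be along the lines of: qstat | job_state 15738.pod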

Once the job completes, qstat will show a C state and you will have two output files in your current folder. The .o file contains the Bash script's standard output (STDOUT) and the .e file contains the script's standard error (STDERR). The error file should be empty.

[penguin@podmt1 ~]$ cat test.sub.o15738
Tue Sep  3 23:54:51 UTC 2013
n1
[penguin@podmt1 ~]$ cat test.sub.e15738
[penguin@podmt1 ~]$ 

As you can see, the job printed the system date and ran on compute node n1. Use this example as a template for running any application on POD from a Bash script.
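
Because lines that start with #PBS are ordinary comments to Bash, a job script can be checked for syntax errors before it is submitted. The sketch below is one way to do that, not a POD feature: it writes the test.sub script from this section into a scratch directory (the directory is an assumption made for illustration) and parses it with bash -n, which reads the script without running its commands.

```shell
# Write the job script from this section into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/test.sub" <<'EOF'
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/bin/date
/bin/hostname
EOF

# bash -n parses the script without executing any of its commands.
if bash -n "$workdir/test.sub"; then
  echo "test.sub: syntax OK"
else
  echo "test.sub: syntax error" >&2
fi
```

A clean syntax check does not guarantee that the scheduler will accept the resource request; it only catches shell mistakes such as an unclosed quote.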

An MPI Job Example

We have an MPI example available in /public/examples that you can copy to your home directory for experimentation. This example is designed to use the FREE queue and will not cost you anything to run.

[penguin@podmt1 ~]$ cp -a /public/examples $HOME/examples

Inside your examples folder, you will see three files:

  1. README, a file with information on the job
  2. IMB-MPI1, a compiled binary of Intel's MPI Benchmark
  3. imb.sub, a scheduler job script to run on the cluster

Look at imb.sub with cat, or edit it with emacs, vim, or the nano editor.

[penguin@podmt1 ~]$ vim imb.sub

The #PBS entries are set to request 12 cores on each of 2 nodes (24 cores total) for 5 minutes in the POD FREE queue. Details about the available queues and #PBS options can be found in the POD documentation.

#PBS -l nodes=2:ppn=12
#PBS -l walltime=00:05:00
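
The total core count of a request is the number of nodes multiplied by ppn. As a small illustration, the sketch below works this out from a resource string. The function name total_cores is hypothetical, and it only handles the simple nodes=N:ppn=M form used in this tutorial.

```shell
# Hypothetical helper: compute nodes * ppn from a resource string
# of the form "nodes=N:ppn=M".
total_cores() {
  local spec=$1 nodes ppn
  nodes=${spec#nodes=}   # drop the leading "nodes="
  nodes=${nodes%%:*}     # keep what precedes the first ":"
  ppn=${spec##*ppn=}     # keep what follows "ppn="
  echo $((nodes * ppn))
}

total_cores "nodes=2:ppn=12"   # prints 24
total_cores "nodes=1:ppn=1"    # prints 1
```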

After the #PBS entries, you will note that the OpenMPI environment is loaded using the module command. POD uses environment modules so that you can easily load and unload software in your working environment. To see a full list of software, run the command module avail. More details about using modules on POD can be found in the POD documentation.

module load openmpi/1.5.5/gcc.4.7.2

Next, you will see that this job prints some optional information about the job's environment, including special variables provided by the POD PBS TORQUE scheduler. Printing these details might be useful for debugging more complex jobs:

echo "================== nodes ===================="
echo "================= job info  ================="
echo "Date:   $(date)"
echo "Job ID: $PBS_JOBID"
echo "mpirun: $(which mpirun)"
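
For debugging, it can also help to print which nodes the scheduler assigned. The sketch below is an illustration rather than the contents of imb.sub: it wraps the job details in a hypothetical function, print_job_info, and uses the PBS_JOBID and PBS_NODEFILE variables that the scheduler sets inside a running job. The fallbacks are there so the function can also be tried outside a job.

```shell
# Print basic job details. PBS_JOBID and PBS_NODEFILE are set by the
# scheduler inside a running job; the fallbacks keep the function
# usable when it is tried outside of one.
print_job_info() {
  echo "Date:   $(date)"
  echo "Job ID: ${PBS_JOBID:-not set}"
  if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    # Assumption: the node file has one line per allocated core,
    # so counting duplicate lines gives the cores used on each node.
    sort "$PBS_NODEFILE" | uniq -c
  else
    echo "No node file available"
  fi
}

print_job_info
```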

After showing the optional job information, mpirun from the OpenMPI environment is used to run the IMB binary. We prefix mpirun with time to measure how long the run takes. Note that OpenMPI on POD does not require machine files or any other flags. By default, OpenMPI will use all cores for the -np count, use all nodes made available to you by the scheduler, and use InfiniBand interconnects for MPI communication.

time mpirun ./IMB-MPI1 -mem 4.0

As before, you submit the job with qsub and watch its status with qstat. Again, feel free to submit this job from a browser using the Job Manager in the POD Portal if you prefer a GUI.

[penguin@podmt1 ~]$ qsub imb.sub

[penguin@podmt1 ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15743.pod                  MPI-EXAMPLE      penguin               0 Q FREE

[penguin@podmt1 ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15743.pod                  MPI-EXAMPLE      penguin        00:01:01 R FREE

[penguin@podmt1 ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15743.pod                  MPI-EXAMPLE      penguin        00:01:22 C FREE
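
Instead of rerunning qstat by hand, a script can poll until the job reaches the C state. The following is a minimal sketch under a few assumptions: the function name wait_for_job and the QSTAT_CMD override are hypothetical, the state is taken to be the second-to-last column of the qstat listing, and a job that no longer appears in the listing is treated as finished.

```shell
# Hypothetical helper: poll the queue until the given job is complete.
# Usage: wait_for_job JOB_ID [SECONDS_BETWEEN_CHECKS]
# QSTAT_CMD can name a different status command (handy for testing).
wait_for_job() {
  local job_id=$1 interval=${2:-10} state
  while :; do
    # Pick the state column of the line whose first field is the job ID.
    state=$(${QSTAT_CMD:-qstat} | awk -v id="$job_id" '$1 == id { print $(NF - 1) }')
    # C means complete; an empty state means the job is no longer listed.
    if [ -z "$state" ] || [ "$state" = "C" ]; then
      break
    fi
    sleep "$interval"
  done
  echo "Job $job_id finished"
}
```

On the login node this could be called as: wait_for_job 15743.pod 5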

In this example, the STDOUT and STDERR were merged into one file with #PBS -j oe and the job was given the name of MPI-EXAMPLE with #PBS -N MPI-EXAMPLE. You can see the long output of the IMB benchmark for the 24 cores in your current folder:

[penguin@podmt1 ~]$ cat MPI-EXAMPLE.o15743
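
The output file names in this tutorial follow the pattern of job name, then .o, then the numeric part of the job ID (test.sub.o15738 for job 15738.pod, MPI-EXAMPLE.o15743 for job 15743.pod). As a small sketch based on that observed pattern, the hypothetical helper below builds the name from the job name and the ID that qsub prints:

```shell
# Hypothetical helper: build the STDOUT file name from the job name
# and the job ID printed by qsub, e.g. MPI-EXAMPLE and 15743.pod.
output_file() {
  local name=$1 job_id=$2
  echo "${name}.o${job_id%%.*}"   # keep only the part before the first "."
}

output_file MPI-EXAMPLE 15743.pod   # prints MPI-EXAMPLE.o15743
output_file test.sub 15738.pod      # prints test.sub.o15738
```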