Compilers & MPI Libraries¶
To use the Intel compiler installations on POD, you will need a valid Intel license that can be accessed from inside the POD cluster. There are three ways for customers to access Intel licenses:
- License File:
  - Upload a valid license file into your `$HOME/intel/licenses` directory to enable immediate access.
- FlexLM Server on POD:
  - Penguin will host FlexLM servers to serve your license free of charge. If you wish to migrate your Intel license to POD and have it hosted on a FlexLM server, please contact the POD support team: email@example.com.
- Remote FlexLM Server:
  - Alternatively, the POD support team can help you establish a secure tunnel to an existing FlexLM server in your organization. Please contact the POD support team to help establish a tunnel to your license server: firstname.lastname@example.org.
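As a minimal sketch of the license-file option, the expected directory can be created and a license staged into it once the file is on the cluster (the `mypod.lic` filename is a placeholder; here a dummy file stands in for a real license just to illustrate the layout):

```shell
# Create the directory POD checks for Intel license files and stage a
# license into it. "mypod.lic" is a placeholder name; a dummy file is
# created here purely to illustrate the expected layout.
mkdir -p "$HOME/intel/licenses"
touch "$HOME/intel/licenses/mypod.lic"   # replace with your real license file
ls "$HOME/intel/licenses"
```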
The different versions of Intel software distributions available on POD can be listed by running the `module avail intel` command. The example below shows a license file installed in the appropriate folder inside a user's home directory and the versions currently available on the MT1 cluster.

```
$ ls $HOME/intel/licenses
mypod.lic
$ module -l avail intel
- Package -----------------------------+- Versions -+- Last mod. ------
/public/modulefiles:
intel/11.1.0                                        2013/04/25 13:19:03
intel/12.1.0                                        2013/12/13 18:37:52
intel/2015                                          2016/06/09 16:24:13
intel/2016                                          2016/10/06 1:17:13
intel/2018                                          2017/10/09 20:15:42
intel_aps/2018.beta                                 2017/05/05 19:05:31
```
Portland Group PGI® Compilers¶
Please contact the POD admin team to enable access to the PGI compilers on the MT1 & MT2 clusters. Use the `module avail pgi` command to see the current list of available PGI distributions installed on POD. The example below shows the different versions currently available on the MT1 cluster.

```
$ module -l avail pgi
- Package -----------------------------+- Versions -+- Last mod. ------
/public/modulefiles:
pgi/10.9                                            2015/12/14 23:24:44
pgi/11.9                                            2015/12/14 23:24:18
pgi/13.6                                            2015/12/14 23:23:11
pgi/15.10                                           2016/11/08 19:51:22
```
Running MPI Jobs¶
Scaling your jobs up to use multiple CPU cores across multiple machines requires the ability to share data and messages between individual processes. A Message Passing Interface (MPI) library enables this functionality, but your application must be written and compiled against an MPI library to use it. Multiple library implementations and versions are provided by default on both POD clusters for your use at both compile time and runtime. Common implementations available on POD include OpenMPI, Platform MPI, and Intel MPI.
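For example, an application is typically compiled against an MPI library using the compiler wrappers the library provides. The sketch below writes a minimal, illustrative MPI program and compiles it with the standard `mpicc` wrapper; the compile step is guarded because it assumes an MPI module (e.g. one of the OpenMPI modules above) has already been loaded:

```shell
# Write a minimal MPI program (illustrative only).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile with the MPI compiler wrapper if one is on the PATH
# (e.g. after 'module load openmpi/1.5.5/gcc.4.4.7' on POD).
command -v mpicc >/dev/null && mpicc -o hello_mpi hello_mpi.c
```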
MPI Job Syntax¶
OpenMPI is an open source implementation of an MPI library and is readily available on POD for your MPI-enabled jobs. Since OpenMPI is already optimized for POD, there is no need to specify a machine file or `-np` when using `mpirun` at runtime. By default, `mpirun` will launch one MPI rank per core on the nodes provided by the scheduler. Just like any other application on POD, you will need to load the appropriate module for the specific implementation and version of MPI you intend to use. The following example will launch 96 MPI ranks on 8 nodes using OpenMPI 1.5.5.
```shell
#PBS -S /bin/bash
#PBS -l nodes=8:ppn=12
#PBS -l walltime=01:00:00

module load openmpi/1.5.5/gcc.4.4.7
mpirun /path/to/binary
exit $?
```
System Memory/CPU Core¶
MPI-enabled jobs run multiple parallel ranks on each node, so those ranks must share the available system resources, including memory. Your application may require a specific amount of memory per core to run efficiently. Detailed below are the current memory-per-core capacities, available by default, for each of the queues on the POD MT1 and MT2 clusters.
| Queue | Compute Node Architecture | Memory | Cores | Mem/Core |
|-------|---------------------------|--------|-------|----------|
| S30 | Dual Intel® Xeon® Gold 6148 (Skylake) | 384 GB | 40 | 9.6 GB/core |
| B30 | Dual Intel® E5-2600v4 Series (Broadwell) | 256 GB | 28 | 9.1 GB/core |
| T30 | Dual Intel® E5-2600v3 Series (Haswell) | 128 GB | 20 | 6.4 GB/core |
| M40 | Dual Intel® X5600 Series (Westmere) | 48 GB | 12 | 4.0 GB/core |
| H30 | Dual Intel® E5-2600 Series (Sandy Bridge) | 64 GB | 16 | 4.0 GB/core |
| H30G | Dual Intel® E5-2600 Series (Sandy Bridge) | 64 GB | 16 | 4.0 GB/core |
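The Mem/Core column is simply each node's total memory divided by its core count; a quick sketch for the M40 row:

```shell
# M40 nodes: 48 GB of memory shared across 12 cores
mem_gb=48
cores=12
echo "$(( mem_gb / cores )) GB/core"   # 48 / 12 = 4 GB/core
```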
Use Fewer Ranks for More Memory-Per-Core¶
A single CPU core on POD provides up to 9.6 GB of dedicated memory. If your application requires more memory than this, you must request enough resources to satisfy your memory requirements while also limiting the number of MPI ranks launched. Use the `--loadbalance` option along with `-np` to customize your memory availability and MPI rank count.
For instance, a 24-rank job that needs 8 GB per MPI rank will require that you request 4 nodes from the `M40` queue, using 48 total cores but running only 24 MPI ranks. In this configuration half the CPU cores will be unused, but twice the amount of memory will be available to each MPI rank.
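The node count in that example follows from a small calculation, sketched here with the values from the paragraph above and the M40 row of the table:

```shell
ranks=24          # MPI ranks the job will run
mem_per_rank=8    # GB needed by each rank
mem_per_core=4    # GB/core available on the M40 queue
ppn=12            # cores per M40 node

cores_per_rank=$(( mem_per_rank / mem_per_core ))  # 2 cores' worth of memory per rank
total_cores=$(( ranks * cores_per_rank ))          # 48 cores to request
nodes=$(( total_cores / ppn ))                     # 4 nodes
echo "request ${nodes} nodes (${total_cores} cores) for ${ranks} ranks"
```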
```shell
#PBS -S /bin/bash
#PBS -N FewerRanksMPI-example
#PBS -q M40
#PBS -j oe
#PBS -l nodes=4:ppn=12    # 4 nodes x 12 cores = 48 cores
#PBS -l walltime=01:00:00

# load OpenMPI
module load openmpi/1.5.5/gcc.4.4.7

# Enter the PBS folder from which qsub is run
cd $PBS_O_WORKDIR

# limit mpirun to 24 ranks and loadbalance the MPI ranks over all 4 nodes
mpirun -np 24 --loadbalance /path/to/binary

# alternatively, use --npernode
# mpirun -np 24 --npernode 6 /path/to/binary

exit $?
```
More application-specific example templates can be found in `/public/examples` on MT1 and MT2. This example will run best on the MT1 cluster because it uses the `M40` queue. If you plan to use this script, you will need to update the call to `mpirun` with the path to your MPI binary.
```shell
#PBS -S /bin/bash
#PBS -N OpenMPI-example
#PBS -q M40
#PBS -j oe
#PBS -l nodes=4:ppn=12
#PBS -l walltime=01:00:00

# Load the ompi environment. Use 'module avail' from the
# command line to see all available modules.
module load openmpi/1.5.5/gcc.4.7.2

# Display some basics about the job
echo
echo "================== nodes ===================="
cat $PBS_NODEFILE
echo
echo "================= job info =================="
echo "Date:   $(date)"
echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "mpirun: $(which mpirun)"
echo
echo "=================== run ====================="

# Enter the PBS folder from which qsub is run
cd $PBS_O_WORKDIR

# Run your application with mpirun. Note that no -mca btl options
# should be used to ensure optimal performance. Jobs will use
# InfiniBand by default.
time mpirun /path/to/binary
retval=$?

# Display date and return value
echo
echo "================== done ====================="
echo "Date:   $(date)"
echo "retval: $retval"
echo

# vim: syntax=sh
```
OpenMPI is strongly encouraged on POD, as the OpenMPI releases listed by `module avail openmpi` are optimized for the POD InfiniBand environment. In the rare case where a commercial application requires a different MPI implementation, the special considerations below apply.
If your application must run on MT1 using the Platform MPI library, add the following environment variables and update the call to `mpirun` with the options shown below. Please note that `$PBS_NODEFILE` is generated automatically by the scheduler for use inside the PBS/TORQUE job.
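To illustrate, `$PBS_NODEFILE` lists one line per allocated core, so the rank and node counts can be derived directly from it. This sketch uses a hand-built stand-in file, since the real nodefile only exists inside a running job:

```shell
# Stand-in for the scheduler-generated $PBS_NODEFILE for a
# hypothetical 2-node, ppn=3 allocation: one line per core,
# so each host appears three times.
cat > nodefile.example <<'EOF'
n001
n001
n001
n002
n002
n002
EOF

echo "ranks: $(wc -l < nodefile.example)"          # 6 total cores/ranks
echo "nodes: $(sort -u nodefile.example | wc -l)"  # 2 unique hosts
```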
```shell
#PBS -S /bin/bash
#PBS -N PlatformMPI-example
#PBS -q M40
#PBS -j oe
#PBS -l nodes=4:ppn=12
#PBS -l walltime=01:00:00

# load Platform MPI
module load platform_mpi/09.01.02

# Platform MPI-specific environment variables
export MPI_MAX_REMSH=32
export MPI_REMSH=/usr/bin/bprsh

# Enter the PBS folder from which qsub is run
cd $PBS_O_WORKDIR

# Platform MPI-specific call to mpirun
mpirun -psm -hostfile $PBS_NODEFILE /path/to/binary
exit $?
```
If your application requires an older version of Intel MPI, you will need to configure your job to use a TMI configuration that leverages InfiniBand and the QLogic/Intel PSM libraries. The following example can be used as a template for running Intel MPI jobs with the optimal TMI configuration on MT1.
```shell
#PBS -S /bin/bash
#PBS -N IntelMPI-example
#PBS -q M40
#PBS -j oe
#PBS -l nodes=4:ppn=12
#PBS -l walltime=01:00:00

# load Intel MPI
module load intel/11.1.0

# Intel MPI-specific environment variables
export I_MPI_FABRICS=shm:tmi
export TMI_CONFIG=/etc/tmi.conf
export I_MPI_TMI_LIBRARY=/usr/lib64/libtmi.so
export I_MPI_TMI_PROVIDER=psm
export I_MPI_MPD_RSH=/usr/bin/ssh
export I_MPI_DEBUG=5   # optional

# PBS_NUM_NODES = number of compute nodes allocated to the job
# PBS_NP        = number of MPI ranks (cores in nodes=X:ppn=Y)
# PBS_NODEFILE  = /var/spool/torque/aux/<jobid> from mother superior

# Enter the PBS folder from which qsub is run
cd $PBS_O_WORKDIR

# Start mpd processes on each compute node
mpdboot -n $PBS_NUM_NODES -f $PBS_NODEFILE -r $I_MPI_MPD_RSH

## Run your program here
mpdrun -np $PBS_NP /path/to/binary

# Stop mpd processes on each compute node
mpdallexit
exit $?
```