The cluster uses a queuing system to manage efficiently all of the programs that people want to run. The system has two parts. Torque is the resource manager, which keeps track of how the components of the cluster are being used. Torque has its origins in the Portable Batch System (PBS), and there are still remnants of PBS in Torque. The other part of our queuing system is Moab, the program that schedules all programs to run on the cluster. It works closely with Torque.
In order to run your program you need to set up a “job” to submit to the Queuing System. The steps to setting up a job are:
- compile your program or make sure we have it installed on the cluster. Check for installed software here. For help with compiling see: Compilers.
- prepare a script to submit to the Queuing System
- submit the script to put your job in the queue using the qsub command
- keep track of your job with the qstat command
- if needed delete the job from the queue with the qdel command
Here is a script for a sample program that calculates Pi in parallel using MPI. It assumes that this script is in the same directory as the pi_MPI program and that pi_MPI has already been compiled.
#PBS -q batch
#PBS -N pi_MPI
#PBS -l procs=8
#PBS -l walltime=00:10:00
cd $PBS_O_WORKDIR
module load mvapich2/intel/1.8
The lines that begin with #PBS are Torque directives. They give the Torque resource manager information about how you want to run the job and what resources the job needs. The #PBS string goes back to when Torque was spun off from PBS, and it has been kept for compatibility. This example uses the most basic directives that are required for a job:
#PBS -q batch                 -q tells Torque to use the "batch" queue
#PBS -N pi_MPI                -N tells Torque what you want to call the job (as seen by qstat)
#PBS -l procs=8               -l specifies the resources required; procs=8 says you need 8 processes
#PBS -l walltime=00:10:00     this line says the program needs at most 10 minutes to run
When the job starts to run Torque will set up a number of environment variables that the job can access. One of them is $PBS_O_WORKDIR:
cd $PBS_O_WORKDIR             change to the directory that the script was submitted from
This allows you to create generic scripts that don’t need to be changed if the directory name changes.
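As an illustration, here is a minimal sketch of a script prologue that uses $PBS_O_WORKDIR inside a job but still works when the script is run by hand outside the queue. The fallback to the current directory, the WORKDIR variable name and the echo line are additions for illustration and are not part of the sample script. PBS_JOBID is another variable that Torque sets for a running job.

```shell
# Use the submission directory when Torque provides it; otherwise
# (for example when testing the script by hand) stay where we are.
WORKDIR="${PBS_O_WORKDIR:-$PWD}"
cd "$WORKDIR" || exit 1

# PBS_JOBID is set by Torque inside a job; "interactive" is only a
# placeholder chosen for this sketch.
echo "job ${PBS_JOBID:-interactive} running in $WORKDIR"
```

Quoting the variable also keeps the cd working if the directory name contains spaces.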
module load mvapich2/intel/1.8     load any modules that are needed
Since this is an MPI program, we need to load the same MPI module that was used when the program was compiled. If you use packages that were already installed (you didn’t compile any code) then you need to load the module(s) for the software that you want to use.
This is the line that actually starts the program running.
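The exact form of that line depends on the MPI library in use. The following is a minimal sketch of what it could look like, assuming the loaded MVAPICH2 module puts an mpiexec launcher on the PATH and that the process count should match the procs=8 request. The NPROCS and LAUNCH_CMD variable names and the dry-run fallback are illustrative only.

```shell
# Number of MPI processes; should match the procs= value requested from Torque.
NPROCS=8

# Assumed launcher invocation: mpiexec, with -n giving the process count.
LAUNCH_CMD="mpiexec -n $NPROCS ./pi_MPI"

if command -v mpiexec >/dev/null 2>&1 && [ -x ./pi_MPI ]; then
    # On the cluster, with the module loaded and the program compiled.
    $LAUNCH_CMD
else
    # Anywhere else, just show what would be run.
    echo "would run: $LAUNCH_CMD"
fi
```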
Submitting your script:
Once you have a script ready (for this example the script name is pi-mpi.pbs) you can submit it to the queuing system with the qsub command:
qsub pi-mpi.pbs
If everything goes well, Torque will reply with the ID of the newly created job.
It is as simple as that. We now have a job with job ID 4050. To see the status of this job, use the qstat command:
marconi.localdomain:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
4047.marconi.loc     user1    batch    program1          41322     8  40     -- 10:00 R 02:28
4048.marconi.loc     user2    batch    program2             --     1  12     -- 1000: Q    --
4049.marconi.loc     user2    batch    program3          51233     1  12     -- 30:00 R 00:01
4050.marconi.loc     cousins  batch    pi_MPI            51375     1   1     -- 00:10 R    --
Job 4050 is listed at the bottom, and its state (the column labeled “S”, second from the right) is R for Running. Because this is a short job, the state soon changes to “C” for Completed:
4050.marconi.loc cousins batch pi_MPI 51375 1 1 -- 00:10 C 00:00
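When jobs are submitted from another script, qsub and qstat can be combined. The sketch below is one possible way to do that, not a site-provided tool. The function names submit_job and job_state are made up for this example. It assumes that qsub prints the new job ID on standard output and that qstat -f reports the state on a line of the form "job_state = R".

```shell
# Submit a job script and print the numeric part of the job ID
# (the text before the first dot of the ID that qsub prints).
submit_job() {
    full_id=$(qsub "$1") || return 1
    echo "${full_id%%.*}"
}

# Print the one-letter state (Q, R, C, ...) of a job, or nothing
# if qstat no longer knows the job.
job_state() {
    qstat -f "$1" 2>/dev/null | sed -n 's/^ *job_state = //p'
}

# Only talk to the queue when qsub and the job script are available.
if command -v qsub >/dev/null 2>&1 && [ -f pi-mpi.pbs ]; then
    jobid=$(submit_job pi-mpi.pbs) || exit 1
    state=$(job_state "$jobid")
    # Poll until the job is Completed or has left the queue.
    while [ -n "$state" ] && [ "$state" != "C" ]; do
        sleep 10
        state=$(job_state "$jobid")
    done
    echo "job $jobid finished"
fi
```

A polling interval of 10 seconds is an arbitrary choice here; for long jobs a larger interval puts less load on the queuing system.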
Another way to get more information about the job is with the checkjob command:
job 4050
AName: pi_MPI
State: Running
Creds: user:cousins group:cousins class:batch
WallTime: 00:00:31 of 00:10:00
SubmitTime: Fri Jul 13 12:14:43
  (Time Queued Total: 00:00:00 Eligible: 00:00:00)
StartTime: Fri Jul 13 12:14:43
NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 8
Req TaskCount: 8 Partition: base
NodeCount: 1
Allocated Nodes:
[n27:8]
IWD: /home/cousins/pi_MPI
StartCount: 1
Flags: RESTARTABLE
Attr: checkpoint
StartPriority: 1
Reservation '4050' (-00:00:51 -> 00:09:09 Duration: 00:10:00)
Among other information, this shows that the job is running on a single node, n27, and that it has been allocated 8 processes/cores on that node.
Once the job has finished, Torque will create two files in the directory that the job was run in. One will contain the Standard Output for the job and the other will contain Standard Error. Since we didn’t specify a #PBS directive to rename these (or to combine them), the file names have the format jobname.ojobid and jobname.ejobid. In our case they are:
ls -l *4050
-rw------- 1 cousins cousins   0 Jul 13 12:14 pi_MPI.e4050
-rw------- 1 cousins cousins 576 Jul 13 12:15 pi_MPI.o4050
It is good that the StdErr file is empty (no errors). However, with MVAPICH2 you may see messages in this file like:
librdmacm: couldn’t read ABI version.
librdmacm: assuming: 4
These are warnings that will not affect your job at all.
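To check the StdErr file without being distracted by these warnings, the known-harmless lines can be filtered out. This is a minimal sketch: the function name unexpected_errors is hypothetical, and the pattern assumes the warnings always start with "librdmacm:" as in the messages above.

```shell
# Print every line of a StdErr file that is NOT a librdmacm warning.
unexpected_errors() {
    grep -v '^librdmacm:' "$1"
}

# StdErr file of the example job; adjust the name for other jobs.
errfile="pi_MPI.e4050"
if [ -f "$errfile" ]; then
    if [ -z "$(unexpected_errors "$errfile")" ]; then
        echo "no unexpected messages in $errfile"
    else
        echo "check $errfile:"
        unexpected_errors "$errfile"
    fi
fi
```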
The StdOut file (pi_MPI.o4050) shows:
pi is 3.14159265358818
Error is 1.614264277804978E-012
time is 4.13646793365479 seconds
Congratulations, you have successfully run a program on the cluster!