Some programs are sensitive to how their processes are laid out across nodes. This is typically true of programs that have a lot of communication between many processes. The nodes of this cluster have 16 cores each. If you want to run a program on 64 cores, specifying 16 nodes with only 4 cores in use on each may be much faster than using 4 nodes with all 16 cores in use on each. For some types of programs the opposite is true, so it can be important to be able to control the layout.
You can control the layout of a job with the “#PBS -l nodes=16:ppn=4” directive (note the colon between the node count and “ppn”), as in:
#PBS -q batch
#PBS -N pi_MPI
#PBS -l nodes=16:ppn=4
#PBS -l walltime=00:10:00
module load mvapich2/intel/1.8
where “nodes” refers to the number of nodes you want to allocate and “ppn” refers to the number of processes per node that you want. Multiplying these two numbers will give you the total number of cores that will be allocated to the job.
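Since the total is simply nodes times ppn, it can help to list every layout that gives the core count you want before choosing one. The following is a minimal sketch, not a site-provided tool: the function name `layouts` is hypothetical, the values 64 and 16 are taken from the example in this section, and the directives are printed in the colon-separated `nodes=N:ppn=M` form.

```shell
# List every nodes/ppn pair whose product is exactly $1 cores,
# using at most $2 processes per node.
# "layouts" is a hypothetical helper name; both arguments are inputs you choose.
layouts() {
    total=$1
    max_ppn=$2
    ppn=1
    while [ "$ppn" -le "$max_ppn" ]; do
        # Keep only the ppn values that divide the total evenly
        if [ $((total % ppn)) -eq 0 ]; then
            echo "#PBS -l nodes=$((total / ppn)):ppn=$ppn"
        fi
        ppn=$((ppn + 1))
    done
}

# 64 cores on nodes that have 16 cores each
layouts 64 16
```

For 64 cores this prints five candidate directives, ranging from 64 nodes with 1 process each to 4 nodes with 16 processes each. Which one runs fastest depends on your program.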
There are some things that you need to be aware of when doing this. The cluster is used by a lot of people, and the scheduler tries to fit jobs in as efficiently as possible. If you specify 4 nodes and use all 16 cores on each, your job may sit in the queue for a long time before it starts running, because the scheduler has to wait until four nodes are completely free of other processes. Depending on what is already running or queued, this may take a while. You will have to weigh this in your decision about how to run the job: it may run faster once it starts, but overall it might take longer to finish once you take the time spent in the queue into account.
The opposite of this is to use:
#PBS -l procs=64
This tells the scheduler “I want 64 cores and I don’t care where they are”. The scheduler will just choose the first free cores that are available and start your job. For example, it may pick 8 nodes in all: 2 processes on 1 node, 6 processes each on 5 nodes, and 16 processes each on 2 nodes. This will be the fastest way to get your program running, but depending on the communication requirements of your job, it may lead to inconsistent run times from job to job.
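Because the layout can differ from run to run, it may be worth recording what the scheduler actually gave you, so that unusual run times can be matched to a layout. The sketch below assumes a Torque-style `$PBS_NODEFILE` that lists one hostname per allocated core; check that this holds on your system. The function name `show_layout` is hypothetical.

```shell
# Print each allocated node followed by the number of processes placed on it.
# $1 is a file with one hostname per allocated core (assumed to be the
# format of $PBS_NODEFILE). "show_layout" is a hypothetical helper name.
show_layout() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Only meaningful inside a running job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    show_layout "$PBS_NODEFILE"
fi
```

Placed in the job script before the program is launched, this writes one line per node to the job's output file.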