Batch Job Control
From UMaine Supercomputer
Status Checks
Moab or Torque
Everything that Torque can do, Moab can do as well. Moab is typically more verbose, and you do not need it if your job is in an expected state. However, if something unexpected is happening, the Moab commands can often tell you why. Remember, Moab is the scheduler: it makes the actual decisions about when jobs run and whether they are valid. Torque only reports on resource status; it knows only which nodes are in use and whether a job is running.
NOTE: Due to the OS provisioning code, a job that is waiting for nodes to reboot will appear as queued in Torque and running in Moab.
Show the Job Queue
/usr/local/pbs/bin/qstat (Queue Status)
This is the Torque command. It excels at a quick check of whether a job is running. Two useful arguments are qstat -n, which shows the nodes a job is running on, and qstat -a, which gives a little more information. Both appear in the transcript below. There, job 1431 is running in the linux-batch queue on both of node255's processors. Job 1432 is queued (waiting for OS provisioning checks, in fact) and will run in the linux-spool queue.
user@panopticon ~/src $ qsub go.hostname
1432.echelon.acrl.clusters.umaine.edu
user@panopticon ~/src $ qstat
Job id Name User Time Use S Queue
------------------- ---------------- --------------- -------- - -----
1431.echelon ...I1B_1w_300K_2 user 00:15:00 R linux-batch
1432.echelon go.hostname user 0 Q linux-spool
user@panopticon ~/src $ qstat -a
echelon.acrl.clusters.umaine.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
1431.echelon.acrl.cl user linux-ba md_6I1B_1w 13442 1 -- -- 200:0 R 00:19
1432.echelon.acrl.cl user linux-sp go.hostnam -- 2 -- -- 00:30 Q --
user@panopticon ~/src $ qstat -n
echelon.acrl.clusters.umaine.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
1431.echelon.acrl.cl user linux-ba md_6I1B_1w 13442 1 -- -- 200:0 R 00:19
node255/1+node255/0
1432.echelon.acrl.cl user linux-sp go.hostnam -- 2 -- -- 00:30 Q --
--
- Obsolete Commands
- qstat -f: You may as well use checkjob as detailed below
- qstat -q: This used to give information on the queues and their settings. This is controlled from Moab now and Torque cannot see the details.
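When many jobs are listed, a per-state tally can be quicker to read than the full table. The following is a minimal sketch and not part of Torque: the function name count_job_states is hypothetical, and it assumes the default qstat layout shown above, where the state letter is the second-to-last column.

```shell
# Tally jobs per state from default `qstat` output read on stdin.
# Assumption: header lines start with "Job id" or "-", and the state
# letter (R, Q, ...) is the second-to-last column of each job line.
count_job_states() {
    awk '/^Job id/ || /^-/ { next }
         NF >= 6           { n[$(NF - 1)]++ }
         END               { for (s in n) print s, n[s] }' | sort
}

# Typical use on the login node:
#   qstat | count_job_states
```

For the two-job listing above, this would report one job in state Q and one in state R.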
/opt/moab/bin/showq (Show Queue)
This is the Moab variant of qstat, and has a few advantages. First, you get to see a count of used and free processors. Also, where Torque only assigns two states to a job, running or queued, Moab knows of three: active jobs (running), eligible jobs (jobs that are valid, but need to wait for resources) and blocked jobs (jobs that have problems with the resources required and probably will never run). These three states are a great advantage when wondering why your job still hasn't run when you think it should have by now. If your job is eligible, the cluster is simply busy and you're just waiting in line. If the job is blocked, it's time to move on to more verbose tools.
user@panopticon ~/src $ showq
active jobs------------------------
JOBID USERNAME STATE PROC REMAINING STARTTIME
1431 user Running 2 8:07:33:01 Thu Sep 14 09:23:09
1433 user Running 4 00:29:59 Thu Sep 14 09:50:07
2 active jobs 6 of 564 processors in use by local jobs (1.06%)
3 of 282 nodes active (1.06%)
eligible jobs----------------------
JOBID USERNAME STATE PROC WCLIMIT QUEUETIME
0 eligible jobs
blocked jobs-----------------------
JOBID USERNAME STATE PROC WCLIMIT QUEUETIME
0 blocked jobs
Total jobs: 2
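The active/eligible/blocked decision described above can be scripted. This is a rough sketch under stated assumptions: showq_section is a hypothetical helper name, the section banners are assumed to begin with "active jobs", "eligible jobs" and "blocked jobs" as in the transcript, and job lines are assumed to carry a capitalized state word in the third column.

```shell
# Print which showq section (active, eligible or blocked) lists a job ID.
# Reads `showq` output on stdin; prints nothing if the job is not listed.
# Assumption: the third column of a job line is a capitalized state such
# as "Running", which keeps summary lines like "2 active jobs" from matching.
showq_section() {
    awk -v id="$1" '
        /^active jobs/             { section = "active";   next }
        /^eligible jobs/           { section = "eligible"; next }
        /^blocked jobs/            { section = "blocked";  next }
        $1 == id && $3 ~ /^[A-Z]/  { print section; exit }
    '
}

# Typical use on the login node:
#   showq | showq_section 1431
```

If the answer is "eligible", the job is just waiting in line; if it is "blocked", checkjob is the next stop.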
/opt/moab/bin/showstate
Shows the state of the entire cluster, including the locations and states of all nodes. The only thing missing from this display is a list of queued jobs, but it is a nice way to get an idea of the physical state of the cluster and where jobs are running. It also lists nodes that have been marked as down. Below, node53 is down (although Torque doesn't know this) because it failed the Myrinet connection tests. Nodes 61-64 have been partitioned off to the secure clusters and are not visible to the University side. Nodes 109 and 126 are both missing because we've used them as NFS servers. The X's mark nonexistent nodes.
user@panopticon ~/src $ showstate
cluster state summary for Thu Sep 14 10:51:04
JobID S User Group Procs Remaining StartTime
------------------ - --------- -------- ----- ----------- -------------------
(A) 1431 R user cluster_users 2 8:06:32:05 Thu Sep 14 09:23:09
usage summary: 1 active jobs 1 active nodes
[0][0][0][0][0][0][0][0][0][1][1][1][1][1][1][1][1][1][1][2]
[1][2][3][4][5][6][7][8][9][0][1][2][3][4][5][6][7][8][9][0]
Rack 12: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 13: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][#][ ][ ][ ][ ][ ][ ][!]
Rack 14: [#][#][#][#]XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[ ][ ][ ][ ]
Rack 15: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 16: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 17: [#][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][#][ ][ ]
Rack 31: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 32: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 33: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 34: [ ][ ][ ][ ]XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[ ][ ][ ][ ]
Rack 35: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 36: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 37: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][A][ ]
Rack 51: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 52: [ ][ ][ ][!][ ][ ][ ][ ][ ][ ][ ][!][!][!][!][!]XXXXXXXXXXXX
Rack 54: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[ ][ ]XXXXXXXXXXXXXXXXXXXXXXXX
Rack 55: [ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]
Rack 56: [ ][ ]XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Key: [?]:Unknown [*]:Down w/Job [#]:Down [ ]:Idle [@] Busy w/No Job [!] Drained
node node53 is down
node node61 is down
node node62 is down
node node63 is down
node node64 is down
node node109 is down
node node126 is down
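If you only want the list of down nodes, for example to include in a problem report, the trailing "node ... is down" lines can be filtered out of the display. A small sketch, assuming the line format shown above; down_nodes is a hypothetical helper name.

```shell
# Print the names of nodes that `showstate` reports as down.
# Reads `showstate` output on stdin.
# Assumption: each down node appears on a line "node <name> is down".
down_nodes() {
    awk '$1 == "node" && $3 == "is" && $4 == "down" { print $2 }'
}

# Typical use on the login node:
#   showstate | down_nodes
```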
Job Information
/opt/moab/bin/checkjob
If you need more information about a job that you are running, this is the command for you. It is probably your best resource for answering the age-old question, "Why isn't my job running?" It tells you the exact resources your job requires and reports any errors preventing the job from running. The more verbose checkjob -v is sometimes helpful as well.
- A successfully running job
user@panopticon ~/src $ checkjob 1431
job 1431
AName: md_6I1B_1w_300K_2
State: Running
Creds: user:user group:cluster_users account:mypbs_account class:linux-batch
WallTime: 1:04:16 of 8:08:00:00
SubmitTime: Thu Sep 14 09:22:32
(Time Queued Total: 00:00:37 Eligible: -00:00:01)
StartTime: Thu Sep 14 09:23:09
Total Requested Tasks: 2
Req[0] TaskCount: 2 Partition: RACK3
Memory >= 0 Disk >= 0 Swap >= 0
Opsys: --- Arch: --- Features: ---
NodeSet=ONEOF:FEATURE:RACK3:RACK1
Allocated Nodes:
[node255:2]
StartCount: 1
Flags: RESTARTABLE
Attr: checkpoint
StartPriority: 64
Reservation '1431' (-1:04:06 -> 8:06:55:54 Duration: 8:08:00:00)
- A job blocked because arch=x86-32 was requested but the job was submitted to the PPC64 queues
user@panopticon ~/src $ checkjob 1435
job 1435
AName: go.hostname
State: Idle
Creds: user:user group:cluster_users account:mypbs_account class:linux-spool
WallTime: 00:00:00 of 00:30:00
SubmitTime: Thu Sep 14 10:30:01
(Time Queued Total: 00:00:57 Eligible: 00:00:31)
Total Requested Tasks: 4
Req[0] TaskCount: 4 Partition: ALL
Memory >= 0 Disk >= 0 Swap >= 0
Opsys: --- Arch: x86-32 Features: myrinet
NodeSet=ONEOF:FEATURE:RACK3:RACK1
Flags: RESTARTABLE
Attr: checkpoint
StartPriority: 1
NOTE: cannot select job for partition RACK1 (partition RACK1 does not support requested class linux-spool)
NOTE: job cannot run in partition RACK3 (idle procs do not meet requirements : 0 of 4 procs found)
idle procs: 253 feasible procs: 0
Node Rejection Summary: [State: 1][Arch: 127]
NOTE: cannot select job for partition KEARNEY (partition KEARNEY does not support requested class linux-spool)
- A job blocked because it requested too many nodes (Note the use of checkjob -v as checkjob was not informative enough)
user@panopticon ~/src $ checkjob -v 1436
job 1436 (RM job '1436.echelon.acrl.clusters.umaine.edu')
AName: go.hostname
State: Idle
Creds: user:user group:cluster_users account:systemTest class:linux-spool
WallTime: 00:00:00 of 00:30:00
SubmitTime: Thu Sep 14 10:32:33
(Time Queued Total: 00:00:22 Eligible: -00:00:01)
Total Requested Tasks: 260
Total Requested Nodes: 130
Req[0] TaskCount: 260 Partition: ALL
Memory >= 0 Disk >= 0 Swap >= 0
Opsys: --- Arch: --- Features: myrinet
Dedicated Resources Per Task: PROCS: 1
NodeSet=ONEOF:FEATURE:RACK3:RACK1
NodeAccess: SINGLEJOB
TasksPerNode: 2 NodesRequested: 130
OutputFile: - (panopticon:/admin/home/user/src/out)
ErrorFile: - (panopticon:/admin/home/user/src/err)
User Specified PartitionMask: [base][RACK1][RACK3][KEARNEY]
System Available PartitionMask: [base][RACK1][RACK3][KEARNEY]
PartitionMask: [ALL]
Flags: RESTARTABLE
Attr: checkpoint
StartPriority: 1
PE: 260.00
Holds: Batch:PolicyViolation
NOTE: job cannot run (job has hold in place)
NOTE: cannot select job for partition RACK1 (partition RACK1 does not support requested class linux-spool)
NOTE: cannot select job for partition RACK3 (partition RACK3 has insufficient instances of requested class linux-spool configured (256 < 260))
NOTE: cannot select job for partition KEARNEY (partition KEARNEY does not support requested class linux-spool)
NOTE: job hold active - Batch
Message[0] partition KEARNEY does not support requested class linux-spool
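Most of the checkjob output describes the request itself; the lines that explain why a job is not running are the State, Holds and NOTE lines. The sketch below pulls out just those. It assumes checkjob prints one field per line, as it does in a terminal, and checkjob_problems is a hypothetical helper name rather than a Moab command.

```shell
# Keep only the lines of `checkjob` output that explain a job's status.
# Reads `checkjob` (or `checkjob -v`) output on stdin.
# Assumption: the relevant lines begin with "State:", "Holds:" or "NOTE:",
# possibly after leading whitespace, which is stripped.
checkjob_problems() {
    awk '/^[ \t]*(State|Holds|NOTE):/ { sub(/^[ \t]+/, ""); print }'
}

# Typical use on the login node:
#   checkjob -v 1436 | checkjob_problems
```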
General Information
/opt/moab/bin/mdiag
Moab Diagnose: This command can print details about every aspect of the cluster. Simply typing mdiag into the shell gives a list of commands. Below I've listed the ones I commonly use. As always, adding -v increases the verbosity of the output.
- mdiag -t
- Shows details on the partitions in the cluster.
- We currently have three partitions: RACK1 and RACK3, which are Xserve nodes, and KEARNEY, the PIIIs.
- Jobs are not allowed to span partitions. This would cause major performance issues.
user@panopticon ~/src $ mdiag -t
Partition Status
System Partition Settings: PList: base,RACK1,RACK3,KEARNEY PDef: base
Name Procs
ALL 592
base 0
RM=base
RACK1 216
RM=base
RACK3 256
RM=base
KEARNEY 120
RM=base
Partition Configured Up U/C Dedicated D/U Active A/U
Nodes ----------------------------------------------------------------------------
ALL 296 282 95.27% 1 0.35% 1 0.35%
RACK1 108 100 92.59% 0 0.00% 0 0.00%
RACK3 128 128 100.00% 1 0.78% 1 0.78%
KEARNEY 60 54 90.00% 0 0.00% 0 0.00%
Processors ----------------------------------------------------------------------------
ALL 592 564 95.27% 2 0.35% 3 0.53%
RACK1 216 200 92.59% 0 0.00% 0 0.00%
RACK3 256 256 100.00% 2 0.78% 3 1.17%
KEARNEY 120 108 90.00% 0 0.00% 0 0.00%
Memory (in MB) ----------------------------------------------------------------------------
ALL 521490 511662 98.12% 1973 0.39% 0 0.00%
RACK1 207852 203825 98.06% 0 0.00% 0 0.00%
RACK3 252844 252844 100.00% 1973 0.78% 0 0.00%
KEARNEY 60794 54993 90.46% 0 0.00% 0 0.00%
Swap (in MB) ----------------------------------------------------------------------------
ALL 645507 634771 98.34% 2833 0.45% 0 0.00%
RACK1 210662 205727 97.66% 0 0.00% 0 0.00%
RACK3 374051 374051 100.00% 2833 0.76% 0 0.00%
KEARNEY 60794 54993 90.46% 0 0.00% 0 0.00%
Disk (in MB) ----------------------------------------------------------------------------
ALL 296 282 95.27% 1 0.35% 0 0.00%
RACK1 108 100 92.59% 0 0.00% 0 0.00%
RACK3 128 128 100.00% 1 0.78% 0 0.00%
KEARNEY 60 54 90.00% 0 0.00% 0 0.00%
Classes/Queues
[<CLASS> <AVAIL>:<UP>]...
ALL [default 564:564][linux-spool 256:256][darwin-spool 200:200][linux-batch 254:256][darwin-batch 200:200][linux-admin 256:256][darwin-admin 200:200][kearney 564:564]
RACK1 [default 200:200][darwin-spool 200:200][darwin-batch 200:200][darwin-admin 200:200][kearney 200:200]
RACK3 [default 256:256][linux-spool 256:256][linux-batch 254:256][linux-admin 256:256][kearney 256:256]
KEARNEY [default 108:108][kearney 108:108]
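The Classes/Queues lines at the end of this output pack every class into one row per partition, in the form [<CLASS> <AVAIL>:<UP>]. To read off a single class, something like the following sketch may help. It assumes that bracketed layout exactly as printed above; class_avail is a hypothetical helper name.

```shell
# Print "<avail> <up>" processor counts for one class in one partition.
# Reads `mdiag -t` output on stdin.
# Usage: class_avail PARTITION CLASS
# Assumption: the class row starts with the partition name and holds
# items of the form [<CLASS> <AVAIL>:<UP>].
class_avail() {
    awk -v part="$1" -v class="$2" '
        $1 == part && index($0, "[") {
            gsub(/\[/, "\n")             # put each bracketed item on its own line
            gsub(/\]/, "")
            n = split($0, item, "\n")
            for (i = 1; i <= n; i++) {
                if (split(item[i], kv, " ") == 2 && kv[1] == class) {
                    split(kv[2], p, ":")
                    print p[1], p[2]
                    exit
                }
            }
        }'
}

# Typical use on the login node:
#   mdiag -t | class_avail RACK3 linux-batch
```

Against the output above, the RACK3 linux-batch entry is [linux-batch 254:256], so 254 of 256 processors are available to that class.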
- mdiag -c
- Displays the details of all or a certain class (queue).
- This is the replacement for qstat -q, as Torque does not control queues anymore.
- Below are the details on the queue linux-spool
- It can only run on either RACK3 or RACK1, with preference for RACK3
- Commands to run before (PROLOG) and after (EPILOG) the job are listed. You can ignore these.
- Only one job is allowed per user in this queue.
- The maximum walltime for a job in this queue is 2 days.
user@panopticon ~/src $ mdiag -c linux-spool
Class/Queue Status
ClassID Priority Flags QDef QOSList* PartitionList Target Limits
linux-spool 1 --- --- --- --- 0.00 ---
DEFAULT.NODESET=ONEOF:FEATURE:RACK3:RACK1 JOBEPILOG=/usr/local/sbin/epilogue JOBPROLOG=/usr/local/sbin/nb_ctl -j $JOBID -h
$HOSTLIST -i linux:gentoo-2.6.16.27-v3-trim MAXJOBPERUSER=1,1 MAX.WCLIMIT=2:00:00:00
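Moab writes wall-clock limits as [[DD:]HH:]MM:SS, so MAX.WCLIMIT=2:00:00:00 is the two-day limit mentioned above. To compare a limit against a job's expected runtime in seconds, a conversion along these lines can be used. This is a sketch only; wclimit_seconds is a hypothetical helper name.

```shell
# Convert a wall-clock limit of the form [[DD:]HH:]MM:SS to seconds.
# Usage: wclimit_seconds 2:00:00:00
wclimit_seconds() {
    printf '%s\n' "$1" | awk -F: '{
        # Multipliers for seconds, minutes, hours and days, counted
        # from the rightmost field.
        split("1 60 3600 86400", mult, " ")
        total = 0
        for (i = NF; i >= 1; i--) total += $i * mult[NF - i + 1]
        print total
    }'
}

# wclimit_seconds 2:00:00:00   prints 172800 (2 days)
# wclimit_seconds 00:30:00     prints 1800   (30 minutes)
```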
- mdiag -n
- Displays information on all nodes or a specific node.
- Adding a -v helps a lot, but checknode may be used as well.
- Typically only useful to a user so they can report problems. One that pops up somewhat regularly is runaway processes on nodes. For each unit of load, Moab marks one processor on the node as unusable, which can make nodes that should be free unusable.
user@panopticon ~/src $ mdiag -n node129
compute node summary
Name State Procs Memory Opsys
node129 Idle 2:2 1973:1973 linux
----- --- 2:2 1973:1973 -----
Total Nodes: 1 (Active: 0 Idle: 1 Down: 0)
user@panopticon ~/src $ mdiag -v -n node129
compute node summary
Name State Procs Memory Disk Swap Speed Opsys Arch Par Load Rsv Classes
Network Features
node129 Idle 2:2 1973:1973 1:1 2956:2956 1.00 linux ppc64 RAC 0.00 0
[default_2:2][linux-spool_2:2][linux-batch_2:2][linux-admin_2:2][kearney_2:2] [DEFAULT] myrinet,pRACK3,31,RACK3
----- --- 2:2 1973:1973 1:1 2956:2956
Total Nodes: 1 (Active: 0 Idle: 1 Down: 0)
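One way to spot the runaway-process problem described above is to look for nodes whose available processor count has dropped below the configured count, then check with showq whether a job explains it. The sketch below assumes the Procs column of the non-verbose mdiag -n output reads <available>:<configured>; procs_unavailable is a hypothetical helper name, and the result is only a hint worth reporting, not a diagnosis.

```shell
# List nodes whose available processors are fewer than configured.
# Reads non-verbose `mdiag -n` output on stdin and prints
# "<node> <state> <avail>:<configured>" for each such node.
# Assumption: column 3 (Procs) has the form <available>:<configured>.
procs_unavailable() {
    awk 'split($3, p, ":") == 2 && p[1] ~ /^[0-9]+$/ && p[2] ~ /^[0-9]+$/ &&
         p[1] + 0 < p[2] + 0 { print $1, $2, $3 }'
}

# Typical use on the login node:
#   mdiag -n | procs_unavailable
```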

