The filenames, paths, email addresses, and some values below are things you will probably need to change. In some cases, values/names have been used to demonstrate possibilities that you could employ (in a slightly different way). Apart from the -l options, no option should appear on more than one line of the table.
Directive(s) | Description of purpose
---|---
-c n | No checkpointing to be performed.
-d <path> | Defines the working directory path to be used for the job.
-j oe | Merge standard output and standard error streams into the named file.
-l pmem=4gb | Request that 4GB of memory per CPU core be reserved for the batch job.
-l ncpus=2 | Request 2 CPU cores that can be used for interactive job(s).
-m ae -M your.name@jcu.edu.au | Send mail at batch job abort/exit to the email address provided.
-N jobname | Assign a name (jobname) to the batch job.
-q <queue_name> | Specify the queue into which your job will be placed.
-V | Export environment variables to the batch job.
While defaults exist for many options, HPC staff ask researchers to specify CPU core, memory, and walltime requirements as accurately as possible.
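For instance, a job needing 4 CPU cores, 4GB of memory per core, and 48 hours of walltime (illustrative values only, not a recommendation) would carry directives along these lines:

```bash
# Illustrative resource request: 4 CPU cores on one node,
# 4GB of memory per core, and a 48 hour walltime limit.
#PBS -l nodes=1:ppn=4
#PBS -l pmem=4gb
#PBS -l walltime=48:00:00
```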
A -W option can be used for more complicated tasks such as job dependencies, stage-in, and stage-out. Researchers may wish to consult HPC staff regarding use of the -W options. Running man qsub will provide more information and more options than are listed above.
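As a hedged illustration of a job dependency (the script names stage1.pbs and stage2.pbs are hypothetical), the second job below is held until the first finishes successfully:

```bash
# Submit the first job and capture the job identifier printed by qsub.
first=$(qsub stage1.pbs)

# Submit a second job that only starts once the first exits successfully.
qsub -W depend=afterok:$first stage2.pbs
```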
Users interested in protecting their job runs with checkpointing should realize that this feature comes at a cost (I/O operations). Checkpoint restart of a job (using BLCR) will not work for all job types. HPC staff advise users to test this feature on a typical job before relying on it for other similar jobs. Generally speaking, checkpointing will only be of real benefit to jobs that run for over a week.
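As a rough sketch only (exact behaviour depends on how Torque/BLCR checkpointing is configured on the cluster), periodic checkpointing could be requested with a directive along these lines:

```bash
# Ask for a checkpoint roughly every 360 minutes; the interval is illustrative.
# Each checkpoint costs I/O, so long intervals are usually preferable.
#PBS -c c=360
```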
The variables listed in the table below are commonly used within a PBS script file.
Variable | Description
---|---
PBS_JOBNAME | Job name specified by the user
PBS_O_WORKDIR | Working directory from which the job was submitted
PBS_O_HOME | Home directory of the user submitting the job
PBS_O_LOGNAME | Name of the user submitting the job
PBS_O_SHELL | Script shell
PBS_JOBID | Unique PBS job id
PBS_O_HOST | Host on which the job script is running
PBS_QUEUE | Name of the job queue
PBS_NODEFILE | File containing a line-delimited list of the nodes allocated to the job
PBS_O_PATH | Path variable used to locate executables within the job script
Note: On multi-core systems, each line (node entry) in PBS_NODEFILE identifies a hostname and corresponds to one allocated CPU core, so a hostname appears once per core allocated on that node.
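The snippet below is a sketch (usable inside your own PBS scripts) of how PBS_NODEFILE is commonly used to derive the allocated core count and the list of distinct hosts:

```bash
# One line per allocated CPU core, so the line count gives the core count.
ncpu=$(wc -l < $PBS_NODEFILE)

# Distinct hostnames allocated to the job.
hosts=$(sort -u $PBS_NODEFILE)

echo "Allocated $ncpu CPU cores on: $hosts"
```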
This example runs PAUP on the input file input.nex that resides in the current working directory. A file (here we'll name it pbsjob) is created with the contents:
```bash
#!/bin/bash
#PBS -c s
#PBS -j oe
#PBS -m ae
#PBS -N jobname
#PBS -l pmem=5gb
#PBS -l nodes=1:ppn=1
#PBS -l walltime=168:00:00
#PBS -M your.name@jcu.edu.au

ncpu=`wc -l $PBS_NODEFILE | awk '{print $1}'`
echo "------------------------------------------------------"
echo " This job is allocated "$ncpu" CPU cores on "
cat $PBS_NODEFILE | uniq
echo "------------------------------------------------------"
echo "PBS: Submitted to $PBS_QUEUE@$PBS_O_HOST"
echo "PBS: Working directory is $PBS_O_WORKDIR"
echo "PBS: Job identifier is $PBS_JOBID"
echo "PBS: Job name is $PBS_JOBNAME"
echo "------------------------------------------------------"

cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
module load paup
paup -n input.nex
```
To submit the job for execution on an HPRC compute node, simply enter the command:
qsub pbsjob
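Once submitted, the job can be watched with the standard Torque client commands (the job identifier below is a placeholder for whatever qsub printed):

```bash
# List all of your queued and running jobs.
qstat -u $USER

# Show full details for a single job (replace 123456 with your job id).
qstat -f 123456

# Delete a job that is no longer wanted.
qdel 123456
```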
Using Job Arrays
Users with a knowledge of shell scripting (e.g., bash) may choose to take advantage of job arrays. This feature significantly reduces load on our Torque/Maui server (compared to lots of individual job submissions). The example below (assume the file name is pbsjob) will only be useful as a guide.
```bash
#!/bin/bash
#PBS -c s
#PBS -j oe
#PBS -m ae
#PBS -N jobarray
#PBS -M your.name@jcu.edu.au
#PBS -l pmem=2gb
#PBS -l walltime=9:00:00

cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
module load matlab
matlab -r myjob$PBS_ARRAYID
```
Issuing the command
qsub -S /bin/bash -t 1-8 pbsjob
will see 8 jobs run under one parent job identifier. To view the status of the individual jobs in the array, use qstat -t. The above example is identical (in terms of what jobs would be executed) to the one in the "Do It Yourself" section below.
Chances are you will need more advanced features of the scripting language than those shown above. HPRC staff will endeavour to provide assistance with job arrays, if requested.
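As one sketch (the dataset file names are hypothetical), PBS_ARRAYID can be used to give each task in the array its own input file:

```bash
#!/bin/bash
#PBS -j oe
#PBS -N jobarray
#PBS -l pmem=2gb
#PBS -l walltime=9:00:00

cd $PBS_O_WORKDIR

# Hypothetical naming scheme: dataset1.nex ... dataset8.nex, one file per array task.
input=dataset${PBS_ARRAYID}.nex

source /etc/profile.d/modules.sh
module load paup
paup -n $input
```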
Do It Yourself
There are several legitimate reasons for wanting to run multiple single-processor jobs in parallel within a single PBS script. For example, you may want to run 8 MATLAB jobs which require a toolbox that only has 4 licensed users; only 1 MATLAB license is checked out if all 8 jobs are run on the same system. An example PBS script for this task would look like:
```bash
#!/bin/bash
#PBS -c s
#PBS -j oe
#PBS -m ae
#PBS -N jobname
#PBS -M your.name@jcu.edu.au
#PBS -l walltime=1000:00:00
#PBS -l nodes=1:ppn=8
#PBS -l pmem=3gb

ncpu=`wc -l $PBS_NODEFILE | awk '{print $1}'`
echo "------------------------------------------------------"
echo " This job is allocated "$ncpu" CPU cores on "
cat $PBS_NODEFILE | uniq
echo "------------------------------------------------------"
echo "PBS: Submitted to $PBS_QUEUE@$PBS_O_HOST"
echo "PBS: Working directory is $PBS_O_WORKDIR"
echo "PBS: Job identifier is $PBS_JOBID"
echo "PBS: Job name is $PBS_JOBNAME"
echo "------------------------------------------------------"

cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
module load matlab
matlab -r myjob1 &
matlab -r myjob2 &
matlab -r myjob3 &
matlab -r myjob4 &
matlab -r myjob5 &
matlab -r myjob6 &
matlab -r myjob7 &
matlab -r myjob8 &
wait  # Wait for background jobs to finish.
```
Note that the above job would be allocated 8 CPU cores and 24GB of memory (8 cores x 3GB per core).
To submit the job for execution on an HPRC compute node, simply enter the command:
qsub pbsjob
Note: The echo commands in the PBS script example above are informational only.
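The next example runs the MPI build of migrate (migrate-n-mpi) across 20 CPU cores using OpenMPI; the trailing ... stands for the program's own arguments: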
```bash
#!/bin/bash
#PBS -V
#PBS -m abe
#PBS -N migrate
#PBS -l pmem=2gb
#PBS -l nodes=1:ppn=20
#PBS -l walltime=240:00:00
#PBS -M your.email@my.jcu.edu.au

cd $PBS_O_WORKDIR
module load openmpi
module load migrate
mpirun -np 20 -machinefile $PBS_NODEFILE migrate-n-mpi ...
```
Note that the above job would be allocated 20 CPU cores and 40GB of memory (20 cores x 2GB per core).
A standard compute node in the JCU HPC cluster now has approximately 3GB of memory per configured core. The following table contains a number of examples of PBS options/directives that should be used for the given memory requirement of the job in question.
Resources Required for job | PBS Resources Request
---|---
1 CPU core, 3GB memory | -l nodes=1:ppn=1 -l pmem=3gb
1 CPU core, 8GB memory | -l nodes=1:ppn=1 -l pmem=8gb
1 CPU core, 20GB memory | -l nodes=1:ppn=1 -l pmem=20gb
2 CPU cores, 6GB memory | -l nodes=1:ppn=2 -l pmem=3gb
2 CPU cores, 10GB memory | -l nodes=1:ppn=2 -l pmem=5gb
4 CPU cores, 12GB memory | -l nodes=1:ppn=4 -l pmem=3gb
6 CPU cores, 24GB memory | -l nodes=1:ppn=6 -l pmem=4gb
12 CPU cores, 60GB memory | -l nodes=1:ppn=12 -l pmem=5gb
20 CPU cores, 60GB memory | -l nodes=1:ppn=20 -l pmem=3gb
Note that the above table only contains a small number of examples. HPC cluster compute nodes have been configured to provide no more than 20 CPU cores and 60GB of memory to users' jobs. This was done in June 2014 to try to maintain resources for critical system processes.
Big memory nodes have approximately 5.5GB of memory per CPU core configured inside Torque. The bigmem queue (requested with -q bigmem) will need to be used when your PBS job requires more than 60GB of memory.
Resources Required for job | PBS Resources Request
---|---
1 CPU core, 128GB memory | -q bigmem -l nodes=1:ppn=1 -l pmem=128gb
4 CPU cores, 96GB memory | -q bigmem -l nodes=1:ppn=4 -l pmem=24gb
12 CPU cores, 120GB memory | -q bigmem -l nodes=1:ppn=12 -l pmem=10gb
24 CPU cores, 240GB memory | -q bigmem -l nodes=1:ppn=24 -l pmem=10gb
Use mb units if you want/need a more precise memory-per-core ratio.
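For example (the figures are illustrative only), a job needing roughly 10GB spread evenly across 3 CPU cores could keep the per-core value exact by requesting memory in mb:

```bash
# 3 CPU cores with about 3413MB each, i.e. roughly 10GB in total.
#PBS -l nodes=1:ppn=3
#PBS -l pmem=3413mb
```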