...
The filenames, paths, email addresses, and some values below are things you will probably need to change. In some cases, values/names have been used to demonstrate possibilities that you could employ (in a slightly different way). Apart from the -l options, no option should appear on multiple lines.

Directive(s) | Description of purpose
---|---
#PBS -c n | No checkpointing to be performed.
#PBS -c s | Checkpointing is to be done on a job at pbs_mom shutdown.
#PBS -c enabled | Checkpointing is allowed but must be explicitly invoked by a qhold or qchkpt command.
#PBS -d /fast/jc123456 | Defines the working directory path to be used for the job.
#PBS -j oe and #PBS -o /tmp/output.$PBS_O_JOBID | Merge standard output and standard error streams into the named file.
#PBS -l pmem=4gb | Request that 4GB of memory per CPU core be reserved for the batch job.
#PBS -l nodes=1:ppn=2 | Request that 2 CPU cores on 1 host be reserved for the batch job.
#PBS -l walltime=24:00:00 | Advise the scheduler that this job will have completed within 24 hours.
#PBS -l nodes=2 -I -X | Request 2 CPU cores that can be used for interactive job(s). Note: our 2 login nodes each provide 18 CPU cores and 64GB of memory for running interactive jobs (without qsub).
#PBS -m ae with #PBS -M john.doe@jcu.edu.au or #PBS -M joe.blogg@my.jcu.edu.au | Send mail at batch job abort/exit to the email address provided.
#PBS -N job_name | Assign a name (job_name) to the batch job.
#PBS -q normal or #PBS -q bigmem | Specify the queue into which your job will be placed. Note: the bigmem queue targets two nodes only, so long delays can be experienced before your job is run.
#PBS -V | Export environment variables to the batch job.
While defaults exist for many options, HPC staff ask researchers to specify CPU core, memory, and walltime requirements as accurately as possible. A -W option can be used for more complicated tasks such as job dependencies, stage-in, and stage-out; researchers may wish to consult HPC staff regarding use of the -W options. Running "man qsub" will provide more information and more options than are covered above. Users interested in protecting their job runs with checkpointing should realise that this feature comes at a cost (I/O operations), and checkpoint restart of a job (using BLCR) will not work for all job types. HPC staff advise users to test this feature on a typical job before using it on other similar jobs. Generally speaking, checkpointing will only be of real benefit to jobs that run for over a week.
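For example (a sketch only; the script names and the job identifier below are hypothetical), a simple job dependency can be declared at submission time with -W depend, so that the second job starts only after the first completes successfully:

qsub first_step.pbs
# assume qsub reported a job identifier of 123456
qsub -W depend=afterok:123456 second_step.pbs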
The variables listed in the table below are commonly used within a PBS script file.

Variable | Description
---|---
PBS_JOBNAME | Job name specified by the user
PBS_O_WORKDIR | Working directory from which the job was submitted
PBS_O_HOME | Home directory of the user submitting the job
PBS_O_LOGNAME | Name of the user submitting the job
PBS_O_SHELL | Script shell
PBS_O_JOBID | Unique PBS job id
PBS_O_HOST | Host on which the job script is running
PBS_QUEUE | Name of the job queue
PBS_NODEFILE | File containing a line-delimited list of nodes allocated to the job
PBS_O_PATH | Path variable used to locate executables within the job script
Note: On multi-core systems, a node (a line in PBS_NODEFILE) will identify the hostname and a CPU core.
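As a minimal sketch of how these variables are typically used inside a job script (not tied to any particular application):

cd $PBS_O_WORKDIR                   # run from the directory the job was submitted from
ncpu=$(wc -l < $PBS_NODEFILE)       # PBS_NODEFILE has one line per allocated CPU core
echo "Job $PBS_JOBID ($PBS_JOBNAME) in queue $PBS_QUEUE has $ncpu CPU core(s)"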
...
The first example below runs PAUP on an input file, input.nex, that resides in the current working directory. In each case a plain-text script file (e.g., pbsjob or JobName1.pbs) is created with the contents shown and then submitted with qsub.
Important: Most software will only use 1 CPU core - requesting 8 CPU cores for a PAUP job, for example, blocks other people from using the 7 idle cores. Example 1 below is the script most users should base their own job scripts on. If in doubt, contact HPRC staff.
For more information about PBSPro, please see the PBSPro user guide. For a brief description of the PBS directives used in the examples below, see the "Brief Explanation of PBS directives used in the examples above" section immediately following the final example PBS script.
HPC staff should be able to assist researchers needing help with PBS scripts.
Singularity Example
The following PBS script requests 1 CPU core, 2GB of memory, and 24 hours of walltime, and runs R from a Singularity container:
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N JobName1
#PBS -M FIRSTNAME.LASTNAME@jcu.edu.au
#PBS -l walltime=24:00:00
#PBS -l select=1:ncpus=1:mem=2gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "Working directory is $PBS_O_WORKDIR"
module load singularity
singularity run $SING/R-4.1.1.sif R
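If the above content were saved in a file named, say, SingularityJob.pbs (the name is only an example), it is submitted in the same way as the other examples:

qsub SingularityJob.pbs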
Module Examples
Example 1: The following PBS script requests 1 CPU core, 2GB of memory, and 24 hours of walltime for running "paup -n input.nex".
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N JobName1
#PBS -M FIRSTNAME.LASTNAME@jcu.edu.au
#PBS -l walltime=24:00:00
#PBS -l select=1:ncpus=1:mem=2gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "Working directory is $PBS_O_WORKDIR"
module load paup
paup -n input.nex
If the file containing the above content is named JobName1.pbs, simply execute "qsub JobName1.pbs" to place it into the queueing system.

Example 3: The following PBS script requests 20 CPU cores, 60GB of memory, and 10 days of walltime for running an MPI job.
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N JobName3
#PBS -M FIRSTNAME.LASTNAME@my.jcu.edu.au
#PBS -l walltime=240:00:00
#PBS -l select=1:ncpus=20:mem=60gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "Working directory is $PBS_O_WORKDIR"
module load migrate
module load mpi/openmpi
mpirun -np 20 -machinefile $PBS_NODEFILE migrate-n-mpi ...
If the file containing the above content is named JobName3.pbs, simply execute "qsub JobName3.pbs" to place it into the queueing system.
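After submission, the state of your queued and running jobs can be checked with qstat; for example (the username is a placeholder):

qstat -u jc123456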
Example 2: The following PBS script requests 8 CPU cores, 32GB of memory, and 3 hours of walltime for running 8 MATLAB jobs in parallel.
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N JobName2
#PBS -M FIRSTNAME.LASTNAME@my.jcu.edu.au
#PBS -l walltime=3:00:00
#PBS -l select=1:ncpus=8:mem=32gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "PBS:Working Jobdirectory name is $PBS_JOBNAMEO_WORKDIR"
echo "------------------------------------------------------"
cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
module load paup
paup -n input.nex
|
To submit the job for execution on a HPRC compute node simply enter the command: | Card |
---|
| Using Job ArraysUsers with a knowledge of shell scripting (e.g., bash ) may choose to take advantage of job arrays. This feature significantly reduces load on our Torque/Maui server (compared to lots of individual job submissions). The example below (assume the file name is pbsjob ), will only be useful as a guide
module load matlab
matlab -r myjob1 &
matlab -r myjob2 &
matlab -r myjob3 &
matlab -r myjob4 &
matlab -r myjob5 &
matlab -r myjob6 &
matlab -r myjob7 &
matlab -r myjob8 &
wait # Wait for background jobs to finish.
If the file containing the above content is named JobName2.pbs, simply execute "qsub JobName2.pbs" to place it into the queueing system.

Example 4 (Using Job Arrays): Users with a knowledge of shell scripting (e.g., bash) may choose to take advantage of job arrays. This feature significantly reduces the load on our Torque/Maui server compared with submitting many individual jobs. If you aren't proficient with bash scripting, job arrays can be painful, so treat the script below only as a guide (assume it is saved in a file named pbsjob). Each sub-job requests 1 CPU core, 1GB of memory, and 80 minutes of walltime.
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N ArrayJob
#PBS -M FIRSTNAME.LASTNAME@jcu.edu.au
#PBS -l walltime=1:20:00
#PBS -l select=1:ncpus=1:mem=1gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
module load matlab
matlab -r myjob$PBS_ARRAYID
Issuing the command

qsub -S /bin/bash -t 1-8 pbsjob

will see 8 jobs run under one major identifier. The status of individual jobs in the array can be viewed with qstat (see "man qstat" for the array-related options). The above example is identical, in terms of what jobs would be executed, to Example 2: there are several legitimate reasons for wanting to run multiple single-processor jobs in parallel within a single PBS script instead of as an array. For example, you may want to run 8 MATLAB jobs which require a toolbox that has only 4 licensed users; only 1 MATLAB licence is checked out if all 8 jobs are run on the same system. Chances are you may need more advanced features of the scripting language than what is shown above, and HPRC staff will endeavour to provide assistance with job arrays, if requested.

If the file containing the above content is named ArrayJob.pbs and you will be running 32 sub-jobs, simply use "qsub -t 1-32 ArrayJob.pbs" to place it into the queueing system. Note: job arrays have not been extensively tested by HPC staff.
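As a sketch of one common job-array pattern (the file inputs.txt and the use of paup here are illustrative assumptions, not part of the example above), the array index can be used to select a different input file for each sub-job:

# inputs.txt lists one input filename per line
input=$(sed -n "${PBS_ARRAYID}p" inputs.txt)   # pick the line matching this sub-job's index
paup -n "$input"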
Example 5: The following script is a rework of Example 2 that uses the /fast/tmp filesystem for a hypothetical workflow that is I/O intensive. This example assumes 1 output file per job.
Note on usage of /fast/tmp: please make sure you first create, and place all files in, a folder that matches your jc number (e.g., jcXXXXXXXX).
#!/bin/bash
#PBS -j oe
#PBS -m ae
#PBS -N JobName2
#PBS -M FIRSTNAME.LASTNAME@my.jcu.edu.au
#PBS -l walltime=3:00:00
#PBS -l select=1:ncpus=8:mem=32gb
cd $PBS_O_WORKDIR
shopt -s expand_aliases
source /etc/profile.d/modules.sh
echo "Job identifier is $PBS_JOBID"
echo "Working directory is $PBS_O_WORKDIR"
mkdir -p /fast/tmp/jc012345/myjobs
cp -a myjob1.m myjob2.m myjob3.m myjob4.m myjob5.m myjob6.m myjob7.m myjob8.m /fast/tmp/jc012345/myjobs/
pushd /fast/tmp/jc012345/myjobs
module load matlab
matlab -r myjob1 &
matlab -r myjob2 &
matlab -r myjob3 &
matlab -r myjob4 &
matlab -r myjob5 &
matlab -r myjob6 &
matlab -r myjob7 &
matlab -r myjob8 &
wait # Wait for background jobs to finish.
cp -a out1.mat out2.mat out3.mat out4.mat out5.mat out6.mat out7.mat out8.mat $PBS_O_WORKDIR/
popd
rm -rf /fast/tmp/jc012345/myjobs
To submit the job for execution on an HPRC compute node, simply qsub the script file as in the earlier examples. Note: the echo commands in the PBS script example above are informational only. Consider the possibility that you may be running more than one workflow at any given time; using subdirectories (as above) is a good way of segregating workflows at the storage layer.

For reference, an older Torque-style script (using the nodes/ppn and pmem resource syntax rather than select) for a 24-core MPI migrate job looked like this:
#!/bin/bash
#PBS -V
#PBS -m abe
#PBS -N migrate
#PBS -l pmem=62GB
#PBS -l nodes=1:ppn=24
#PBS -l walltime=240:00:00
#PBS -M your.email@my.jcu.edu.au
cd $PBS_O_WORKDIR
module load openmpi
module load migrate
mpirun -np 24 -machinefile $PBS_NODEFILE migrate-n-mpi ...
Memory Requirements
A standard compute node in the JCU HPC cluster has approximately 2.5GB of memory per CPU core. The following table contains a number of examples of PBS options/directives that should be used for the given memory requirement of the job in question.
Memory Required for job | PBS Memory Request | PBS CPU Core (& Queue) Request
---|---|---
2.5GB | -l pmem=2500mb | -l nodes=1:ppn=1
5.0GB | -l pmem=5000mb | -l nodes=1:ppn=2
7.5GB | -l pmem=7500mb | -l nodes=1:ppn=3
10GB | -l pmem=10gb | -l nodes=1:ppn=4
15GB | -l pmem=15gb | -l nodes=1:ppn=6
20GB | -l pmem=20gb | -l nodes=1:ppn=8
25GB | -l pmem=25gb | -l nodes=1:ppn=10
31GB | -l pmem=31gb | -l nodes=1:ppn=12
40GB | -l pmem=40gb | -l nodes=1:ppn=16
50GB | -l pmem=50gb | -l nodes=1:ppn=20
63GB | -l pmem=63gb | -l nodes=1:ppn=24
95GB | -l pmem=95gb | -l nodes=1:ppn=18 -q bigmem
126GB | -l pmem=126gb | -l nodes=1:ppn=24 -q bigmem
190GB | -l pmem=190gb | -l nodes=1:ppn=36 -q bigmem
254GB | -l pmem=254gb | -l nodes=1:ppn=48 -q bigmem
Note that the above table contains only a discrete set of examples. Big-memory nodes have approximately 5GB of memory per CPU core; the bigmem queue should be used when you require more than 63GB of memory for your job.
Note: Multi-processor jobs may have core counts that don't match the memory request. For example, it is entirely possible that an MPI job that consumes 8 CPU cores only requires 2GB of memory.
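For instance (a hypothetical request, following the roughly 5GB-per-core rule for big-memory nodes described above), a job needing about 100GB of memory would be directed to the bigmem queue with around 20 cores:

#PBS -q bigmem
#PBS -l pmem=100gb
#PBS -l nodes=1:ppn=20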
Brief Explanation of PBS directives used in the examples above
Directive | Description of impact
---|---
#PBS -j oe | Merge STDOUT & STDERR streams into a single file.
#PBS -m ae | Send an email upon job abort/exit.
#PBS -N ... | Assign a meaningful name to the job (replace ... with 1 "word" - e.g., test_job).
#PBS -M ... | Email address that PBSPro will use to provide job information (if desired).
#PBS -l walltime=HH:MM:SS | Amount of clock time that your job is likely to require.
#PBS -l select=1:ncpus=X:mem=Ygb | Request 1 chunk of "X CPU cores" & "Y GB of RAM". The "select=1:" is not really required as it is the default. Due to the JCU cluster size, requests for more than 1 chunk should/will be rejected.
Directive(s) | Description of purpose
---|---
#PBS -d <PATH_TO_DIRECTORY> | Sets the working directory for your job to <PATH>.
#PBS -o <OUTPUT_FILE_PATH> | Explicit specification of the file that will hold the standard output stream from your job.
#PBS -V | Export environment variables to the batch job.
For full details on the directives that can be used, run "man qsub" on an HPC login node or look at the online documentation for Torque.
PBS/Torque Variables
The following variables can be useful within your PBS job script. Some are present in the examples above.
Variable | Description
---|---
PBS_JOBNAME | Job name specified by the user
PBS_O_WORKDIR | Working directory from which the job was submitted
PBS_O_HOME | Home directory of the user submitting the job
PBS_O_LOGNAME | Name of the user submitting the job
PBS_O_SHELL | Script shell
PBS_O_JOBID | Unique PBS job id
PBS_O_HOST | Host on which the job script is running
PBS_QUEUE | Name of the job queue
PBS_NODEFILE | File containing a line-delimited list of nodes allocated to the job (may be required for MPI jobs)
PBS_O_PATH | Path variable used to locate executables within the job script