    The filenames, paths, email addresses, and some values below are things you will probably need to change. In some cases, values/names have been used to demonstrate possibilities that you could employ (in a slightly different way). Apart from the -l options, no option should appear on multiple lines.

    Directive(s)

    Description of purpose

    #PBS -c n
    #PBS -c s
    #PBS -c enabled

    No checkpointing to be performed.
    Checkpointing is to be done on a job at pbs_mom shutdown.
    Checkpointing is allowed but must be explicitly invoked by a qhold or qchkpt command.

    #PBS -d /fast/jc123456

    Defines the working directory path to be used for the job.

    #PBS -j oe
    #PBS -o /tmp/output.$PBS_JOBID

    Merge standard output and standard error streams into the named file.

    #PBS -l pmem=8gb
    #PBS -l nodes=1:ppn=2
    #PBS -l walltime=24:00:00

    Request that 8GB of memory be reserved for the batch job.
    Request that 2 CPU cores on 1 host be reserved for the batch job.
    Advise the scheduler that this job will have completed within 24 hours.

    #PBS -l nodes=2 -I -X

    Request 2 CPU cores that can be used for interactive job(s).
    Note: Our 2 login nodes each provide 18 CPU cores and 64GB of memory for running interactive jobs (without qsub).

    #PBS -m ae
    #PBS -M john.doe@jcu.edu.au
    #PBS -M joe.blogg@my.jcu.edu.au

    Send mail at batch job abort/exit to the Email address provided.

    #PBS -N job_name

    Assign a name (job_name) to the batch job.

    #PBS -q normal
    #PBS -q bigmem

    Specify the queue into which your job will be placed.
    Note: The bigmem queue targets only two nodes, so long delays can be experienced before your job runs.

    #PBS -V

    Export environment variables to the batch job.

    While defaults exist for many options, HPC staff ask researchers to specify CPU core, memory, and walltime requirements as accurately as possible.
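
    As a rough aid for choosing a walltime value, the sketch below converts a measured run time in seconds into the HH:MM:SS form used by the -l walltime option (hours are not wrapped at 24). The function name walltime_from_seconds, the measured time, and the 25% safety margin are assumptions made for illustration, not HPC staff recommendations.

```shell
#!/bin/bash
# Convert a run time in seconds into the HH:MM:SS form used by "-l walltime=".
# Hours are not wrapped at 24, matching values such as 240:00:00.
walltime_from_seconds() {
    local total=$1
    local hours=$(( total / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    printf '%02d:%02d:%02d\n' "$hours" "$minutes" "$seconds"
}

# Example: a test run took 5 h 12 min; the 25% margin is an assumption.
measured=$(( 5 * 3600 + 12 * 60 ))
padded=$(( measured * 125 / 100 ))
echo "#PBS -l walltime=$(walltime_from_seconds "$padded")"
```

    The printed line can be pasted into a PBS script in place of an existing walltime directive.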

    The -W option can be used for more complicated tasks such as job dependencies, stage-in, and stage-out. Researchers may wish to consult HPC staff about use of the -W option. Running man qsub will provide more information and more options than are listed above.
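
    A job dependency might be expressed along the following lines. This is a minimal sketch, assuming the Torque depend=afterok form of the -W option; the helper depend_afterok and the script names prepare.pbs and analyse.pbs are hypothetical. When qsub is not available, the script only prints the command it would have run.

```shell
#!/bin/bash
# Build the -W argument that makes a job wait until one or more earlier
# jobs have finished successfully (afterok). Job IDs are joined with ':'.
depend_afterok() {
    local IFS=':'
    echo "depend=afterok:$*"
}

# Hypothetical script names; replace with your own.
if command -v qsub >/dev/null 2>&1; then
    first_id=$(qsub prepare.pbs)          # qsub prints the ID of the new job
    qsub -W "$(depend_afterok "$first_id")" analyse.pbs
else
    # No PBS on this machine: show the command that would be run.
    echo "qsub -W $(depend_afterok 12345.server) analyse.pbs"
fi
```

    With this arrangement the second job stays queued until the first one exits without error.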

    Users interested in protecting their job runs with checkpointing should realize that this feature comes at a cost (I/O operations). Checkpoint restart of a job (using BLCR) will not work for all job types. HPC staff advise users to test this feature on a typical job before using it on other similar jobs. Generally speaking, checkpointing will only be a real benefit to jobs that run for over a week.

    The variables listed in the table below are commonly used within a PBS script file.

    Variable

    Description

    PBS_JOBNAME

    Job name specified by the user

    PBS_O_WORKDIR

    Working directory from which the job was submitted

    PBS_O_HOME

    Home directory of user submitting the job

    PBS_O_LOGNAME

    Name of user submitting the job

    PBS_O_SHELL

    Script shell

    PBS_JOBID

    Unique PBS job id

    PBS_O_HOST

    Host from which the job was submitted (where qsub was run)

    PBS_QUEUE

    Name of the job queue

    PBS_NODEFILE

    File containing a line-delimited list of nodes allocated to the job

    PBS_O_PATH

    Path variable used to locate executables within the job script

    Note: On multi-core systems, a node (line in PBS_NODEFILE) will identify the hostname and a CPU core.
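
    Because each line of PBS_NODEFILE names one allocated core, counting repeated hostnames gives the number of cores per host. The sketch below illustrates this; the function name cores_per_host and the sample hostnames are made up, and the fallback file is only there so the script can be tried outside a PBS job.

```shell
#!/bin/bash
# Print "<cores> <hostname>" for each host listed in a PBS node file.
# Each line names one allocated core, so repeated hostnames are counted.
cores_per_host() {
    sort "$1" | uniq -c | awk '{print $1, $2}'
}

# Outside a PBS job PBS_NODEFILE is unset, so fall back to a sample file.
nodefile=${PBS_NODEFILE:-}
if [ -z "$nodefile" ]; then
    nodefile=$(mktemp)
    printf 'n001\nn001\nn002\nn001\n' > "$nodefile"   # hypothetical hostnames
fi

echo "Total cores: $(awk 'END {print NR}' "$nodefile")"
cores_per_host "$nodefile"
```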

    This example runs PAUP on the input file input.nex that resides in the current working directory. A file (here we'll name it pbsjob) is created with the contents:

    #!/bin/bash
    #PBS -c s
    #PBS -j oe
    #PBS -m ae
    #PBS -N jobname
    #PBS -l pmem=5gb
    #PBS -l walltime=500:00:00
    #PBS -M your.name@jcu.edu.au
    
    ncpu=`wc -l $PBS_NODEFILE | awk '{print $1}'`
    echo "------------------------------------------------------"
    echo " This job is allocated "$ncpu" CPU cores on "
    cat $PBS_NODEFILE | uniq
    echo "------------------------------------------------------"
    echo "PBS: Submitted to $PBS_QUEUE@$PBS_O_HOST"
    echo "PBS: Working directory is $PBS_O_WORKDIR"
    echo "PBS: Job identifier is $PBS_JOBID"
    echo "PBS: Job name is $PBS_JOBNAME"
    echo "------------------------------------------------------"
     
    cd $PBS_O_WORKDIR
    source /etc/profile.d/modules.sh
    module load paup
    paup -n input.nex
    

    To submit the job for execution on a HPRC compute node simply enter the command:

    qsub pbsjob

    If you know this job will require more than 4GB but less than 8GB of RAM, you could use the command:

    qsub -l nodes=1:ppn=2 pbsjob

    If you know this job will require more than 8GB but less than 16GB of RAM, you could use the command:

    qsub -l nodes=1:ppn=8 pbsjob

    The reason for the special cases (latter two) is to guarantee memory resources for your job. If memory on a node is overallocated, swap will be used. Job(s) that are actively using swap (disk) to simulate memory could take more than 1000 times longer to finish than a job running on dedicated memory. In most cases, this will mean your job will never finish.

    Using Job Arrays

    Users with a knowledge of shell scripting (e.g., bash) may choose to take advantage of job arrays. This feature significantly reduces load on our Torque/Maui server (compared to lots of individual job submissions). The example below (assume the file name is pbsjob) is intended only as a guide:

    #!/bin/bash
    #PBS -c s
    #PBS -j oe
    #PBS -m ae
    #PBS -N jobarray
    #PBS -M your.name@jcu.edu.au
    #PBS -l pmem=2gb
    #PBS -l walltime=9:00:00
    
    cd $PBS_O_WORKDIR
    source /etc/profile.d/modules.sh
    module load matlab
    matlab -r myjob$PBS_ARRAYID
    

    Issuing the command

    qsub -S /bin/bash -t 1-8 pbsjob

    will see 8 jobs run under one major identifier. To view the status of individual jobs in the array, use qstat -t. The above example is identical (in terms of what jobs would be executed) to the one in the "Do It Yourself" section below.

    Chances are you may need more advanced features of the scripting language than what is shown above. HPRC staff will endeavour to provide assistance with job arrays, if requested.
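
    One common pattern is to use the array index to pick an input from a list rather than embedding it in a script name. The following is a minimal sketch under that assumption; the helper nth_line and the file inputs.txt (one input name per line) are hypothetical, and the index defaults to 1 so the script can be tried outside the batch system.

```shell
#!/bin/bash
# Return line N of a text file, where N is the array index.
nth_line() {
    sed -n "${1}p" "$2"
}

# PBS_ARRAYID is set for each member of an array submitted with qsub -t;
# default to 1 so the sketch also runs outside a job.
index=${PBS_ARRAYID:-1}

# inputs.txt is a hypothetical file with one input name per line.
inputs=inputs.txt
if [ -f "$inputs" ]; then
    input=$(nth_line "$index" "$inputs")
    echo "Array member $index would process: $input"
else
    echo "Array member $index: $inputs not found"
fi
```

    Submitted with -t 1-8, each array member would then read a different line of the list.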

    Do It Yourself

    There are several legitimate reasons for wanting to run multiple single-processor jobs in parallel within a single PBS script. For example, you may want to run 8 MATLAB jobs which require a toolbox that only has 4 licensed users. Only 1 MATLAB license is checked out if all 8 jobs are run on the same system. An example PBS script to do this task would look like:

    #!/bin/bash
    #PBS -c s
    #PBS -j oe
    #PBS -m ae
    #PBS -N jobname
    #PBS -M your.name@jcu.edu.au
    #PBS -l walltime=1000:00:00
    #PBS -l nodes=1:ppn=8
    #PBS -l pmem=32gb
    
    ncpu=`wc -l $PBS_NODEFILE | awk '{print $1}'`
    echo "------------------------------------------------------"
    echo " This job is allocated "$ncpu" CPU cores on "
    cat $PBS_NODEFILE | uniq
    echo "------------------------------------------------------"
    echo "PBS: Submitted to $PBS_QUEUE@$PBS_O_HOST"
    echo "PBS: Working directory is $PBS_O_WORKDIR"
    echo "PBS: Job identifier is $PBS_JOBID"
    echo "PBS: Job name is $PBS_JOBNAME"
    echo "------------------------------------------------------"
    
    cd $PBS_O_WORKDIR
    source /etc/profile.d/modules.sh
    module load matlab
    matlab -r myjob1 &
    matlab -r myjob2 &
    matlab -r myjob3 &
    matlab -r myjob4 &
    matlab -r myjob5 &
    matlab -r myjob6 &
    matlab -r myjob7 &
    matlab -r myjob8 &
    wait    # Wait for background jobs to finish.
    

    To submit the job for execution on a HPRC compute node simply enter the command:

    qsub pbsjob

    Note: The echo commands in the PBS script example above are informational only.
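
    The repeated background commands in the script can also be written as a loop. The sketch below shows the pattern with a stand-in task so that it runs anywhere; the function names run_all and run_one are hypothetical, and inside a real job run_one might call matlab -r "myjob$1" instead of echo.

```shell
#!/bin/bash
# Start one background task per index, then wait for all of them.
# Usage: run_all <count> <command...>; the index is passed as the last argument.
run_all() {
    local count=$1
    shift
    local i
    for i in $(seq 1 "$count"); do
        "$@" "$i" &      # e.g. run_one 3
    done
    wait                 # block until every background task has exited
}

# Stand-in for the real work; inside a job this might be: matlab -r "myjob$1"
run_one() {
    echo "starting myjob$1"
}

run_all 8 run_one
```

    The order of the output lines is not guaranteed, because the tasks run concurrently.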

    The following example runs an MPI program (migrate-n-mpi) on 24 CPU cores of a single node:

    #!/bin/bash
    #PBS -V
    #PBS -m abe
    #PBS -N migrate
    #PBS -l pmem=62GB
    #PBS -l nodes=1:ppn=24
    #PBS -l walltime=240:00:00
    #PBS -M your.email@my.jcu.edu.au
    cd $PBS_O_WORKDIR
    module load openmpi
    module load migrate
    mpirun -np 24 -machinefile $PBS_NODEFILE migrate-n-mpi ...
    