THIS PAGE CONTAINS OLD INFORMATION - HPC staff will work on updating it soon.
This page is intended as a quick introduction for new users submitting their first job to the HPRC Cluster.
In the typical workflow, the user:
- Logs into zodiac.hpc.jcu.edu.au
- Prepares a submission script for their jobs
- Submits their jobs to the Job Scheduler
- Monitors their jobs
- Collects the output of their jobs.
A few things that new users should be aware of:
- Typically, jobs are not run in an interactive manner, except when:
  - running small one-off jobs
  - evaluating the resources required for bigger jobs
  - running applications that require a GUI, such as MATLAB
- Examples of interactive jobs can be found on the pages labelled "interactive-cluster-job".
- HPRC Cluster software does not run in a window on the user's desktop, nor is it launched by clicking on it in a network drive.
- Users need to log into the cluster and tell the job scheduler about their job; the scheduler then runs it when resources become available.
The first step in using the HPRC Cluster is to log in to the login node; see Logging into the Cluster.
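As a rough illustration of the login step, the sketch below wraps the SSH command in a small helper. The host name is the one given in the workflow list above; the user name jcxxxxxxx is a placeholder, and login_zodiac is a hypothetical helper name rather than something the cluster provides.

```shell
# Placeholder account name: replace with your own JCU login.
HPC_USER="jcxxxxxxx"
# Login node named earlier on this page.
HPC_HOST="zodiac.hpc.jcu.edu.au"

# Hypothetical helper: opens an SSH session on the login node.
login_zodiac() {
    ssh "${HPC_USER}@${HPC_HOST}"
}
```

Running login_zodiac from a terminal on your own machine should then start a session on the login node, assuming SSH access to it is permitted from your network.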
The HPRC Cluster uses environment modules to manage the available software packages. This allows multiple versions of the same software to be installed without interfering with each other. To enable the environment module system, the following command needs to be executed on the command line:
-bash-4.1$ source /etc/profile.d/modules.sh
The software that is available on the HPRC Cluster is listed here: HPRC User Software. Alternatively, you can query the software available on the cluster with the following commands:
A list of available software is displayed
Version number and brief synopsis is displayed for
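The commands for these two queries are not preserved on this page. They are most likely the standard Environment Modules commands shown in this sketch, so treat module avail and module whatis as assumptions; paup is a hypothetical module name used only for illustration.

```shell
# Enable the module command if the setup script named on this page is present.
if [ -f /etc/profile.d/modules.sh ]; then
    . /etc/profile.d/modules.sh
fi

if command -v module >/dev/null 2>&1; then
    MODULES_FOUND=yes
    module avail          # list the software packages and versions available
    module whatis paup    # brief synopsis of one package (hypothetical module name)
else
    MODULES_FOUND=no
    echo "The module command is not available on this machine."
fi
```

The guard around the module calls is there only so the snippet fails gracefully when tried on a machine that is not part of the cluster.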
For a common misconception among users new to the HPRC Cluster, see HPRC Cluster Explained.
A simple way to run a job on the cluster is to create a shell script with embedded PBS directives that give the scheduler the information it needs to schedule the job.
Example: paup with the ML_analysis.nex sample file
In this example we will run paup with the ML_analysis.nex sample file provided on the paup sample nexus files page. After logging into the cluster, download the example file with the command:
-bash-4.1$ wget http://paup.csit.fsu.edu/data/ML_analysis.nex
--2014-03-11 13:08:16-- http://paup.csit.fsu.edu/data/ML_analysis.nex
Resolving paup.csit.fsu.edu... 126.96.36.199
Connecting to paup.csit.fsu.edu|188.8.131.52|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2990 (2.9K) [text/plain]
Saving to: “ML_analysis.nex”
100%[==============================================================>] 2,990 --.-K/s in 0s
2014-03-11 13:08:17 (70.7 MB/s) - “ML_analysis.nex” saved [2990/2990]
Creating the job script
Using a text editor such as vim or nano, create a shell script with the filename ML_analysis.sh. The script is made up of four sections:
- The first section is the shebang line, which must be the very first line of the script file.
- The second section contains the PBS directives. For more information on PBS directives, please see the HPRC PBS script files page.
- The third section outputs information about the job, and is only included as an example of what can be done.
- The fourth section contains the commands that are actually run in the job. In this case we are using a bash shell.
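The script contents themselves are not preserved on this page, so the following is only a sketch of what a four-section script might look like. It assumes a Torque-style PBS scheduler, a module named paup, and illustrative resource values; none of these are confirmed by this page. The heredoc simply writes the file, so the example can be pasted into a terminal instead of typed into an editor.

```shell
cat > ML_analysis.sh <<'EOF'
#!/bin/bash
# Section 2: PBS directives (assumed Torque-style syntax, illustrative values)
#PBS -N ML_analysis
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -l pmem=1gb

# Section 3: output some information about the job
echo "Job ID:          ${PBS_JOBID}"
echo "Submitted from:  ${PBS_O_WORKDIR}"
echo "Running on host: $(hostname)"
echo "Started at:      $(date)"

# Section 4: the commands that are actually run
source /etc/profile.d/modules.sh
module load paup                 # assumed module name
cd "${PBS_O_WORKDIR}"            # directory the job was submitted from
paup -n ML_analysis.nex          # -n is assumed to run PAUP* without prompts
EOF
```

The job name set with -N is chosen to match the name shown in the qstat listings on this page; the walltime and memory figures should be replaced with values suited to your own job.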
Submitting the Job - qsub
The final step is to submit the job to the job scheduler:
-bash-4.1$ qsub ML_analysis.sh
148122.jobmgr.hpc.jcu.edu.au
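Because qsub prints the job identifier on standard output, the identifier can be captured in a variable for use with later commands. This is a minimal sketch; the fallback value is the example identifier from this page and is there only so the snippet can be tried on a machine without qsub.

```shell
if command -v qsub >/dev/null 2>&1; then
    # On the cluster: submit the script and keep the identifier that qsub prints.
    JOB_ID=$(qsub ML_analysis.sh)
else
    # Off the cluster: reuse the example identifier shown on this page.
    JOB_ID="148122.jobmgr.hpc.jcu.edu.au"
fi
echo "Job identifier: ${JOB_ID}"
```

The variable can then be passed to qstat or qdel in place of a hand-typed identifier.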
Monitoring the Job - qstat
Once the job has been submitted you can monitor its progress by using the qstat command.
When you first submit your job, it is placed into the job queue and its status column contains Q, meaning the job is in the queue:
-bash-4.1$ qstat 148122.jobmgr.hpc.jcu.edu.au
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
148122.jobmgr             ML_analysis      jcxxxxxxx              0 Q normal
Once your job starts running, its status changes to R:
-bash-4.1$ qstat 148122.jobmgr.hpc.jcu.edu.au
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
148122.jobmgr             ML_analysis      jcxxxxxxx              0 R normal
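For scripts that need to wait on a job, the status letter can be pulled out of the qstat listing. The sketch below assumes the six-column layout shown above, where the status is the fifth field; job_state is a hypothetical helper name.

```shell
# Hypothetical helper: reads a qstat listing on standard input and prints
# the status letter (fifth column) of the job whose ID starts with "$1.".
job_state() {
    awk -v id="$1" 'index($1, id ".") == 1 { print $5 }'
}

# Try it on a sample row in the layout shown on this page. On the cluster the
# live listing would be piped in instead, for example:
#   qstat 148122.jobmgr.hpc.jcu.edu.au | job_state 148122
echo "148122.jobmgr ML_analysis jcxxxxxxx 0 Q normal" | job_state 148122
```

Matching on the job number followed by a dot avoids picking up the header and separator lines, and avoids confusing job 148122 with a longer number that merely begins with the same digits.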
Deleting a job - qdel
If you need to delete your job, you can use the qdel command:
-bash-4.1$ qdel 148122.jobmgr.hpc.jcu.edu.au
Your Job's Output
Different programs have different ways of outputting their data. If they output data directly to a file, then your results will be in whatever file you specified. If, however, the results are printed to standard output (as is the case for this example), then PBS captures them into a file for you:
-bash-4.1$ cat ML_analysis.o148122
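The captured file name follows a pattern that can be read off this example: the job name, then .o, then the numeric part of the job identifier. The sketch below rebuilds that name from the values shown on this page. The matching .e file for standard error is the usual PBS convention and is stated here as an assumption about this cluster.

```shell
JOB_NAME="ML_analysis"
JOB_ID="148122.jobmgr.hpc.jcu.edu.au"

# Strip everything from the first dot onward to keep only the job number.
JOB_NUMBER="${JOB_ID%%.*}"

STDOUT_FILE="${JOB_NAME}.o${JOB_NUMBER}"   # captured standard output
STDERR_FILE="${JOB_NAME}.e${JOB_NUMBER}"   # captured standard error (assumed convention)

echo "${STDOUT_FILE}"
echo "${STDERR_FILE}"
```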
It is important that the resources requested with the PBS directives in your script match the actual resource usage of your job. There can be consequences for incorrectly specifying these resource requirements:
- Walltime: your job can be killed if it exceeds the specified wall time.
- Memory: overusing memory can cause the compute node's memory to be pushed into swap space, slowing down all jobs on that node. This has also killed compute nodes in the past, destroying
- CPUs: using more cpus than requested
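One way to check a request against reality is to compare the walltime a job used with the walltime it asked for. The sketch below does that arithmetic for two HH:MM:SS values. Both values are made up for illustration, and hms_to_seconds is a hypothetical helper. On many PBS installations the real figures appear in the output of qstat -f while the job exists, though that is an assumption about this cluster.

```shell
# Hypothetical helper: converts HH:MM:SS into a number of seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

REQUESTED="01:00:00"   # illustrative walltime request
USED="00:12:30"        # illustrative walltime actually used

REQUESTED_S=$(hms_to_seconds "${REQUESTED}")
USED_S=$(hms_to_seconds "${USED}")

# Integer percentage of the request that was actually consumed.
PERCENT_USED=$(( USED_S * 100 / REQUESTED_S ))
echo "Used ${USED_S}s of ${REQUESTED_S}s (${PERCENT_USED}%)"
```

A figure close to 100% suggests the job is at risk of being killed for exceeding its walltime, while a very low figure suggests the request could be reduced for future runs.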