HPRC Cluster: Quick Start User Guide

This page is a quick introduction for new users submitting their first job to the HPRC Cluster. A few things new users should be aware of:

  • Software does not run in a window on your desktop, and it is not launched by clicking on it in the network drive.
  • Typically, jobs are not run interactively.
  • You log in to the cluster and describe your job to the job scheduler, which runs it when resources become available.

 

Logging In

The first step in using the HPRC Cluster is to log in to the login node - zodiac.hpc.jcu.edu.au.


Software Packages

The HPRC Cluster uses environment modules to manage the available software packages. This allows multiple versions of the same software to be installed without interfering with each other. To enable the environment module system, execute the following command on the command line:

-bash-4.1$ source /etc/profile.d/modules.sh
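
To avoid typing this command at every login, the same step could be placed in a shell startup file. The following is only a minimal sketch, assuming a bash or POSIX shell; the path /etc/profile.d/modules.sh comes from this guide, while the helper name enable_modules and the guard logic are illustrative assumptions and not part of the HPRC setup.

```shell
# Sketch: enable environment modules only when they are not yet available.
# enable_modules is a hypothetical helper name; the default path is the one
# given in this guide.
enable_modules() {
    init_script="${1:-/etc/profile.d/modules.sh}"
    # Nothing to do if the module command is already defined.
    if type module >/dev/null 2>&1; then
        return 0
    fi
    # Source the init script if it is readable on this machine.
    # ("." is the portable spelling of "source".)
    if [ -r "$init_script" ]; then
        . "$init_script"
    else
        echo "modules init script not found: $init_script" >&2
        return 1
    fi
}

enable_modules || echo "environment modules are not available on this host"
```

On a machine without the modules package the last line simply reports that modules are unavailable instead of failing.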

 

The software available on the HPRC Cluster is listed here: HPRC User Software. Alternatively, you can query the software available on the cluster with the following commands:

Command                   Result
module avail              A list of available software is displayed
module help <software>    Version number and brief synopsis is displayed for <software>

Example "module avail" output, run on Thu Mar 6 11:19:21 EST 2014:
-bash-4.1$ module avail
--------------------------------------------------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------------
MPInside/3.5.1     compiler/gcc-4.4.5 module-cvs         modules            mpich2-x86_64      null               perfcatcher
chkfeature         dot                module-info        mpi/intel-4.0      mpt/2.05           perfboost          use.own
---------------------------------------------------------------------------------------------- /etc/modulefiles -----------------------------------------------------------
compat-openmpi-x86_64 openmpi-x86_64
------------------------------------------------------------------------------------------------- /sw/modules -------------------------------------------------------------
4ti2                      blast/2.2.23              crimap_Monsanto           hdf5                      migrate/3.6(default)      picard-tools              tmap/1.1
BEDTools                  blast/2.2.29(default)     dx                        hmmer                     mira                      proj                      tmhmm
EMBOSS                    bowtie                    elph                      ima2                      modeltest                 pvm                       topali
GMT                       bwa/0.7.4(default)        enmtools                  jags                      molphy                    r8s                       towhee
Macaulay2                 caftools                  fasta                     java                      mpich2                    rainbowcrack              towhee-openmpi
Python/2.7                cap3                      fastme                    jcusmart                  mrbayes                   rpfits                    trans-abyss
R/2.15.1(default)         carthagene/1.2.2(default) ffmpeg                    jmodeltest                mrmodeltest               ruby/1.9.3                tree-puzzle
R/3.0.0                   carthagene/1.3.beta       fftw2                     lagan                     msbayes                   ruby/2.0.0                trinityrnaseq
abyss                     casacore                  fftw3                     lamarc                    ncar                      samtools                  udunits
ariadne                   cernlib                   garli                     lapack                    netcdf                    scalapack                 udunits2
arlequin                  cfitsio                   gdal                      libyaml/0.1.4             netphos                   scipy                     velvet
asap                      chlorop                   glimmer                   matlab/2008b              numpy                     seadas/6.2                wcslib
atlas                     clipper                   glpk                      matlab/2012a              oases                     seg                       wise2
bayesass                  clustalw                  gmp                       matlab/2012b              octave                    signalp                   wwatch3
beagle                    cluster                   gnu/4.1.2                 matlab/2013a(default)     openbugs                  sprng                     yasm
beast                     cns                       gnu/4.4.0                 maxent                    openjdk                   ssaha2                    zonation
beast-1.5.4               coils                     gnuplot                   maxima                    openmpi                   stacks
bfast                     colony2                   grass                     merlin                    pari                      structure
blacs                     consel                    gromacs                   migrate/3.2.15            paup                      targetp
blas                      crimap                    hdf                       migrate/3.5.1             phyml                     tclreadline/2.1.0
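
Because the full listing is long, it can help to filter it for a package name. The sketch below assumes Environment Modules 3.x behaviour, where "module avail" writes its listing to standard error (hence the 2>&1 in the commented command); find_module is a hypothetical helper name.

```shell
# Sketch: search a module listing for a package name.
# find_module reads a listing on standard input, splits it into one
# module name per line, and prints the names that contain the pattern
# (case-insensitive).
find_module() {
    tr -s ' ' '\n' | grep -i -- "$1"
}

# On the cluster (not run here):
#   module avail 2>&1 | find_module blast
# Demonstration using two lines copied from the listing above:
printf '%s\n' \
    '4ti2                      blast/2.2.23              crimap_Monsanto' \
    'BEDTools                  blast/2.2.29(default)     dx' | find_module blast
# prints:
#   blast/2.2.23
#   blast/2.2.29(default)
```

A specific version found this way can then be passed to "module help", for example "module help blast/2.2.23".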

 

Running Jobs

For common misconceptions among users new to the HPRC Cluster, see HPRC Cluster Explained.

A simple way to run a job on the cluster is to create a shell script with embedded PBS directives, which contain the information the scheduler needs to schedule the job.

Example: paup with the ML_analysis.nex sample file

In this example we will run paup with the ML_analysis.nex sample file provided on the paup sample nexus files page. After logging into the cluster, download the example file with the command:

-bash-4.1$ wget http://paup.csit.fsu.edu/data/ML_analysis.nex
--2014-03-11 13:08:16--  http://paup.csit.fsu.edu/data/ML_analysis.nex
Resolving paup.csit.fsu.edu... 144.174.50.3
Connecting to paup.csit.fsu.edu|144.174.50.3|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2990 (2.9K) [text/plain]
Saving to: “ML_analysis.nex”

100%[=====================================================================================================================================================================>] 2,990       --.-K/s   in 0s

2014-03-11 13:08:17 (70.7 MB/s) - “ML_analysis.nex” saved [2990/2990]

 

Creating the job script

Using a text editor (examples include vim and nano), create your shell script with the filename ML_analysis.sh and the following contents (the colours are used for illustration purposes only):

#!/bin/bash

#PBS -c s
#PBS -j oe
#PBS -m ae
#PBS -N ML_analysis
#PBS -l pmem=5gb
#PBS -l walltime=500:00:00
#PBS -M your.name@jcu.edu.au

# Section 3: report information about the job (optional, shown as an example)
ncpu=$(wc -l < "$PBS_NODEFILE")   # the node file has one line per allocated CPU core
echo "------------------------------------------------------"
echo " This job is allocated $ncpu CPU cores on "
uniq "$PBS_NODEFILE"
echo "------------------------------------------------------"
echo "PBS: Submitted to $PBS_QUEUE@$PBS_O_HOST"
echo "PBS: Working directory is $PBS_O_WORKDIR"
echo "PBS: Job identifier is $PBS_JOBID"
echo "PBS: Job name is $PBS_JOBNAME"
echo "------------------------------------------------------"

# Section 4: the commands that make up the job
cd "$PBS_O_WORKDIR"
source /etc/profile.d/modules.sh
module load paup
paup -n ML_analysis.nex

Legend:

  1. The first line of the script file is the shebang line; it must be the very first line of the file.
  2. The second section contains the PBS directives. For more information on PBS directives please see the HPRC PBS script files page.
  3. The third section outputs information about the job, and is only included as an example of what can be done.
  4. The fourth section contains the commands that are actually run in the job. In this case we are using a bash shell.
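
Before submitting, it can be useful to check the script locally. The sketch below lists the PBS directives a script contains and runs bash's syntax check ("bash -n"), which parses the script without executing it. check_job_script is a hypothetical helper, not an HPRC tool, and it only catches shell syntax errors, not invalid PBS options.

```shell
# Sketch: inspect a job script before handing it to the scheduler.
# check_job_script is a hypothetical helper name.
check_job_script() {
    script="$1"
    if [ ! -r "$script" ]; then
        echo "cannot read $script" >&2
        return 1
    fi
    echo "PBS directives in $script:"
    # Directives are the lines that start with #PBS.
    grep '^#PBS' "$script"
    # bash -n parses the script without running any of its commands.
    bash -n "$script" && echo "syntax OK"
}

# Check the example script if it exists in the current directory.
if [ -f ML_analysis.sh ]; then
    check_job_script ML_analysis.sh
fi
```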

 

Submitting the Job - qsub

The final step is to submit the job to the job scheduler:

-bash-4.1$ qsub ML_analysis.sh
148122.jobmgr.hpc.jcu.edu.au

Monitoring the Job - qstat

Once the job has been submitted you can monitor its progress by using the qstat command.

When you first submit your job it is placed into the job queue, and its status column contains Q, meaning the job is in the queue:

-bash-4.1$ qstat 148122.jobmgr.hpc.jcu.edu.au
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
148122.jobmgr              ML_analysis      jcxxxxxxx             0 Q normal

 

Once your job starts running its status changes to R:

-bash-4.1$ qstat 148122.jobmgr.hpc.jcu.edu.au
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
148122.jobmgr              ML_analysis      jcxxxxx               0 R normal
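
The status letter can also be extracted in a script, for example to decide whether a job is still waiting. This is a sketch only, assuming the six-column qstat layout shown above with the status in the fifth column; job_state is a hypothetical helper name.

```shell
# Sketch: read qstat output on standard input and print the status
# letter of the first job row. The two header lines are skipped.
job_state() {
    awk 'NR > 2 && NF >= 6 { print $5; exit }'
}

# On the cluster (not run here):
#   qstat 148122.jobmgr.hpc.jcu.edu.au | job_state
# Demonstration using the output shown above:
printf '%s\n' \
    'Job ID                    Name             User            Time Use S Queue' \
    '------------------------- ---------------- --------------- -------- - -----' \
    '148122.jobmgr              ML_analysis      jcxxxxx               0 R normal' | job_state
# prints: R
```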

Deleting a job - qdel

If you need to delete your job you can use the qdel command:

-bash-4.1$ qdel 148122.jobmgr.hpc.jcu.edu.au

Your Job's Output

Different programs have different ways of outputting their data. If they output data directly to a file then your results will be in whatever file you specified. If, however, the results are printed to standard output (as is the case for this example) then PBS captures them into a file for you.

-bash-4.1$ cat  ML_analysis.o148122

------------------------------------------------------
 This job is allocated 1 CPU cores on
n025nfs
------------------------------------------------------
PBS: Submitted to normal@n029.default.domain
PBS: Working directory is /home/jcxxxxx/paup
PBS: Job identifier is 148122.jobmgr.hpc.jcu.edu.au
PBS: Job name is ML_analysis_example
------------------------------------------------------

P A U P *
Portable version 4.0b10 for Unix
Tue Mar 11 13:36:52 2014

      -----------------------------NOTICE-----------------------------
        This is a beta-test version.  Please report any crashes,
        apparent calculation errors, or other anomalous results.
        There are no restrictions on publication of results obtained
        with this version, but you should check the WWW site
        frequently for bug announcements and/or updated versions.
        See the README file on the distribution media for details.
      ----------------------------------------------------------------

Processing of file "~/ML_analysis.nex" begins...

Data read in DNA format

Data matrix has 8 taxa, 200 characters
Valid character-state symbols: ACGT
Missing data identified by '?'
"Equate" macros in effect:
   R,r ==> {AG}
   Y,y ==> {CT}
   M,m ==> {AC}
   K,k ==> {GT}
   S,s ==> {CG}
   W,w ==> {AT}
   H,h ==> {ACT}
   B,b ==> {CGT}
   V,v ==> {ACG}
   D,d ==> {AGT}
   N,n ==> {ACGT}

Neighbor-joining search settings:
  Ties (if encountered) will be broken systematically
  Distance measure = uncorrected ("p")
  (Tree is unrooted)

   Tree found by neighbor-joining method stored in tree buffer
   Time used = <1 sec (CPU time = 0.00 sec)

Neighbor-joining tree:

/--------------------------------------------- A
|
+-------------------------------------------- B
|
|               /----------------------------------------------- C
|               |
|               |         /------------------------------------------------- D
|               |         |
\---------------+      /--+    /--------------------------------------------- G
                |      |  \----+
                |      |       \------------------------------------------ H
                \------+
                       |       /------------------------------------------ E
                       \-------+
                               \----------------------------------------- F

Likelihood scores of tree(s) in memory:
  Likelihood settings:
    Number of substitution types  = 2 (HKY85 variant)
    Transition/transversion ratio estimated via ML
    Assumed nucleotide frequencies (empirical frequencies):
      A=0.35000  C=0.28813  G=0.20563  T=0.15625
    Among-site rate variation:
      Assumed proportion of invariable sites  = none
      Distribution of rates at variable sites = gamma (discrete approximation)
        Shape parameter (alpha)   = estimated
        Number of rate categories = 4
        Representation of average rate for each category = mean
    These settings correspond to the HKY85+G model
    Number of distinct data patterns under this model = 152
    Molecular clock not enforced
    Starting branch lengths obtained using Rogers-Swofford approximation method
    Branch-length optimization = one-dimensional Newton-Raphson with pass
                                 limit=20, delta=1e-06
    -ln L (unconstrained) = 936.27218

Tree                   1
------------------------
-ln L         1646.41982
Ti/tv:
  exp. ratio    4.167819
  kappa         8.796257
Shape           0.429541

Time used to compute likelihoods = 1 sec (CPU time = 0.79 sec)

Optimality criterion set to likelihood.

Heuristic search settings:
  Optimality criterion = likelihood
    Likelihood settings:
      Number of substitution types  = 2 (HKY85 variant)
      Transition/transversion ratio = 4.16782 (kappa = 8.7962568)
      Assumed nucleotide frequencies (empirical frequencies):
        A=0.35000  C=0.28813  G=0.20563  T=0.15625
      Among-site rate variation:
        Assumed proportion of invariable sites  = none
        Distribution of rates at variable sites = gamma (discrete
                                                  approximation)
          Shape parameter (alpha)   = 0.429541
          Number of rate categories = 4
          Representation of average rate for each category = mean
      These settings correspond to the HKY85+G model
      Number of distinct data patterns under this model = 152
      Molecular clock not enforced
      Starting branch lengths obtained using Rogers-Swofford approximation
        method
      Trees with approximate likelihoods 5% or further from the target score
        are rejected without additional iteration
      Branch-length optimization = one-dimensional Newton-Raphson with pass
                                   limit=20, delta=1e-06
      -ln L (unconstrained) = 936.27218
  Starting tree(s) obtained via stepwise addition
  Addition sequence: random
    Number of replicates = 5
    Starting seed = 1412047148
  Number of trees held at each step during stepwise addition = 1
  Branch-swapping algorithm: tree-bisection-reconnection (TBR)
  Steepest descent option not in effect
  Initial 'MaxTrees' setting = 100
  Branches collapsed (creating polytomies) if branch length is less than or
     equal to 1e-08
  'MulTrees' option in effect
  Topological constraints not enforced
  Trees are unrooted

Heuristic search completed
   Total number of rearrangements tried = 128
   Score of best tree(s) found = 1645.76314
   Number of trees retained = 1
   Time used = 4 sec (CPU time = 3.49 sec)

Tree-island profile:
                     First      Last                     First   Times
Island      Size      tree      tree        Score    replicate     hit
----------------------------------------------------------------------
     1         1         1         1   1645.76314            1       5

Processing of file "~/ML_analysis.nex" completed.
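
The name of the captured output file in this example, ML_analysis.o148122, combines the job name with the numeric part of the job identifier returned by qsub. The sketch below derives that name in a script; it assumes the "<job name>.o<job number>" pattern seen here, and job_output_file is a hypothetical helper name.

```shell
# Sketch: build the expected output file name from a job name and the
# full job identifier printed by qsub.
job_output_file() {
    job_name="$1"
    job_id="$2"
    # ${job_id%%.*} removes everything from the first dot onwards,
    # leaving only the job number.
    echo "${job_name}.o${job_id%%.*}"
}

job_output_file ML_analysis 148122.jobmgr.hpc.jcu.edu.au
# prints: ML_analysis.o148122
```

Because the example script uses the "-j oe" directive, standard error is joined into this same file rather than written to a separate one.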

Job Resources

It is important that the resources requested with the PBS directives in your script match the actual resource usage of your job.
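
As a rough illustration using numbers from this page: the example script requests walltime=500:00:00, while the PAUP output reports that the heuristic search used about 4 seconds. The sketch below converts an HH:MM:SS walltime into seconds so the two figures can be compared; walltime_seconds is a hypothetical helper name, and the comparison is only an illustration, not HPRC policy.

```shell
# Sketch: convert a PBS walltime of the form HH:MM:SS into seconds.
walltime_seconds() {
    # awk splits on ":" and treats each field as a decimal number,
    # so values such as 08 are handled correctly.
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

walltime_seconds 500:00:00
# prints: 1800000
```

A request of 1800000 seconds for a job that finishes in a few seconds suggests that a much shorter walltime would suit this particular example.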

 

Appendix
