
From a user's perspective, use of HPRC services can be visualized as follows:

 HPC Interactive Jobs

ssh is the protocol (and command) you will use to access HPRC cluster compute resources. Unless you have a specific reason for choosing an individual node, you should connect to zodiac - over time, the names of the systems behind zodiac will almost certainly change.

The move away from a single physical login node is intended to improve HPRC login service availability. In the past, there have been multiple login node outages, mostly caused by over-consumption of memory. When an outage occurred outside HPRC staff hours, this could lead to a loss of service lasting up to 2.5 days (worst case). In the new configuration, the loss of a login node only disrupts active sessions on that node; affected users simply need to reconnect to zodiac to continue.

Login Node Etiquette:
HPRC staff have tried to set up login nodes that are easy and convenient to use. The cost of this is that login node service availability depends on user behaviour. All login node users should be aware of the following usage tips.

  • If you want to find out how much resource your job is using, use top.
  • Use the who command to determine who might be sharing a login node with you.
  • Avoid, if possible, running long jobs (>24 hours of CPU time) on login nodes.
  • Avoid, if possible, running big memory jobs (>8GB) on login nodes.
  • Only run multiprocessor jobs on login nodes while testing that parallel execution works. For users of 3rd-party software, one successful job completion marks the end of the testing phase; at that point, submit your jobs to the compute cluster (using qsub).
  • Don't run multiple jobs in parallel on login nodes, unless the jobs consume no more than a few minutes of CPU time.
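The etiquette points above can be checked with standard tools. A minimal sketch (a batch-mode alternative to interactive top; the `--sort` option assumes GNU ps, as found on typical Linux login nodes):

```shell
# Show your own processes, heaviest CPU consumers first.
ps -o pid,pcpu,pmem,etime,comm -u "$(id -un)" --sort=-pcpu | head -n 10

# See who else is sharing this login node with you.
who
```

If a process of yours has accumulated many hours in the ETIME/CPU columns or a large %MEM value, it belongs on the compute cluster, not on a login node.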
 HPC Cluster Jobs

To run jobs on the HPRC compute cluster, you will first need to log in (using ssh) to zodiac.

Jobs are run on compute or big memory nodes using the qsub command. Avoid requesting big memory nodes directly unless your job absolutely requires that resource - that is, unless at least one of the following conditions is met:

  • Your job consumes more than 24 CPU cores.
  • Your job consumes more than 60GB of memory.
  • Your job requires the extreme IOps performance of SSD.

HPRC staff may kill jobs that request big memory nodes but do not meet the above conditions when genuine contenders for this resource are waiting in the queue.
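As a sketch, a qsub job script for an ordinary (non-big-memory) job might look like the following - the job name, resource values, and program are all hypothetical, and the exact resource syntax depends on the local scheduler configuration:

```shell
# Write a hypothetical PBS job script (all names and values are illustrative).
cat > myjob.pbs <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -l nodes=1:ppn=4
#PBS -l mem=8gb
#PBS -l walltime=24:00:00

cd "$PBS_O_WORKDIR"
./my_program input.dat
EOF

# Submit it from a login node (not run here):
# qsub myjob.pbs
```

Keeping the requested cores and memory under the big-memory thresholds listed above ensures the job is scheduled on an ordinary compute node.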

 Virtual Machines

The line labels identify commands or protocols involved:

  • ssh is the protocol/command you will need to access Linux VMs (command line).
  • rdp stands for Remote Desktop Protocol.

Access to virtual resources is something that researchers need to request.
For most services, a form will need to be completed and approved by ITR management.
Virtual resources will only be provisioned for computational requirements that the HPRC cluster cannot accommodate - e.g., your software doesn't run under Red Hat Enterprise Linux 6.
HPRC staff offer a limited resource for researchers (no forms). Due to the hardware involved, no guarantees are provided for service availability on this limited resource. Failure of hardware in this environment could lead to permanent cessation of the service.
Researchers requesting such services are responsible for all licensing costs (e.g., Microsoft Windows software) associated with service provisioning.

 Storage Consumption

The graph below shows the amount of data held by JCU HPRC users (those with more than 1GB of data) on /home in May 2013.

The HSM (Hierarchical Storage Management) configuration means that very small files reside on disk only (1 copy), old files are held on tape only (2 copies), while all other files have 1 copy on disk and 2 copies on tape. The actual storage consumed is therefore at least twice what is shown in the graph, not including backups.

The top storage consumer has generated over 48TB of files, consuming more than 100TB of disk+tape media.
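To see how much of this total you account for yourself, standard disk-usage tools work on /home. A minimal sketch (the `--max-depth` and `sort -h` options assume GNU du and coreutils):

```shell
# Total size of your home directory.
du -sh "$HOME"

# Largest immediate subdirectories first, to find what is worth cleaning up.
du -h --max-depth=1 "$HOME" 2>/dev/null | sort -hr | head
```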

 File Permissions

All HPRC accounts are created with group- and world-readable permissions. If you are concerned about others viewing your research files, it is up to you to change the permissions. If you want to take the risk of enabling group-writable or world-writable access (the latter is definitely not advisable), you are free to make the required changes. HPRC staff can be contacted for support if you don't know how to do this, although HPRC staff will not make files world-writable on your behalf.
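Tightening permissions is a one-line change with chmod. A minimal sketch, using an illustrative filename:

```shell
# Create a sample file (the name is illustrative).
touch "$HOME/secret_results.txt"

# Restrict it to owner read/write only - no group or world access.
chmod 600 "$HOME/secret_results.txt"

# Confirm the new mode.
ls -l "$HOME/secret_results.txt"
```

For directories, `chmod 700` gives the same owner-only effect; `chmod -R` applies the change recursively.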

CIFS fileshares (\\\...) are provided as a convenience for people on Windows systems. At present this protocol mostly works for people on OSX systems as well. Recently, an unexpected issue with OSX clients was reported to HPRC staff. HPRC staff do not have the facilities to test the proposed fix - in particular, to ensure that it doesn't affect the majority of users - and hence the change cannot be approved.

Information about HPRC Desktop Software is available to assist with establishing the connection(s) you may require.

Links to Similar Content

HPRC Storage Explained (v0.1)
HPRC Services Availability (v1.0)
HPRC Performance Considerations (v0.1)
HPRC Cluster Job Management Explained (v0.1)
HPRC Acronyms (v0.1)
