Connecting to the remote HPC system


To connect to a remote HPC system using SSH, run the following command, then enter your SSH key passphrase and your machine password when prompted. (On ARCHER2, the machine password is a TOTP, a time-based one-time password.)

BASH

ssh -i ~/.ssh/key_for_remote_computer yourUsername@remote.computer.address

Why do we use HPC?


  • High Performance Computing (HPC) typically involves connecting to very large computing systems located elsewhere in the world.
  • These systems can perform tasks that would be impossible or much slower on smaller, personal computers.
  • We already rely on remote servers every day.

Working on a remote HPC system


  • HPC systems are large, fixed-location clusters designed for computationally intensive tasks, unlike cloud systems, which are flexible and distributed.
  • HPC systems typically provide login nodes and a set of worker nodes.
  • The resources found on independent (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
  • Files and environments are often shared across nodes, meaning users can access their data and run jobs anywhere within the cluster.
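
To see how resources differ between nodes, it can help to run the same few inspection commands on a login node and again inside a job on a worker node, then compare the output. The following is only a minimal sketch, assuming a Linux system with the standard `nproc` and `df` utilities; `free` is used only if it is installed.

```shell
# Report the name of the node this shell is running on
uname -n

# Number of CPU cores available to this process
cores=$(nproc)
echo "CPU cores available: ${cores}"

# Memory summary, if the free utility is installed
if command -v free >/dev/null 2>&1; then
    free -h
fi

# Filesystem that holds the current directory, and how full it is
df -h .
```

If files are shared across nodes as described above, the `df` line should report the same filesystem for your home directory wherever you run it.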

Working with the scheduler


  • Schedulers manage fairness and efficiency on HPC systems, deciding which user jobs run and when.
  • A job is any command or script submitted for execution.
  • The scheduler handles how compute resources are shared between users.
  • Jobs should not run on login nodes; they must be submitted to the scheduler.
  • MPI jobs require special launch commands (srun, mpirun, etc.) and explicit process counts to utilize multiple cores or nodes effectively.
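
As a minimal sketch of the points above, the following writes a small job script and submits it. It assumes the scheduler is Slurm (the text mentions `srun` but does not name a scheduler). The file name `example-job.sh`, the job name, and the resource values are illustrative placeholders; many systems also require site-specific options such as an account or partition, which are not shown here.

```shell
# Write a small job script; the #SBATCH lines request resources explicitly
cat > example-job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example-job
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=00:05:00

# srun launches one copy of the command per requested task
srun hostname
EOF

# Submit through the scheduler rather than running on the login node
if command -v sbatch >/dev/null 2>&1; then
    sbatch example-job.sh
    squeue -u "$USER"    # list your queued and running jobs
else
    echo "sbatch not found; job script written to example-job.sh"
fi
```

The process count is stated in the script (`--ntasks=4`), so `srun` knows how many copies to start without further arguments.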

Accessing software via Modules


  • HPC systems use modules to help deal with software incompatibilities, versioning and dependencies.
  • We can see what modules we currently have loaded with module list.
  • We can see what modules are available with module avail.
  • We can load a module with module load softwareName.
  • We can unload a module with module unload softwareName.
  • We can swap modules for different versions with module swap old-softwareName new-softwareName.
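
Put together, a typical session might look like the sketch below. The module names are the placeholders used above, not real packages, so substitute names reported by `module avail` on your system. The guard is only there so the snippet does nothing harmful in a shell where the `module` command is not defined.

```shell
# Placeholder module names; replace with names shown by `module avail`
old_module="old-softwareName"
new_module="new-softwareName"

if type module >/dev/null 2>&1; then
    module list                              # what is loaded now
    module avail                             # what could be loaded
    module load "$old_module"                # load one version
    module swap "$old_module" "$new_module"  # switch to another version
    module unload "$new_module"              # remove it again
    module list                              # confirm the final state
else
    echo "The module command is not available in this shell."
fi
```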

Transferring files with remote computers


  • Being able to transfer files to and from a cluster is an essential skill.
  • wget and curl -O can be used to download a file from the internet.
  • scp transfers files to and from your computer.
  • If you have a lot of data to transfer, it is good practice to archive and compress the data first.
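
The archive-then-transfer pattern can be sketched as follows. The `results` directory and its files are hypothetical stand-ins created only for illustration, and the `scp` line is commented out because the address is the placeholder used earlier in this lesson.

```shell
# Create a small stand-in data directory (hypothetical contents)
mkdir -p results
printf 'run 1\n' > results/run1.txt
printf 'run 2\n' > results/run2.txt

# Bundle the directory into a single gzip-compressed archive
tar -czf results.tar.gz results

# List the archive contents to check it before transferring
tar -tzf results.tar.gz

# Copy the single archive to your home directory on the remote system
# scp results.tar.gz yourUsername@remote.computer.address:
```

On the receiving side, `tar -xzf results.tar.gz` unpacks the archive again.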

Using resources effectively


  • Benchmarking is an essential practice for understanding your workload and using resources efficiently.
  • Efficient usage is not just about getting the time-to-solution as low as possible.
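
One rough way to look at this is to record wall-clock time alongside the number of cores used, since a run that finishes slightly sooner on many more cores can cost more in total. The following is a minimal sketch only: `sleep 1` stands in for a real workload, and `cores=4` is a hypothetical core count rather than a measured value.

```shell
cores=4              # hypothetical number of cores requested for the run

start=$(date +%s)    # wall-clock start time, in seconds
sleep 1              # stand-in for the real workload
end=$(date +%s)      # wall-clock end time, in seconds

elapsed=$((end - start))
core_seconds=$((elapsed * cores))

echo "wall time: ${elapsed}s on ${cores} cores"
echo "resource cost: ${core_seconds} core-seconds"
```

Repeating the measurement at several core counts shows where adding cores stops reducing the wall time enough to justify the extra cost.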

Using shared resources responsibly


  • Login nodes are a shared resource: be a good citizen!
  • Your data on the system is your responsibility.
  • Plan and test your large-scale work to prevent inefficient use of resources.
  • It is often best to convert many files to a single archive file before transferring.
  • Again, don’t run jobs on the login nodes.