Introduction to High-Performance Computing: Glossary

Key Points

Why use High Performance Computing?
  • High Performance Computing (HPC) lets you run calculations faster, or tackle larger problems, than is possible on your own system.

  • HPC relies on using parallelism to provide performance improvements.

  • HPC typically involves connecting to very large computing systems elsewhere in the world.

What is an HPC system?
  • A high-performance computer system provides more compute capability than can be packaged in a personal computer.

  • HPC systems are typically an aggregation of many computers, each of which can look quite similar to your personal computer.

  • HPC systems are usually accessed remotely, over the network.

  • HPC systems are usually shared among many users. Each user typically gets a dedicated portion of the computer’s resources for a period of time.

  • Special measures have to be taken to provide a file system that can keep up with an HPC system.

  • HPC systems often provide a lot of different software packages, and provide ways of selecting and configuring them to get the environment you need.

Connecting to the HPC system
  • To connect to a remote HPC system using SSH: ssh yourUsername@remote.computer.address

Transferring files
  • wget downloads a file from the internet.

  • sftp/scp transfer files to and from your computer.

  • You can use an SFTP client like FileZilla to transfer files through a GUI.

Scheduling jobs
  • The scheduler handles how compute resources are shared between users.

  • Everything you do should be run through the scheduler.

  • A job is much more than just a shell script: it is the workflow for all your computational work. It is what you share with collaborators, and it documents what you did to produce the computational results that make up your research outcomes, so that others can easily reproduce and validate your research.

  • If in doubt, request more resources than you will need.
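
A job script for the scheduler can be quite short. The following is a minimal sketch that assumes the Slurm scheduler, which these key points do not name; the job name and resource values are hypothetical, and other schedulers use different directives.

```shell
#!/bin/bash
#SBATCH --job-name=example-job   # hypothetical job name
#SBATCH --nodes=1                # number of nodes requested
#SBATCH --ntasks=1               # number of tasks (processes) requested
#SBATCH --time=00:05:00          # walltime request (hh:mm:ss)

# Slurm sets SLURM_JOB_ID inside a running job; fall back to a marker
# so the script can also be tried out directly with bash.
job_id="${SLURM_JOB_ID:-not-in-a-job}"

# Record where and as what the job ran; this output is part of the
# documentation of how a result was produced.
echo "Job $job_id running on $(hostname)"
```

On a Slurm system a script like this would typically be submitted with `sbatch example-job.sh` (a hypothetical file name) and monitored with `squeue -u yourUsername`.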

Accessing software
  • Load software with module load softwareName

  • The module system handles software versioning and package conflicts for you automatically.

  • Loading modules must also be done in scheduler job scripts.
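
Because a job runs in its own shell on a compute node, the job script should load the modules it needs itself. The sketch below assumes Slurm directives and uses `softwareName` as the same placeholder as above; the guard around `module` is only there so the script fails gracefully when run somewhere without a module system.

```shell
#!/bin/bash
#SBATCH --job-name=module-demo   # hypothetical job name (Slurm assumed)
#SBATCH --time=00:05:00          # walltime request (hh:mm:ss)

# Placeholder from the text; `module avail` lists the real names on a system.
software="softwareName"

if command -v module >/dev/null 2>&1; then
    module load "$software"   # load inside the job, not only at login
    module list               # record the loaded modules in the job output
    have_modules=yes
else
    echo "No 'module' command found; this sketch expects an HPC system." >&2
    have_modules=no
fi
```

Listing the loaded modules in the job output keeps a record of which software versions produced a given result.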

Using resources effectively
  • The better your resource request estimates, the better throughput you will see.
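
One way to improve an estimate is to time a short, representative test run before submitting the full job. The sketch below is only an illustration: `run_test_case` is a hypothetical stand-in for your real program, the 50% safety margin is an arbitrary choice, and a small test does not necessarily scale linearly to a full-size run.

```shell
#!/bin/bash
# Hypothetical stand-in for a short test run of your real program.
run_test_case() {
    sleep 1
}

start=$(date +%s)    # seconds since the epoch, before the run
run_test_case
end=$(date +%s)      # and after the run

elapsed=$((end - start))

# Pad the measurement with a safety margin (50% here, an arbitrary choice).
# The shell uses integer arithmetic, so add 99 before dividing to round up.
request=$(( (elapsed * 150 + 99) / 100 ))

echo "Test run: ${elapsed}s; walltime request with margin: ${request}s"
```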

Using shared resources responsibly
  • Be careful how you use the login node.

  • Your data on the system is your responsibility.

  • Plan and test large data transfers.

  • It is often best to convert many files to a single archive file before transferring.
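
Bundling many small files into one archive can be sketched with `tar` as follows. The directory and file names are made up for the illustration, and the files are generated in a temporary directory so the example can be run anywhere.

```shell
#!/bin/bash
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
mkdir "$workdir/results"

# Create 100 small files, standing in for real output data.
i=1
while [ "$i" -le 100 ]; do
    echo "sample $i" > "$workdir/results/output_$i.txt"
    i=$((i + 1))
done

# Pack the whole directory into a single compressed archive.
# -C changes into $workdir first so the archive holds relative paths.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents and count the data files as a sanity check.
n_files=$(tar -tzf "$workdir/results.tar.gz" | grep -c 'output_')
echo "Archive holds $n_files files"
```

The single `results.tar.gz` file could then be copied with one `scp` or `sftp` transfer instead of one hundred.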

How does parallel computing work?
  • There are different parallel strategies available to get performance on HPC systems.

  • The approach you use depends on the software you are using and your research problem.

  • Most HPC systems allow you to use different parallel strategies in combination.
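
As an illustration of one simple strategy, independent tasks can be run at the same time (sometimes called a task farm). The sketch below uses plain shell background jobs on a single machine; `process_chunk` is a hypothetical unit of work, and on a real HPC system this pattern would more usually be expressed through the scheduler or a parallel library.

```shell
#!/bin/bash
# Scratch directory for the outputs of the independent tasks.
outdir=$(mktemp -d)

# Hypothetical unit of work: each task writes its own output file
# and shares nothing with the other tasks.
process_chunk() {
    echo "chunk $1 processed" > "$outdir/chunk_$1.txt"
}

# Start four tasks in the background so they can run concurrently.
for id in 1 2 3 4; do
    process_chunk "$id" &
done

wait   # block until every background task has finished

n_done=$(ls "$outdir" | grep -c '^chunk_')
echo "$n_done of 4 chunks finished"
```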

Understanding what resources to use
  • Basic benchmarking allows you to use HPC resources more effectively.

  • Performance is generally a more useful metric than runtime or speedup.

  • A small number of measurements can make a large difference.
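
The metrics named above can be computed from just two timing measurements. The numbers in this sketch are hypothetical, not measured results, and "performance" is taken here to mean work done per unit time (simulation steps per second), which is one common definition.

```shell
#!/bin/bash
# Hypothetical timings, in seconds, for the same 1000-step simulation.
steps=1000
t_1core=100    # runtime on 1 core
t_4core=30     # runtime on 4 cores
cores=4

# The shell only does integer arithmetic, so use awk for the divisions.
# Speedup: how many times faster the parallel run is than the serial run.
speedup=$(awk -v a="$t_1core" -v b="$t_4core" 'BEGIN { printf "%.2f", a / b }')

# Parallel efficiency: speedup divided by the number of cores used.
efficiency=$(awk -v a="$t_1core" -v b="$t_4core" -v n="$cores" \
    'BEGIN { printf "%.2f", a / (b * n) }')

# Performance: work completed per second on 4 cores.
performance=$(awk -v w="$steps" -v t="$t_4core" 'BEGIN { printf "%.1f", w / t }')

echo "speedup=$speedup efficiency=$efficiency performance=$performance steps/s"
# prints: speedup=3.33 efficiency=0.83 performance=33.3 steps/s
```

Unlike runtime or speedup, the steps-per-second figure can be compared directly across different problem sizes and different systems.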

Future of HPC
  • Future HPC architectures will have more powerful compute nodes through more cores and/or the use of accelerators.

  • Novel memory and I/O technologies will provide new ways to use HPC systems.

Bootstrapping your use of HPC
  • Understand the next steps for you in using HPC.

  • Understand how you can access help and support to use HPC.

Glossary

FIXME