Supercomputing
Overview
What isn’t a supercomputer
A super fast computer that just runs your program faster.
What is a supercomputer
Nibi (latest SHARCNET supercomputer)
752 computers (nodes)
143,808 cores
657 terabytes (TB) of RAM (mostly 4 gigabytes (GB) per core)
25 petabytes (PB) of SSD flash
312 NVIDIA GPUs (H100 SXM)
24 AMD GPUs (CDNA 3)
400 Gbit/s Ethernet between NVIDIA GPU nodes
200 Gbit/s Ethernet between CPU and AMD GPU nodes
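As a rough sanity check on the figures above, the sketch below derives the average number of cores per node and the total memory implied by 4 GB per core. The uniform 4 GB per core is a simplifying assumption (the list says "mostly"), so the real total will be somewhat higher where nodes carry more memory.

```python
# Back-of-the-envelope check of the node, core, and memory figures listed above.
NODES = 752
CORES = 143_808
GB_PER_CORE = 4  # assumption: every core has 4 GB (the list says "mostly")

cores_per_node = CORES / NODES
ram_tb = CORES * GB_PER_CORE / 1000  # decimal units: 1 TB = 1000 GB

print(f"average cores per node: {cores_per_node:.1f}")
print(f"RAM at {GB_PER_CORE} GB/core: {ram_tb:.0f} TB")
```

The average is not a whole number because the GPU nodes and CPU nodes presumably have different core counts.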
How do you use a supercomputer
- serial
(easy) many computers means you can run many programs independently at the same time to solve many independent problems
- parallel
(hard) many computers can (sometimes) be programmed to all collaborate and solve a single problem together
- cloud
virtual computer on the internet (like Amazon’s Elastic Compute Cloud, Microsoft’s Azure cloud, or Google’s Compute Engine)
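The difference between serial and parallel use can be sketched on a single machine, with worker processes standing in for separate computers. This is only an illustration under that assumption: the function names (`solve`, `partial_sum`, `split_range`, `run_serial_farm`, `run_parallel`) are hypothetical, and on a real cluster the parallel case would typically use something like MPI rather than a process pool.

```python
from concurrent.futures import ProcessPoolExecutor


def solve(problem):
    """Stand-in for one complete program run: sum the integers below `problem`."""
    return sum(range(problem))


def partial_sum(bounds):
    """One worker's share of a larger sum of squares over [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def split_range(n, parts):
    """Cut range(n) into at most `parts` contiguous (lo, hi) chunks."""
    step = max(1, -(-n // parts))  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def run_serial_farm(problems, workers=4):
    """Serial use: many independent problems, one per worker, no communication."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve, problems))


def run_parallel(n, workers=4):
    """Parallel use: each worker computes a piece of ONE problem, then results are combined."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    print(run_serial_farm([10, 100, 1000]))
    print(run_parallel(1_000_000))
```

The serial case needs no changes to the program itself, which is why it is the easy one. The parallel case requires deciding how to split the work and how to combine the pieces, and that is only possible for some problems.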
Canada
The players
National
Digital Research Alliance of Canada
Regional
Compute Ontario, Calcul Québec, ACENET, BC DRI Group, and Prairies DRI
Ontario
Centre for Advanced Computing (CAC), HPC4Health, SciNet, and SHARCNET
Getting an account
Cost
no cost
Who
faculty, or anyone sponsored by a faculty member who has an account
How
apply for a Digital Research Alliance account
pick the desired regional consortia accounts
What software is available
Operating system
Linux (CentOS 7/Rocky 8/Alma 9)
Programming languages
C/C++, Fortran, MATLAB/Octave, Python, R, Julia, etc.
Parallel development support
pthreads, MPI, OpenMP, CUDA, OpenACC, OpenCL
Other
common open source and commercial packages (e.g., OpenFOAM, Fluent, and STAR-CCM+)
How to use
resources are scheduled to avoid collisions and ensure fair access
access from anywhere on the internet using secure shell (SSH) to enter commands and secure file transfer (SFTP) to transfer files
tell the supercomputer what program (command) you want it to run, then do something else until it has finished
Typical workflow
transfer your data and/or programs to the supercomputer using SFTP
log in to the supercomputer using SSH (this brings up a window in which you can enter commands)
enter commands to tell the scheduler what you want it to run when the required resources are available
do something else until you are notified that your commands have finished running
transfer the resulting data from the supercomputer to your computer
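The workflow above can be sketched as a short script that only builds and prints the commands, without running them. Everything specific here is an assumption for illustration: the scheduler is taken to be Slurm (hence `sbatch` and the `#SBATCH` lines), the user name, host name, e-mail address, and file names are hypothetical, and the batch-mode `sftp` upload assumes key-based authentication is already set up.

```python
import shlex

# Hypothetical values, for illustration only.
REMOTE = "alice@cluster.example.org"
EMAIL = "alice@example.org"
JOB_SCRIPT = "job.sh"


def make_job_script(command, hours=1, cores=1, mem_gb_per_core=4):
    """Return the text of a Slurm batch script that runs `command`."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --time={hours:02d}:00:00",         # wall-clock limit
        f"#SBATCH --cpus-per-task={cores}",
        f"#SBATCH --mem-per-cpu={mem_gb_per_core}G",
        "#SBATCH --mail-type=END",                   # e-mail when the job ends
        f"#SBATCH --mail-user={EMAIL}",
        command,
        "",
    ])


def workflow_steps(input_file, output_file):
    """Return (argv, stdin_text) pairs for upload, submit, and download."""
    upload = f"put {input_file}\nput {JOB_SCRIPT}\n"
    return [
        (["sftp", "-b", "-", REMOTE], upload),            # transfer data and job script
        (["ssh", REMOTE, "sbatch", JOB_SCRIPT], None),    # hand the job to the scheduler
        (["sftp", f"{REMOTE}:{output_file}", "."], None), # fetch results after the job ends
    ]


if __name__ == "__main__":
    print(make_job_script("./my_program input.dat", hours=2, cores=4))
    for argv, stdin_text in workflow_steps("input.dat", "output.dat"):
        print(shlex.join(argv))
        if stdin_text:
            print(stdin_text, end="")
```

The last step is run only after the completion notice arrives. Between submitting and downloading, nothing needs to stay connected.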