Using

Required Software

Clusters are accessed via the secure shell programs

ssh

used to run commands (secure shell)

scp

used to copy files (secure copy)

sftp

alternative way to copy files (secure file transfer protocol)

from anywhere on the internet using the address CLUSTER.alliancecan.ca (capitalized terms indicate you should substitute something appropriate).

Windows

The most common options are GUI programs

  • mobaxterm is a good freeware GUI choice that provides everything in one package including a graphical sftp, X11 server, and local shell

  • PuTTY is a basic open-source ssh-only option

Windows 10 has native versions of the command-line tools too

MacOS X and Linux

Command-line versions are built in and available in the shell

  • search for and start a terminal application

Accessing

Logging in

ssh USER@CLUSTER.alliancecan.ca

log in to CLUSTER as USER and start a remote shell session

  • for GUI programs (e.g., mobaxterm, PuTTY), enter USER in the username field and CLUSTER.alliancecan.ca in the host name field, then connect

Copying files

scp USER@CLUSTER.alliancecan.ca:REMOTE LOCAL

log in to CLUSTER as USER and copy REMOTE file to LOCAL file

scp LOCAL USER@CLUSTER.alliancecan.ca:REMOTE

log in to CLUSTER as USER and copy LOCAL file to REMOTE file

  • for GUI programs (e.g., mobaxterm), enter USER and CLUSTER, connect, and then drag and drop files to transfer
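As an illustration of the two scp forms above, the following sketch only prints the commands it builds, so it can be tried without an account. The user name jdoe and the file names results.txt and input.dat are hypothetical, and CLUSTER is left as a placeholder to substitute. Remote paths without a leading slash are taken relative to the remote home directory.

```shell
#!/bin/bash
# Hypothetical values for illustration; substitute your own.
user="jdoe"
host="CLUSTER.alliancecan.ca"   # replace CLUSTER with a real cluster name

# Remote -> local: fetch results.txt from the remote home directory.
download="scp ${user}@${host}:results.txt ./results.txt"

# Local -> remote: send input.dat to the remote home directory.
upload="scp ./input.dat ${user}@${host}:input.dat"

# Print rather than execute, so nothing is transferred by this sketch.
echo "$download"
echo "$upload"
```

Once the printed commands look right, they can be typed directly at the prompt.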

Shell

Commands, entered interactively or run from a file, tell the computer what to do

  • a bit of an initial learning curve, but

  • much easier to automate, document, and share

Basic command

PROGRAM ARGUMENTS...
  • arguments are usually options (switches) followed by strings (e.g., file names, etc.)

  • short options are a single dash followed by a letter for each switch (e.g., rm -fr mydir)

  • long options are a double dash followed by a descriptive string (e.g., rm --force --recursive mydir)
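The short and long option forms can be compared side by side. This is a minimal sketch that assumes the GNU version of rm found on Linux (other systems may only accept the short form); it works in a throwaway directory created by mktemp, and the directory names short and long are made up for the demonstration.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing important is touched.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/short/sub" "$demo_dir/long/sub"

# Short options: one dash, one letter per switch (letters can be combined).
rm -fr "$demo_dir/short"

# Long options: two dashes and a descriptive name, same effect as above.
rm --force --recursive "$demo_dir/long"

ls -A "$demo_dir"   # prints nothing: both directories are gone
rmdir "$demo_dir"   # rmdir only succeeds on an empty directory
```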

Help

help COMMAND

show help for builtin command (like “cd”)

man COMMAND

show the manual page for command (‘q’ to quit)

Getting around

ls

list files

pwd

show which directory you are in

cd DIRECTORY

change to the directory named DIRECTORY (use “..” for the parent directory)
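A short session tying the three commands together might look like the following sketch. The directory and file names (data, a.txt, b.txt) are hypothetical and are created in a temporary location first.

```shell
#!/bin/bash
# Build a small throwaway tree to move around in.
top="$(mktemp -d)"
mkdir "$top/data"
touch "$top/data/a.txt" "$top/data/b.txt"

cd "$top/data"      # change into the data directory
pwd                 # show where we are now
ls                  # list the files here: a.txt b.txt

cd ..               # go up to the parent directory
here="$(pwd -P)"    # -P resolves any symbolic links in the path
ls                  # lists: data

cd /                # step out before cleaning up
rm -r "$top"
```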

Copying/renaming/moving/removing files and directories

cp SRC DEST

copy file/directory (add -r for directory)

mv SRC DEST

move/rename file/directory

rm PATH

remove file (add -r for directory)
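The commands above can be tried safely in a scratch area. The sketch below uses made-up file names (notes.txt, results, and so on) inside a temporary directory and removes everything at the end.

```shell
#!/bin/bash
work="$(mktemp -d)"
cd "$work"

echo "hello" > notes.txt
mkdir results

cp notes.txt copy.txt          # copy a file
cp -r results results_backup   # copy a directory (needs -r)
mv copy.txt renamed.txt        # rename a file
mv renamed.txt results/        # move a file into a directory
rm notes.txt                   # remove a file
rm -r results_backup           # remove a directory (needs -r)

left="$(ls)"                   # what remains at the top level
moved="$(ls results)"          # what ended up inside results
echo "top level: $left"
echo "in results: $moved"

cd /
rm -r "$work"
```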

Editing files

nano PATH

edit the file named PATH

Digital Research Alliance of Canada software (scientific packages)

module load CcEnv StdEnv

enable the software stack on VDI systems (Cc for Compute Canada for historical reasons)

module list

show what software is loaded in the environment

module avail

show what software you can load

module spider PACKAGE

search for package

module load PACKAGE

load package

module unload PACKAGE

unload the package

Software via Nix (utilities)

module load nix

enable personal Nix software environment

nix search PACKAGE

search for PACKAGE

nix run ATTRIBUTES ...

start sub-shell with ATTRIBUTES available

nix-env -q

list software installed in personal environment

nix-env -iA ATTRIBUTE

install ATTRIBUTE into personal environment

nix-env -e NAME

remove NAME from personal environment

nix-env --rollback

undo change to personal environment

Running/submitting, viewing, and killing jobs

salloc

start interactive job

sbatch SCRIPT

queue batch job

squeue

list jobs

scancel JOBID

kill job

Filesystems

Storage Folder     Type    Quota (size)  Quota (files)  Surge (size)  Surge (files)  Surge (days)
/home/USER         User    50GB          500K           70GB          700K           60
/scratch/USER      User    1,000GB       1,000K         20,000GB      1,000K         60
/project/PROJECT   Group   1,000GB       500K           NA            NA             NA

  • quick-access links named scratch, project (default project), and projects (all projects) are under home

Jobs

Check documentation page for desired software on docs.alliancecan.ca

  • program-specific instructions

  • licensing details for commercial software

All significant computations (> few minutes) run as SLURM jobs

  • only way to access the 1,000+ compute nodes

  • initial interactive type-in-commands test runs with salloc

  • final non-interactive put-commands-in-file runs with sbatch

Batch files for sbatch have the general format

#!/bin/bash
#SBATCH OPTION1
#SBATCH OPTION2
...

COMMANDS TO RUN
...
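To make the general format concrete, the sketch below writes a small job script to a temporary file and checks it for shell syntax errors without submitting it. The resource values (10 minutes, 1G per CPU) are placeholders chosen for illustration, and def-SPONSOR must be replaced with a real account name; the options themselves are the ones listed under Basic options.

```shell
#!/bin/bash
# Write a small job script to a temporary file (hypothetical resource values).
job_script="$(mktemp)"
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --account=def-SPONSOR
#SBATCH --time=0-00:10:00
#SBATCH --mem-per-cpu=1G

echo "Running on $(hostname)"
EOF

# Check the script for shell syntax errors without running it.
syntax_ok=no
bash -n "$job_script" && syntax_ok=yes
echo "syntax ok: $syntax_ok"

# On a cluster, the job would then be queued with:
#   sbatch "$job_script"
cat "$job_script"
```

The #SBATCH lines are ordinary comments to the shell, which is why the script still passes a plain syntax check; only sbatch interprets them.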

Job options

Specify the resources and accounting required to run a job

  • as options to salloc and sbatch commands, and/or

  • in start of job files using special #SBATCH ... lines

Basic options

--account def-SPONSOR

account to allocate usage to

--time D-HH:MM:SS

maximum runtime before the job is killed (add a bit of buffer)

--mem-per-cpu N

memory per CPU (suffix with K, M, G, or T for units)

Multi-threaded jobs

  • limited to maximum number of CPUs in one computer (generally 32)

  • may need export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK in job script

--cpus-per-task N

number of threads per process (threaded)
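A threaded job script might start along the following lines. This is a sketch only: the value 4 and the program name my_threaded_program are hypothetical, and the fallback to 1 is an added convenience so the same script also runs outside a job, where SLURM_CPUS_PER_TASK is not set.

```shell
#!/bin/bash
#SBATCH --cpus-per-task=4
# (other #SBATCH lines such as --account and --time omitted for brevity)

# Use the CPU count SLURM granted; fall back to 1 when run outside a job.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using $OMP_NUM_THREADS thread(s)"

# ./my_threaded_program   # hypothetical program name; uncomment to run
```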

Multi-processor jobs (MPI)

  • need to start the program with the mpirun or srun commands

--ntasks N

number of processes (MPI)

GPU jobs

--gpus-per-task TYPE:N

number of GPUs per process

Breakpoints

partition_stats

show partitioning of compute nodes eligible for jobs

Exceeding these resources decreases the number of machines that will run a job

  • runtime > 3, 12, 24, 72, 168, and 672hrs

  • memory/cpu > 12MB
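The runtime breakpoints can be read as buckets: a request stays in the smallest bucket whose limit it does not exceed. The helper below is a hypothetical illustration of that reading (it is not a cluster tool), takes whole hours only, and uses the limits listed above.

```shell
#!/bin/bash
# Print the smallest runtime breakpoint (in hours) that a request fits under.
runtime_bucket() {
    local hours="$1"
    local limit
    for limit in 3 12 24 72 168 672; do
        if [ "$hours" -le "$limit" ]; then
            echo "$limit"
            return 0
        fi
    done
    echo "over"   # longer than the largest breakpoint listed
    return 1
}

runtime_bucket 2     # prints 3
runtime_bucket 3     # prints 3
runtime_bucket 4     # prints 12
runtime_bucket 100   # prints 168
```

Requesting 3 hours instead of 4, for example, keeps a job in the first bucket, where the most machines are eligible to run it.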