Cluster information
The primary computing platform is an instructional cluster (the
Totient cluster) with fifteen Xeon Phi 5110P boards hosted in eight
compute nodes. Each compute node has 12 cores (Intel Xeon E5-2620 v3
processors) and 32 GB of RAM. Node totient-02 has one Phi board; the
other compute nodes have two.
The head node is not as powerful as the compute nodes; please
refrain from using it to run computationally intensive jobs.
Unless otherwise stated, homework and projects should be tested and
timed on these machines.
The cluster runs Red Hat Enterprise Linux 6.9 (Santiago). The system base compilers are a bit old, though we have installed the Developer Toolset with more recent versions of the compiler as well as some utilities. For most of the work in this class, we recommend using the more recent compilers accessible via the module system.
Intel generously donated the Xeon Phi boards and funded the purchase of the host machines, which were provided with deep matching discounts by Dell.
Accounts and login
In order to obtain access to the cluster, you will need to be enrolled in the class. If you want access as an informal auditor, you will need to contact me (David Bindel) regarding access; please make sure you send your netid in any such email.
Basic login information
The cluster head node is en-cs-totient-01.coecis.cornell.edu.
It is accessible via SSH. You will need to be on the campus network
(or connected via the campus VPN) in order to access this machine.
The cluster login information is synchronized with the campus Active
Directory, so your login name is your Cornell netid and your password
is the associated password.
Passwordless login
I recommend setting up SSH key-based authentication so that you don’t have to type your password every time you log in. From your private machine (e.g. a laptop computer), you can generate an SSH key pair; see, for example, this tutorial. How to generate an SSH key and register it with an agent varies a bit depending on your operating system and SSH client; my recommendations are:
- If you are on Mac OS X, the native SSH client can use the system keychain directly. To add your private key, simply use the command
  ssh-add -K /path/of/private/key
- If you are on a Linux variant, or if you are using Cygwin under Windows, I recommend using the keychain program. If you are running GNOME, you likely have GNOME Keyring already installed on your system and do not need keychain. If you are running KDE, you likely have KWallet, a similar program. If you do not have these, you can still use keychain.
- Under Windows, PuTTY seems to be the most common SSH client (unless you are using the SSH client under Cygwin). One set of directions for setting up passwordless ssh is here. I have little personal experience with this, so feel free to put in a better link!
Once you have generated an SSH key, append the public part of the key to
~/.ssh/authorized_keys
on the head node, and restrict the file's permissions by typing
chmod 600 ~/.ssh/authorized_keys
at the command prompt on the head node.
Then you should be set to go.
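As a minimal sketch of the step above, the following function appends a public key file to authorized_keys and applies conservative permissions. The function name add_authorized_key and the file name id_rsa.pub are illustrative assumptions, not part of the cluster setup, and the sketch assumes the public key file has already been copied to the head node (for example with scp). The standard ssh-copy-id utility, run from your laptop, does much the same thing in one step.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to authorized_keys.
# Usage: add_authorized_key PUBKEY_FILE [SSH_DIR]   (SSH_DIR defaults to ~/.ssh)
add_authorized_key() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }

    # Only the owner should be able to read or write these.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # Append only if this exact key line is not already present.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

For example, after copying id_rsa.pub to your home directory on the head node, running add_authorized_key ~/id_rsa.pub there installs the key; running it a second time does not duplicate the entry.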
Public keys, agent forwarding, and GitHub
You can use SSH public keys for write access to your GitHub repository (and likewise with BitBucket, GitLab, and most other hosted solutions). GitHub has a useful five-step walkthrough for setting this up.
If you want to use a single SSH key on your laptop to also access
your repositories from the class cluster, you may want to log in with
agent forwarding. In my case, I have the following lines in my laptop
.ssh/config file to simplify logging into the cluster with agent
forwarding:
Host totient
    Hostname en-cs-totient-01.coecis.cornell.edu
    User dsb253
    ForwardAgent yes
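To check that forwarding works, it can help to ask whether an agent is reachable and holds a key, first on the laptop and then again after logging into the head node. The sketch below is one way to do that; the function name agent_status is a hypothetical helper, and it relies only on the SSH_AUTH_SOCK variable and the exit status of ssh-add -l (0 when keys are listed, 1 when the agent has no keys, 2 when the agent cannot be contacted).

```shell
#!/bin/sh
# Hypothetical helper: report whether an SSH agent is reachable and has keys.
agent_status() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "no agent socket (SSH_AUTH_SOCK is unset)"
        return 2
    fi
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, but it holds no keys"; return 1 ;;
        *) echo "cannot contact the agent (or ssh-add is missing)"; return 2 ;;
    esac
}
```

If agent_status reports loaded keys on the laptop but no agent socket on the head node, the ForwardAgent setting is probably not being applied to that connection.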
Submitting batch jobs
For simple compiling, editing, and testing, you can work on the
Totient head node. For timing runs, you will want to submit jobs to
the batch queue system. Batch jobs on the Totient cluster are managed
by the TORQUE resource manager. To run a batch job, you will need to
create a PBS script; for example, the following script runs the
hostname command on a single machine, sending an email to the given
address if anything goes wrong:
#!/bin/sh
#PBS -N hostname
#PBS -j oe
#PBS -M your_net_id@cornell.edu
hostname
To submit this, I would run qsub hostname.pbs, and I can see the status
(queued, running, exiting, completed) by running qstat. On
completion, I would have a file called hostname.o35 (or similar)
with the output of the script.
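For timing runs it is common to also request resources explicitly and to run from the submission directory. The script below is a sketch of such a variant, not a required template: the job name, the nodes=1:ppn=12 request, and the five-minute walltime are assumptions chosen for illustration, and the limits actually accepted by the Totient queue may differ. It uses standard TORQUE directives (-l, -m) and environment variables (PBS_O_WORKDIR, PBS_JOBID), and uses uname -n, which prints the same node name as hostname.

```shell
#!/bin/sh
#PBS -N hostname-timed
#PBS -j oe
# Assumption: ask for one node with 12 cores for at most five minutes.
#PBS -l nodes=1:ppn=12
#PBS -l walltime=00:05:00
# Mail on abort, begin, and end.
#PBS -m abe
#PBS -M your_net_id@cornell.edu

# TORQUE starts jobs in the home directory; move to where qsub was run.
# The fallback to "." lets the script also run outside the batch system.
cd "${PBS_O_WORKDIR:-.}" || exit 1

node=$(uname -n)
echo "job ${PBS_JOBID:-(not under TORQUE)} running on $node"
```

Because the #PBS lines are ordinary comments to the shell, the same file can be run directly on the head node as a quick check before submitting it with qsub.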
Batch scripts must be script files, but they don’t have to use the standard Bourne or BASH shells. If you prefer, you can use csh, zsh, or Python! Just change the executable mentioned in the first line. This can be useful when doing things that are more complex than running a single executable, such as running parameter studies.
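As a sketch of a small parameter study in a single batch script, the example below loops over a few problem sizes and collects one line of output per case. The function run_case is a hypothetical stand-in for your own executable, and the sizes, job name, and results file name are arbitrary choices made for illustration.

```shell
#!/bin/sh
#PBS -N sweep
#PBS -j oe

# Run from the submission directory (or the current one outside TORQUE).
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Hypothetical stand-in: replace the body with a call to your own program.
run_case() {
    n=$1
    echo "n=$n result=$((n * n))"
}

results=sweep-results.txt
: > "$results"              # start with an empty results file

for n in 100 200 400; do
    run_case "$n" >> "$results"
done
```

Keeping every case in one job makes the results easy to compare, at the cost of running the cases one after another; submitting a separate job per case is the alternative when the cases are long.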
Modules
High-performance computing installations often use modules to manage the (sometimes conflicting) software requirements for different applications and environments. We will use the Lmod modules system developed at TACC, which has somewhat better support for hierarchical modules than the classic environment module system.
Using environment modules
The basic module commands are:
- module avail: list available modules
- module list: list loaded modules
- module load foo: load module foo
- module unload foo: unload module foo
The Totient module is loaded by default. This provides access to
the most recent version of git, the vim editor, the tmux terminal
multiplexer, and a few other things.
Compilers and such
The default system GCC (4.4.7) and associated binutils are ancient. We recommend instead using either the Intel compilers or a more recent version of GCC:
- intel/15.0.3: Intel Parallel Studio XE 2015 (Intel compilers, VTune, etc.)
- gcc/7.2.0: GCC 7.2.0
Note that loading the Intel module brings in the Devtoolset version of
GCC, so you will want to load the gcc module after loading the intel
module if you are going to use both.
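The load order described above can be captured in a small shell function so that it is applied the same way every time. This is only a sketch: the name setup_compilers is hypothetical, the module names are the ones listed above, and in a non-interactive batch script the module command may not be defined unless the shell has read the usual startup files.

```shell
#!/bin/sh
# Hypothetical helper: load the Intel tools first, then the newer GCC,
# so that gcc/7.2.0 takes precedence over the Devtoolset GCC.
setup_compilers() {
    if ! command -v module >/dev/null 2>&1; then
        echo "no module command in this shell" >&2
        return 1
    fi
    module load intel/15.0.3 || return 1
    module load gcc/7.2.0 || return 1
}
```

After calling setup_compilers, running module list and gcc --version is a quick way to confirm which compiler the shell will actually use.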
Numerical libraries
We’re going to see BLAS and LAPACK early in the semester; these are available from
Once the modules are loaded, you can use pkg-config to get the
appropriate compile and link information, e.g.
# Makefile variable settings for including OpenBLAS
LDFLAGS=`pkg-config --libs openblas`
CFLAGS=`pkg-config --cflags openblas`
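If a build fails with missing headers or libraries, it can be useful to check what pkg-config sees before touching the Makefile. The following sketch prints the flags for a package or says what is missing; the function name show_flags is a hypothetical helper, and it uses only the standard --exists, --cflags, and --libs options.

```shell
#!/bin/sh
# Hypothetical helper: print compile and link flags for a pkg-config package.
show_flags() {
    pkg=$1
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not on PATH" >&2
        return 1
    fi
    if ! pkg-config --exists "$pkg"; then
        echo "pkg-config cannot find '$pkg'; is the module loaded?" >&2
        return 1
    fi
    echo "CFLAGS:  $(pkg-config --cflags "$pkg")"
    echo "LDFLAGS: $(pkg-config --libs "$pkg")"
}
```

For example, show_flags openblas should print the same flags that the Makefile settings above expand to once the corresponding module is loaded.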
Anaconda
Anaconda is a distribution for Python, and also a packaging mechanism for various other software tools. We strongly recommend you load the associated module if you are going to use Python for anything (e.g. the plotter):
- anaconda/4.4.0: Anaconda Python
In addition to giving you a fully-loaded Python distribution, loading
the anaconda module will give you access to the Clang compiler and
the Julia programming environment.