Basic setup

The cluster is a Linux-based Rocks cluster for exclusive use by the class. There are six nodes including the head node. Each node consists of two quad-core Intel Xeon E5504 chips for a total of eight cores per node. Jobs are submitted using the Sun Grid Engine (SGE). Each node will only run one job at a time, so be polite and don't hog the machine with very long runs!

Logging in

The cluster head node is crocus.csuglab.cornell.edu. It should be accessible from any on-campus IP address (or you can reach it if you're logged into the campus VPN). Log into crocus using ssh with the account name and password that were sent to you via Dropbox (your account name is your netid). On your first login, you will be prompted for passphrases for ssh keys; just hit enter, as these keys are used purely for communication between nodes inside the cluster. Once you've done this, you should be delivered to a Unix prompt. Welcome to the head node!
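
For example, if your netid were abc123 (a placeholder), the login command from a campus machine would look like:

    ssh abc123@crocus.csuglab.cornell.edu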

After your initial login, you should probably change your password to something you can remember. Alternatively, you may want to set up password-less ssh authentication between your machine and the cluster; the details depend on which ssh client you use. On Linux, I suggest looking into keychain. On OS X 10.5 onward, ssh-agent runs for you automatically, so you can simply add keys at the command line using ssh-add and then not worry about it. Under Windows, you may want to look into PuTTY, which apparently has support for ssh keys; see this tutorial, for example, keeping in mind as you read it that I don't typically use Windows myself.
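
For the OpenSSH clients on Linux and OS X, a minimal sketch of the key setup looks like this (abc123 is a placeholder netid, and the key file locations are the OpenSSH defaults):

    # On your own machine: generate a key pair (choose a passphrase you like)
    ssh-keygen -t rsa

    # Copy the public key to the cluster so it is accepted for logins
    ssh-copy-id abc123@crocus.csuglab.cornell.edu

    # Load the key into ssh-agent so you only type the passphrase once per session
    ssh-add

If ssh-copy-id is not available on your machine, you can instead append the contents of ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on the cluster by hand.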

Directory setup

You have access to two types of storage on the cluster. Your home directory is hosted on the head node and mounted via NFS on all the other nodes, so you can read or write files in your home directory on one node and (eventually) see the changes from the others. However, NFS uses a lot of bandwidth, and it is easy to swamp the server. For big files, use /state/partition1, a user-accessible local partition that exists on each node (on the head node, this is where the home directories live, but the head node is a special case).
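
As a sketch of the intended pattern (the per-user subdirectory and the file names are placeholders, not something the cluster sets up for you):

    # On a compute node: make yourself a scratch directory on the local disk
    mkdir -p /state/partition1/$USER

    # Stage big input files there and work on the local copy
    cp ~/bigdata.dat /state/partition1/$USER/

    # Copy results you want to keep back to your home directory when you are done
    cp /state/partition1/$USER/results.dat ~/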

Software is provided in the usual locations (/usr/bin and /usr/local/bin), but there are also common installations in /share/apps/local. In particular, this is where GCC 4.4 (with gfortran) and ATLAS are installed. This directory is not on the default path, so you will either need to edit your path or type fully-qualified command names to use these compilers.
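
For example, assuming the tools live under /share/apps/local/bin (the exact layout is an assumption; check with ls /share/apps/local), you could add a line like the following to your ~/.bashrc, or simply call the compilers by their full paths:

    # Put the shared installations ahead of the system versions in your path
    export PATH=/share/apps/local/bin:$PATH

    # Or, without touching your path, invoke a compiler directly
    /share/apps/local/bin/gcc -O3 -o hello hello.c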

Hardware

Compute nodes on the cluster have two quad-core Intel Xeon E5504 chips running at 2.0 GHz. This is the "Gainestown" family, fabricated in a 45 nm process and based on the Nehalem architecture. For more details on the processor type, try cat /proc/cpuinfo (followed by Googling!). There are some slides on this architecture from another class that you can read to find out more.

The nominal peak per core is 8 GFlop/s, if one starts two SSE instructions per cycle (each of which can handle two double-precision floating point operations). There are 16 GB of physical RAM per node. Each core has a 4-way associative 32 KB L1 cache and an unshared 8-way 256 KB L2 cache. There is also a shared (within a processor) 16-way 4 MB L3 cache.
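
For reference, the arithmetic behind that peak figure is:

    2 flops/SSE instruction x 2 SSE instructions/cycle x 2.0 GHz = 8 GFlop/s per core
    8 GFlop/s per core x 8 cores = 64 GFlop/s per node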

Queueing

The command to submit jobs to the queue is qsub; try qsub -help or man qsub at the command line to see the basic documentation. Running qsub scriptname will schedule scriptname to be run on one of the compute nodes. scriptname is usually a shell script; in addition to the normal shell operations, one can use comment lines starting with #$ to set execution options (these options can also be set via the command line).

Some good options to know are -cwd (run the job from the directory where it was submitted), -N (give the job a name), -o and -e (redirect standard output and error), and -j y (merge the error stream into the output); several of these appear in the sketch below.
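
Here is a minimal submission script sketch; the job name, output file, and ./my_program are placeholders for your own:

    #!/bin/bash
    #$ -cwd            # run from the directory where qsub was invoked
    #$ -N myjob        # name the job "myjob" in queue listings
    #$ -o myjob.out    # send standard output to myjob.out
    #$ -j y            # merge standard error into the output file

    ./my_program

Submit it with qsub as described above, and SGE will print the id it assigns to the job.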

Use qstat to see the current status of your jobs, and use qdel to delete jobs (particularly if they appear to be runaway!).
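
For example (the job id 42 below is a placeholder; use the id that qsub or qstat reports):

    # See the status of jobs in the queue (-u $USER restricts the list to your own)
    qstat -u $USER

    # Remove job 42 from the queue, or kill it if it is already running
    qdel 42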