Hardware

Below is a summary of the Planck cluster hardware. Interprocessor communication is over a Mellanox 4X Infiniband interconnect. There are three 24-port IB switches, each with 6 connections to each of the other two switches and 11 connections to compute nodes. The I/O subsystem uses a separate IB network between the nodes. This network also has three 24-port IB switches, and is additionally connected to 4 gateway nodes, which have Fibre Channel connections to the NERSC Global Filesystem (NGF).

Planck Cluster Hardware Specifications
Number of compute nodes            32
Processor cores per node           8
Number of compute processor cores  256
Processor core type                Opteron 2350 2.0GHz Quad Core
Physical memory per compute node   32 GB
Number of login nodes              1
Communication interconnect         Mellanox 4x Infiniband
File system                        Separate 4x Infiniband network connected to NGF through 4 gateway nodes
Batch system                       Torque/Maui

User Environment

Your default shell on planck.nersc.gov is set the same way as on other NERSC machines: log in to NIM and change the default shell. Note that the machine is called "PDSF" in the NIM interface. See this page for more instructions. The files .bash_profile, .bashrc, .cshrc, .kshenv, .login, .profile, and .tcshrc are links to read-only files and should not be deleted. Make all individual customizations (aliases, environment variables, etc.) in the files named .bashrc.ext, .cshrc.ext, .kshenv.ext, .login.ext, .profile.ext, and .tcshrc.ext. These .ext files are sourced by the corresponding dot-files.
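As an illustration of the kind of customization that belongs in an .ext file, here is a minimal sketch of a .bashrc.ext for bash users. The particular variable and alias are illustrative assumptions, not site defaults:

```shell
# Example ~/.bashrc.ext (sourced by the read-only ~/.bashrc).
# The values below are illustrative only; substitute your own.
export EDITOR=vi        # an environment variable
alias ll='ls -l'        # an alias
```

Users of csh or tcsh would put the equivalent setenv and alias commands in .cshrc.ext or .tcshrc.ext instead.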

Each user has a relatively small home directory. The primary disk space for all software development and data processing is the NERSC Global Filesystem (NGF), which is mounted in the usual place at /project/projectdirs. The home directories and /project are visible from all the compute nodes.

Software

Using and running software on planck.nersc.gov is very similar to other NERSC machines. Here we try to summarize the differences.

The Planck cluster at NERSC has several different "programming environments", which consist of different serial and MPI compiler toolchains. The Pathscale environment uses the Pathscale serial compilers, a version of MVAPICH2 built on top of these compilers, and a Pathscale-specific version of ACML for accelerated math, BLAS/LAPACK, and FFT functionality. The GNU environment uses the gcc (4.3.x) serial compilers and compatible versions of MVAPICH2 and ACML. Before loading the cmb module, you must first decide which programming environment you want to use. The default is the GNU environment:

%> module avail PrgEnv
  PrgEnv/gnu-1.0(default) PrgEnv/pathscale-1.0

%> module load PrgEnv

The Pathscale programming environment is fairly minimal, and should only be used by people wishing to test specific things. In our testing, the GNU compilers actually create faster-running code on this platform. The GNU environment also provides many other useful software packages not found in the Pathscale environment.

After loading the desired programming environment, you can simply load the cmb module:

%> module load cmb

This will load the cmb module set that has been compiled using the previously loaded programming environment.

Using the PBS Scheduler

When working on the compute nodes (both interactively and with batch jobs) there are a number of options that control the environment in which your applications run. You should actively think about what software you are running, what you are trying to do with the software, how much memory you need, etc. Then you can tailor your environment to the requirements of the task at hand.

Common PBS Options/Directives
Option:       -l nodes=N:ppn=P,pvmem=Mgb
Default:      nodes=1:ppn=1,pvmem=4gb
Description:  Use P processors per node across N nodes, with M gigabytes of
              memory per processor. Your job will die if you request more
              than a total of 32GB per node. Valid entries for this option
              would be:
                nodes=X:ppn=8,pvmem=4gb
                nodes=X:ppn=4,pvmem=8gb
                nodes=X:ppn=2,pvmem=16gb
                nodes=X:ppn=1,pvmem=32gb

Option:       -l walltime=HH:MM:SS
Default:      Maximum for queue (see table)
Description:  Limit the job wall clock time to HH hours, MM minutes, and SS
              seconds.

Option:       -e filename
Default:      <script_name>.e<job_id>
Description:  Write STDERR to filename.

Option:       -o filename
Default:      <script_name>.o<job_id>
Description:  Write STDOUT to filename.

Option:       -j [eo|oe]
Default:      Do not merge.
Description:  Merge STDOUT and STDERR. If eo, merge as standard error; if oe,
              merge as standard output.

Option:       -m [a|b|e|n]
Default:      n
Description:  E-mail notification options:
                a = send mail when job aborted by system
                b = send mail when job begins
                e = send mail when job ends
                n = do not send mail
              Options a, b, e may be combined.

Option:       -N job_name
Default:      Job script name.
Description:  Job name: up to 15 printable, non-whitespace characters.

Option:       -q queue
Default:      batch
Description:  See the batch queues table below.

Option:       -S shell
Default:      Login shell
Description:  Specify shell as the scripting language to use.

Option:       -V
Default:      Do not import.
Description:  Export the current environment variables into the batch job
              environment.
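
The per-node memory limit on the nodes/ppn/pvmem request can be checked before submitting. The following is a minimal sketch of such a check. The function name check_pvmem is hypothetical and is not part of the Planck software; the 32GB figure is the physical memory per compute node from the hardware table.

```shell
# Hypothetical helper (not provided on planck): check that
# ppn * pvmem fits within the 32GB of memory on one compute node.
# Usage: check_pvmem PPN PVMEM_GB
check_pvmem() {
    ppn=$1
    pvmem_gb=$2
    node_mem_gb=32   # physical memory per compute node
    total=$((ppn * pvmem_gb))
    if [ "$total" -le "$node_mem_gb" ]; then
        echo "ok: nodes=X:ppn=${ppn},pvmem=${pvmem_gb}gb uses ${total}GB per node"
    else
        echo "too large: ${total}GB per node exceeds ${node_mem_gb}GB" >&2
        return 1
    fi
}
```

For example, "check_pvmem 4 8" succeeds because 4 processes at 8GB each total exactly 32GB, while "check_pvmem 8 8" fails because it would need 64GB per node.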

Interactive Work

You should ALWAYS use the compute nodes for any tasks which are CPU or memory intensive. The login node should only be used for light work such as editing files, compiling software, etc. To use the compute nodes interactively (i.e. launch a shell on those nodes), use the "-I" option to qsub.

Example

To run IDL (which is a serial program) on one of the compute nodes and use all 32GB of memory, one would do:

%> module load cmb
%> qsub -I -V -l nodes=1:ppn=1,pvmem=32gb
   (note the "-V" option to propagate the cmb module environment
    to my new shell on the compute node)
%> cd $PBS_O_WORKDIR
   (change back to the directory where I launched qsub)
%> idl

If you use the qsub command above frequently and find it tedious to type, I suggest creating a shell alias for it.
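
A minimal sketch of such an alias for bash users follows. The alias name qint is an arbitrary, hypothetical choice; the qsub options are exactly those used in the example above. It would go in .bashrc.ext:

```shell
# Hypothetical alias (the name "qint" is illustrative): request an
# interactive shell on one compute node with all 32GB of memory.
alias qint='qsub -I -V -l nodes=1:ppn=1,pvmem=32gb'
```

After starting a new login shell, typing "qint" is then equivalent to typing the full qsub command.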

Example

To run kst (serial program) on one of the compute nodes and use all 32GB of memory, one would do:

%> module load cmb kst
%> qsub -I -V -l nodes=1:ppn=1,pvmem=32gb
   (note the "-V" option to propagate the cmb module environment
    to my new shell on the compute node)
%> cd $PBS_O_WORKDIR
   (change back to the directory where I launched qsub)
%> kstclean
   (the kstclean alias pipes garbage errors to /dev/null)

Batch Jobs

Running jobs on planck.nersc.gov is very similar to running jobs on jacquard.nersc.gov. The only difference is that planck has 8 processor cores per node and 32 available nodes. In both the GNU and Pathscale compiler environments, you should use the "mpiexec" command in your PBS script to launch jobs. You can also use the "-V" PBS keyword to propagate your shell environment (including which modules are loaded) to the compute nodes. You can submit batch jobs to one of the three available queues (interactive, debug, batch), and each job will run with the priority and limits in the table below.

Submit Queue  Exec Queue   Nodes  Max Wallclock  Max Jobs per User    Relative Priority
interactive   interactive  1-4    1 hour         Currently Unlimited  1
debug         debug        1-8    1 hour         Currently Unlimited  2
batch         batch16      1-16   48 hours       Currently Unlimited  4
batch         batch32      17-32  24 hours       Currently Unlimited  3
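
The table suggests that a job submitted to the batch queue lands in batch16 or batch32 depending on the number of nodes requested. As a rough sketch of that mapping, assuming the routing depends only on node count (the table does not state this explicitly), a hypothetical helper might look like:

```shell
# Hypothetical helper (not provided on planck): report the execution
# queue and wallclock limit for a "batch" submission of N nodes,
# using the node ranges from the queue table.
batch_exec_queue() {
    n=$1
    if [ "$n" -ge 1 ] && [ "$n" -le 16 ]; then
        echo "batch16 (max wallclock 48 hours)"
    elif [ "$n" -ge 17 ] && [ "$n" -le 32 ]; then
        echo "batch32 (max wallclock 24 hours)"
    else
        echo "invalid node count: $n (planck has 32 compute nodes)" >&2
        return 1
    fi
}
```

This is mainly useful as a reminder that jobs using more than 16 nodes are limited to 24 hours rather than 48.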

Example

Here is a MADmap example. After loading the desired PrgEnv module and the cmb module, the MADmap executable should be in your $PATH (you can verify this by typing "which MADmap"). To run MADmap on 64 cores (8 nodes with 8 cores each), one could submit the following script to run a job in the standard batch queue:

#PBS -S /bin/bash
#PBS -l nodes=8:ppn=8,pvmem=4gb
#PBS -l walltime=1:00:00
#PBS -N madmap_job
#PBS -q batch
#PBS -o madmap.log
#PBS -j oe
#PBS -V

cd $PBS_O_WORKDIR

mpiexec MADmap -r runconfig.xml -l

Example

Here is an example using mpiBatch to run 8 instances of IDL in batch mode on 4 nodes (2 processes per node), with each process accessing 16GB of memory. The first step is to create a text file for each process containing a list of IDL commands (not a program definition). Something like this:

%> cat task1.idl
print,'IDL running task1'

Now we make a text file containing the commands to run on each process. In this case, we are having IDL execute the batch file for each of the 8 tasks:

%> cat idl_tasks
idl -e @task1.idl
idl -e @task2.idl
idl -e @task3.idl
idl -e @task4.idl
idl -e @task5.idl
idl -e @task6.idl
idl -e @task7.idl
idl -e @task8.idl
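
For larger numbers of tasks, a task list like the one above can be generated rather than typed by hand. The following is a minimal sketch; the function name gen_idl_tasks is hypothetical, and it assumes the batch files are named task1.idl, task2.idl, and so on, as in this example:

```shell
# Hypothetical helper: print one "idl -e @taskN.idl" line per task,
# for N = 1 .. ntasks.
gen_idl_tasks() {
    ntasks=$1
    i=1
    while [ "$i" -le "$ntasks" ]; do
        echo "idl -e @task${i}.idl"
        i=$((i + 1))
    done
}

# Usage (writes the 8-line task list shown above):
#   gen_idl_tasks 8 > idl_tasks
```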

And finally we create the necessary PBS script which calls mpiBatch with this task list:

#PBS -S /bin/bash
#PBS -l nodes=4:ppn=2,pvmem=16gb
#PBS -l walltime=1:00:00
#PBS -N idl_job
#PBS -q batch
#PBS -o idl_job.log
#PBS -j oe
#PBS -V

cd $PBS_O_WORKDIR

mpiexec mpiBatch idl_tasks