Perelman School of Medicine at the University of Pennsylvania

High Performance Computing
Penn Medicine Academic Computing Services

Quick Start Technical Guide

Important Servers

PMACS cluster head node: consign.pmacs.upenn.edu
PMACS cluster file transfer server: mercury.pmacs.upenn.edu
PMACS VPN: https://juneau.med.upenn.edu/remote/login

If you are accessing the PMACS cluster from an off-campus location, you will have to create a secure tunnel via the PMACS VPN. You should be able to log into the above VPN link with your username and the new password you created.

Connecting to the PMACS Cluster

ssh <penn_key>@consign.pmacs.upenn.edu

Replace <penn_key> above with your PennKey username; you will have to use your new PMACS password. You can also set up public keys for added security and convenience. Instructions for generating and using public keys are at the end of this document.
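
If you connect often, an entry in your local ~/.ssh/config file can save some typing. A minimal sketch (the "pmacs" host alias is just an example name, not something defined by PMACS):

Host pmacs
    HostName consign.pmacs.upenn.edu
    User <penn_key>

You can then connect with just:
$ ssh pmacs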

Setting up your profile

The LSF commands (a.k.a. "b-commands") will only work if the LSF profile file has been sourced. We recommend adding the following to your .bash_profile if it is not already there:

if [ -f /usr/share/lsf/conf/profile.lsf ]; then
        source /usr/share/lsf/conf/profile.lsf
fi
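
After re-sourcing your profile (or logging out and back in), you can confirm the LSF commands are on your PATH. A quick check, assuming the standard setup above:

$ source ~/.bash_profile   # or log out and back in
$ which bsub               # should print the path to the bsub binary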

Overview of LSF Commands

To check the various queues run bqueues:

$ bqueues

QUEUE_NAME      PRIO STATUS       MAX JL/U JL/P JL/H NJOBS PEND  RUN SUSP
normal           30  Open:Active   -   -    -    -    331    0   331   0
interactive      30  Open:Active   -   -    -    -      3    0     3   0
plus             30  Open:Active   -   -    -    -      0    0     0   0
max_mem30        30  Open:Active   -   -    -    -     66    0    66   0
max_mem64        30  Open:Active   -   -    -    -      0    0     0   0
denovo           30  Open:Active   -   -    -    -     31    0    31   0

To get detailed information about a certain queue, run:

$ bqueues -l normal

QUEUE: normal
 -- Queue for normal workload taking less than 3GBytes of memory. Jobs that allocate more than 4GBytes of memory will be killed in this queue.  This is the default queue.

Parameters/Statistics

PRIO NICE STATUS      MAX JL/U JL/P JL/H NJOBS PEND  RUN SSUSP USUSP RSV
 30   20  Open:Active  -   -    -    -    330    0   330    0     0    0

Interval for a host to accept two jobs is 0 seconds

Scheduling Parameters

  r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -

Scheduling Policies: NO_INTERACTIVE

Users: all
Hosts: compute/
RES_REQ: rusage[mem=3000]

To get information on the physical compute hosts that are a part of this cluster:
$ bhosts

Or, if you know the name of a specific node:
$ bhosts node001.hpc.local

HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
node001.hpc.local ok - 32 0 0 0 0 0

The above output shows that the node has a maximum of 32 available CPU slots and no jobs currently running on it.

The output of bhosts below shows 27 jobs assigned and currently running on this node.

$ bhosts node048.hpc.local

HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
node048.hpc.local ok - 32 27 27 0 0 0

The output below shows that the node is closed, since the number of jobs running on the node equals its maximum CPU slot allotment.

$ bhosts node025.hpc.local

HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
node025.hpc.local closed - 32 32 32 0 0 0

To run a job in batch mode:
$ bsub <script_name>

Example:
$ bsub sh sleep.sh
Job <9990021> is submitted to default queue <normal>.
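
Options for bsub can also be embedded in the script itself as #BSUB directives and submitted with input redirection. A minimal sketch (the job name, output file names, and queue below are just illustrative choices):

#!/bin/bash
#BSUB -J sleep_test             # job name (example)
#BSUB -o sleep_test.%J.out      # standard output file; %J expands to the job ID
#BSUB -e sleep_test.%J.err      # standard error file
#BSUB -q normal                 # queue to submit to

sleep 60

Submit it with:
$ bsub < sleep_test.sh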

For interactive sessions:
$ bsub -Is bash
Job <9990022> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node062.hpc.local>>
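
You can also request resources for an interactive session explicitly, for example a larger memory reservation using the same rusage[] syntax shown in the queue's RES_REQ above. A sketch (the 6000 figure is only illustrative; on this cluster rusage[mem=...] appears to be expressed in MB):

$ bsub -Is -q interactive -R "rusage[mem=6000]" bash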

Checking the status of running jobs:
$ bjobs -u <your_username>

Example:
$ bjobs -u asrini

JOB ID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
9990022 asrini RUN interactiv consign.hpc node062.hpc bash  Jan 14 15:38
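
For full details on a single job (submit options, execution host, resource usage), you can pass the job ID to bjobs -l; for example, using the job ID from above:

$ bjobs -l 9990022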

Checking status of finished jobs:
$ bjobs -d -u <your_username>

Example:
$ bjobs -d -u asrini

JOB ID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
9990020 asrini DONE normal consign.hpc node010.hpc sleep 2 Jan 14 15:34
9990021 asrini DONE normal consign.hpc node010.hpc * sleep.sh Jan 14 15:35
9990022 asrini DONE interactiv consign.hpc node062.hpc bash Jan 14 15:38

Historical information about your jobs can be found by running:
$ bhist -d -u <your_username> 

Example output:
$ bhist -d -u asrini

Summary of time in seconds spent in various states:

JOB ID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
9990019 asrini bash 1 0 36 0 0 0 37
9990020 asrini sleep 2 2 0 2 0 0 0 4
9990021 asrini * sleep.sh 2 0 25 0 0 0 27
9990022 asrini bash 0 0 395 0 0 0 395
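
For an event-by-event history of a single job, bhist also accepts a job ID with the -l flag; for example, using a job ID from above:

$ bhist -l 9990021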

Parallel Environment

To run a parallel job, include the -n flag with the bsub command described above.

For example, to run an interactive job with 16 CPUs:
$ bsub -n 16 -Is bash
Job <9990023> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
$ bjobs -u asrini

JOB ID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
9990023 asrini RUN interactiv consign.hpc node063.hpc bash  Jan 14 15:50

Similarly, to run a batch job with 16 CPUs:
$ bsub -n 16 <my_parallel_job> 
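
If your program is multi-threaded rather than MPI-based, you generally want all of its slots on a single host. LSF's span[] resource requirement string can express that; a sketch (standard LSF syntax, but check with the PMACS admins for site-specific policies):

$ bsub -n 16 -R "span[hosts=1]" <my_parallel_job>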

Environment Modules

User-loadable modules are available if the system default packages don't meet your requirements. The "module avail" command must be run from an interactive session:
[asrini@consign ~]$ bsub -Is bash
Job <9990024> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>  

[asrini@node063 ~]$ module avail

------------------------------ /usr/share/Modules/modulefiles -------------------------------
NAMD-2.9-Linux-x86_64-multicore dot                             module-info                     picard-1.96                     rum-2.0.5_05
STAR-2.3.0e                     java-sdk-1.6.0                  modules                         pkg-config-path                 samtools-0.1.19
STAR-hg19                       java-sdk-1.7.0                  mpich2-x86_64                   python-2.7.5                    use.own
STAR-mm9                        ld-library-path                 null                            r-libs-user
bowtie2-2.1.0                   manpath                         openmpi-1.5.4-x86_64            ruby-1.8.7-p374
devtoolset-2                    module-cvs                      perl5lib                        ruby-1.9.3-p448
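
To use one of the listed packages, load its module in your interactive session or batch script. A short sketch using a module name from the listing above:

[asrini@node063 ~]$ module load python-2.7.5      # make the package available in this session
[asrini@node063 ~]$ module list                   # show currently loaded modules
[asrini@node063 ~]$ module unload python-2.7.5    # remove it when finished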

 

Instructions for generating Public-Private keypairs

On Mac OS X and GNU/Linux systems, run the following command from within a terminal and follow the on-screen instructions:

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key ($HOME/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in $HOME/.ssh/id_rsa.
Your public key has been saved in $HOME/.ssh/id_rsa.pub.
The key fingerprint is:
xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx asrini@
The key's randomart image is:
+--[ RSA 2048]----+
|          .      |
|        kjweo    |
|         x B E x |
|          * B l +|
|        S +aser .|
|           +  +  |
|          .  weq |
|           . x 12|
|            45+  |
+-----------------+

On Windows machines you can generate and use public keys with PuTTY. Here is a link to a YouTube channel that has video tutorials for generating and using public keys.

After generating a public-private keypair, copy the contents of your local .ssh/id_rsa.pub file into a file named .ssh/authorized_keys in your home directory on the PMACS cluster.

[$USER@consign ~]$ vim .ssh/authorized_keys

One SSH public key per line; save and close the file.

Then change the permissions on the file:
[$USER@consign ~]$ chmod 600 .ssh/authorized_keys
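
Alternatively, if your local machine has the ssh-copy-id utility (most GNU/Linux distributions and recent macOS versions ship it), it can append your public key to the remote authorized_keys file in one step. A sketch, run from your local machine:

$ ssh-copy-id <penn_key>@consign.pmacs.upenn.edu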