Quick Start Technical Guide
Important Servers
PMACS cluster head node: consign.pmacs.upenn.edu
PMACS cluster file transfer server: mercury.pmacs.upenn.edu
PMACS VPN: https://juneau.med.upenn.edu/remote/login
If you are accessing the PMACS cluster from an off-campus location, you will have to create a secure tunnel via the PMACS VPN. You should be able to log into the above VPN link with your username and the new password you created.
Connecting to the PMACS Cluster
ssh <penn_key>@consign.pmacs.upenn.edu
Replace <penn_key> above with your PennKey; you will have to use your new PMACS password. You can also set up public keys for added security and convenience. Instructions for generating and using public keys are at the end of this document.
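If you connect frequently, an entry in your local ~/.ssh/config saves retyping the full host name. A minimal sketch (the alias pmacs is hypothetical, chosen for this example; keep the <penn_key> placeholder as your own PennKey):

```
# ~/.ssh/config (on your local machine)
Host pmacs
    HostName consign.pmacs.upenn.edu
    User <penn_key>
```

With this in place, `ssh pmacs` is equivalent to the full command above.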
Setting up your profile
The LSF commands (a.k.a. "b-commands") will only work if the LSF profile file is sourced. We recommend adding the following to your .bash_profile if it is not already there:
if [ -f /usr/share/lsf/conf/profile.lsf ]; then
source /usr/share/lsf/conf/profile.lsf
fi
Overview of LSF Commands
To check the various queues run bqueues:
$ bqueues
QUEUE_NAME | PRIO | STATUS | MAX | JL/U | JL/P | JL/H | NJOBS | PEND | RUN | SUSP |
---|---|---|---|---|---|---|---|---|---|---|
normal | 30 | Open:Active | - | - | - | - | 331 | 0 | 331 | 0 |
interactive | 30 | Open:Active | - | - | - | - | 3 | 0 | 3 | 0 |
plus | 30 | Open:Active | - | - | - | - | 0 | 0 | 0 | 0 |
max_mem30 | 30 | Open:Active | - | - | - | - | 66 | 0 | 66 | 0 |
max_mem64 | 30 | Open:Active | - | - | - | - | 0 | 0 | 0 | 0 |
denovo | 30 | Open:Active | - | - | - | - | 31 | 0 | 31 | 0 |
To get detailed information about a certain queue, run:
$ bqueues -l normal
QUEUE: normal
-- Queue for normal workload taking less than 3GBytes of memory. Jobs that allocate more than 4GBytes of memory will be killed in this queue. This is the default queue.
Parameters/Statistics
PRIO | NICE | STATUS | MAX | JL/U | JL/P | JL/H | NJOBS | PEND | RUN | SSUSP | USUSP | RSV |
---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 20 | Open:Active | - | - | - | - | 330 | 0 | 330 | 0 | 0 | 0 |
Interval for a host to accept two jobs is 0 seconds
Scheduling Parameters
 | r15s | r1m | r15m | ut | pg | io | ls | it | tmp | swp | mem |
---|---|---|---|---|---|---|---|---|---|---|---|
loadSched | - | - | - | - | - | - | - | - | - | - | - |
loadStop | - | - | - | - | - | - | - | - | - | - | - |
Scheduling Policies: NO_INTERACTIVE
Users: all
Hosts: compute/
RES_REQ: rusage[mem=3000]
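The RES_REQ line shows that the normal queue reserves 3000 MB per job. A job that needs more memory should target one of the larger-memory queues from the bqueues listing with an explicit resource request. A hedged sketch (the queue name comes from the bqueues output above; the 16000 MB value and the script name my_big_job.sh are hypothetical):

```shell
# Submit to a larger-memory queue and ask LSF to reserve ~16 GB
# for the job (values are illustrative, not recommendations).
bsub -q max_mem30 -R "rusage[mem=16000]" sh my_big_job.sh
```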
To get information on the physical compute hosts that are a part of this cluster:
$ bhosts
Or, if you know the name of a specific node:
$ bhosts node001.hpc.local
HOST_NAME | STATUS | JL/U | MAX | NJOBS | RUN | SSUSP | USUSP | RSV |
---|---|---|---|---|---|---|---|---|
node001.hpc.local | ok | - | 32 | 0 | 0 | 0 | 0 | 0 |
The above output shows a maximum of 32 available CPU slots on the node and no jobs currently running on it.
The output of bhosts below shows a node with 27 jobs assigned and currently running:
$ bhosts node048.hpc.local
HOST_NAME | STATUS | JL/U | MAX | NJOBS | RUN | SSUSP | USUSP | RSV |
---|---|---|---|---|---|---|---|---|
node048.hpc.local | ok | - | 32 | 27 | 27 | 0 | 0 | 0 |
The output below shows a closed node: the number of jobs running on it equals its maximum CPU slot allotment.
$ bhosts node025.hpc.local
HOST_NAME | STATUS | JL/U | MAX | NJOBS | RUN | SSUSP | USUSP | RSV |
---|---|---|---|---|---|---|---|---|
node025.hpc.local | closed | - | 32 | 32 | 32 | 0 | 0 | 0 |
To run a job in batch mode:
$ bsub <script_name>
Example:
$ bsub sh sleep.sh
Job <9990021> is submitted to default queue <normal>.
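The sleep.sh script submitted above is not shown in this guide; a minimal stand-in (hypothetical contents) would be:

```shell
#!/bin/bash
# sleep.sh - hypothetical contents of the job script submitted above:
# pause briefly, then report which node the job ran on.
sleep 2
msg="sleep.sh finished on $(hostname)"
echo "$msg"
```

Any executable script works the same way; bsub simply runs it on whichever node the scheduler picks.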
For interactive sessions:
$ bsub -Is bash
Job <9990022> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node062.hpc.local>>
Checking the status of running jobs:
$ bjobs -u <your_username>
Example:
$ bjobs -u asrini
JOB ID | USER | STAT | QUEUE | FROM_HOST | EXEC_HOST | JOB_NAME | SUBMIT_TIME |
---|---|---|---|---|---|---|---|
9990022 | asrini | RUN | interactiv | consign.hpc | node062.hpc | bash | Jan 14 15:38 |
Checking status of finished jobs:
$ bjobs -d -u <your_username>
Example:
$ bjobs -d -u asrini
JOB ID | USER | STAT | QUEUE | FROM_HOST | EXEC_HOST | JOB_NAME | SUBMIT_TIME |
---|---|---|---|---|---|---|---|
9990020 | asrini | DONE | normal | consign.hpc | node010.hpc | sleep 2 | Jan 14 15:34 |
9990021 | asrini | DONE | normal | consign.hpc | node010.hpc | * sleep.sh | Jan 14 15:35 |
9990022 | asrini | DONE | interactiv | consign.hpc | node062.hpc | bash | Jan 14 15:38 |
Historical information about your jobs can be found by running:
$ bhist -d -u <your_username>
Example output:
$ bhist -d -u asrini
Summary of time in seconds spent in various states:
JOB ID | USER | JOB_NAME | PEND | PSUSP | RUN | USUSP | SSUSP | UNKWN | TOTAL |
---|---|---|---|---|---|---|---|---|---|
9990019 | asrini | bash | 1 | 0 | 36 | 0 | 0 | 0 | 37 |
9990020 | asrini | sleep 2 | 2 | 0 | 2 | 0 | 0 | 0 | 4 |
9990021 | asrini | * sleep.sh | 2 | 0 | 25 | 0 | 0 | 0 | 27 |
9990022 | asrini | bash | 0 | 0 | 395 | 0 | 0 | 0 | 395 |
Parallel Environment
To run a parallel job, add the -n flag to the bsub command described above.
For example, to run an interactive job with 16 CPUs:
$ bsub -n 16 -Is bash
Job <9990023> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
$ bjobs -u asrini
JOB ID | USER | STAT | QUEUE | FROM_HOST | EXEC_HOST | JOB_NAME | SUBMIT_TIME |
---|---|---|---|---|---|---|---|
9990023 | asrini | RUN | interactiv | consign.hpc | node063.hpc | bash | Jan 14 15:50 |
Similarly, to run a batch job with 16 CPUs:
$ bsub -n 16 <my_parallel_job>
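For batch jobs, the same options can also live inside the script as #BSUB directives, which LSF reads when the script is submitted on standard input. A sketch (the job name, output file, and mpirun command line are hypothetical):

```shell
#!/bin/bash
#BSUB -n 16                      # request 16 CPU slots
#BSUB -J my_parallel_job         # job name (hypothetical)
#BSUB -o my_parallel_job.%J.out  # output file; %J expands to the job ID
mpirun ./my_parallel_job
```

Submit it with `bsub < my_parallel_job.sh` (note the input redirection; without it the #BSUB directives are ignored).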
Environment Modules
User-loadable modules are available if the system default packages don't meet your requirements. To see what modules are available, run the "module avail" command from an interactive session:
[asrini@consign ~]$ bsub -Is bash
Job <9990024> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
[asrini@node063 ~]$ module avail
------------------------------ /usr/share/Modules/modulefiles -------------------------------
NAMD-2.9-Linux-x86_64-multicore dot module-info picard-1.96 rum-2.0.5_05
STAR-2.3.0e java-sdk-1.6.0 modules pkg-config-path samtools-0.1.19
STAR-hg19 java-sdk-1.7.0 mpich2-x86_64 python-2.7.5 use.own
STAR-mm9 ld-library-path null r-libs-user
bowtie2-2.1.0 manpath openmpi-1.5.4-x86_64 ruby-1.8.7-p374
devtoolset-2 module-cvs perl5lib ruby-1.9.3-p448
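Once inside an interactive session, any module from the listing above can be loaded for the remainder of that session. For example (the module name is taken verbatim from the listing; module commands work only on the cluster nodes, not on your local machine):

```shell
module load samtools-0.1.19    # put samtools on PATH for this session
module list                    # show currently loaded modules
module unload samtools-0.1.19  # remove it again when done
```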
Instructions for generating Public-Private keypairs
On Mac OS X and GNU/Linux systems, run the following command from within a terminal and follow the on-screen instructions:
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key ($HOME/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in $HOME/.ssh/id_rsa.
Your public key has been saved in $HOME/.ssh/id_rsa.pub.
The key fingerprint is:
xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx asrini@
The key's randomart image is:
+--[ RSA 2048]----+
| . |
| kjweo |
| x B E x |
| * B l +|
| S +aser .|
| + + |
| . weq |
| . x 12|
| 45+ |
+-----------------+
On Windows machines you can generate and use public keys with PuTTY. Video tutorials for generating and using public keys with PuTTY are available on YouTube.
After generating a public-private keypair, copy the contents of your local .ssh/id_rsa.pub file into a file named .ssh/authorized_keys in your home directory on the PMACS cluster.
[$USER@consign ~]$ vim .ssh/authorized_keys
One SSH public key per line; save and close the file.
Then change the permissions on the file:
[$USER@consign ~]$ chmod 600 .ssh/authorized_keys
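If your local machine has ssh-copy-id (most GNU/Linux systems and recent macOS releases do), it performs the copy and permission steps above in a single command; replace <penn_key> with your PennKey as before:

```shell
# Appends your local ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on the
# cluster and sets appropriate permissions; prompts once for your password.
ssh-copy-id <penn_key>@consign.pmacs.upenn.edu
```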