This page describes ABI's computing infrastructure at a level suitable for researchers. For detailed system administration documentation, see Infrastructure.
A High-Performance Computing (HPC) cluster is a collection of interconnected computers (called nodes) that work together to run computationally intensive tasks. Instead of running everything on your laptop, you submit jobs to the cluster, which distributes them across available resources.
Key concepts:
| Term | Meaning |
|---|---|
| Node | A single server/computer in the cluster |
| Login node | The server you SSH into. Used for file management and job submission – not for heavy computation |
| Compute node | Servers dedicated to running jobs. Jobs are dispatched here by Slurm |
| Partition | A group of nodes with shared properties (e.g., memory size, GPU availability). Also called a “queue” |
| Job | A task you submit to run on a compute node |
| Slurm | The job scheduler that manages the queue and assigns resources |
The cluster's main components:

| Component | Details |
|---|---|
| Login node(s) | ssh.abi.am (resolves to VMs ssh-01 and ssh-02) |
| Compute nodes | thin-01 (64C/384G), thin-02 (64C/384G), thick-01 (64C/768G) |
| Download nodes | dl-01 (2C/8G), dl-02 (2C/8G) |
| Total compute vCPUs | 192 |
| Total compute RAM | 1536G |
| Scheduler | Slurm (controller runs on a separate VM) |
| Virtualization | All nodes are bhyve VMs running on a FreeBSD physical host |
Partitions define groups of compute resources. When you submit a job, you can specify which partition to use.
| Partition | Nodes | CPUs | Total Memory | Default? | Purpose |
|---|---|---|---|---|---|
| compute | thin-01, thin-02, thick-01 | 64 per node | 384G-768G | Yes | General-purpose computation (default partition) |
| thin | thin-01, thin-02 | 64 per node | ~384G each | No | Jobs that fit in standard memory |
| thick | thick-01 | 64 | ~768G | No | Memory-intensive jobs (e.g., large genome assembly, pilon) |
| download | dl-01, dl-02 | 2 per node | ~8G each | No | Data download tasks only (not for computation) |
Notes:
- The `compute` partition is the default: if you do not specify `--partition`, your job goes there.
- Request `thick` explicitly when you need more than ~384G of RAM (e.g., `--partition=thick --mem=512G`).
- Use `download` only for downloading data (e.g., SRA downloads); these nodes have minimal CPU and memory.
- A node can belong to several partitions (e.g., thick-01 is in both `compute` and `thick`).

To see current partition and node status:
```shell
sinfo
```

For a detailed view including memory and CPU allocation:

```shell
sinfo -N -o "%.10N %.10P %.5a %.4c %.20m %.20F %.10e"
```
Current cluster state (for reference):
| NODELIST | PARTITION | CPUS | MEMORY | PURPOSE |
|---|---|---|---|---|
| dl-01 | download | 2 | ~8G | Data downloads only |
| dl-02 | download | 2 | ~8G | Data downloads only |
| thick-01 | compute/thick | 64 | ~768G | High-memory computation |
| thin-01 | compute/thin | 64 | ~384G | General computation |
| thin-02 | compute/thin | 64 | ~384G | General computation |
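As a sketch of how partition selection looks in practice, the following writes a hypothetical high-memory job script. The partition and memory flags follow the table above; the job name, CPU count, and the suggested pilon step are placeholders, not site conventions:

```shell
# Sketch of a high-memory submission; the job name, CPU count, and the
# commented-out work step are placeholders, not ABI policy.
cat > pilon_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=pilon
#SBATCH --partition=thick
#SBATCH --mem=512G
#SBATCH --cpus-per-task=16

# the real work (e.g., a pilon polishing run) would go here
echo "running on $(hostname)"
EOF

echo "submit with: sbatch pilon_job.sh"
```

Because `thick-01` is the only node in `thick`, such jobs may wait in the queue until it is free.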
ABI has several storage areas. Understanding them is important for organizing your work and avoiding issues.
Storage is served from two ZFS-based NAS servers over NFS. ZFS provides transparent compression, so you do not need to manually compress old files – the filesystem handles it automatically. Home directories and selected projects are backed up to a separate server using ZFS send/recv.
| Path | Purpose | Served from | Quota | Notes |
|---|---|---|---|---|
| `/mnt/home/<user>` | Home directory – configs, scripts | mustafar (nas1) | ~12G per user | Keep this small; use project/user dirs for data |
| `/mnt/nas0/user/<user>` | Personal user workspace | geonosis (nas0) | ~100G per user | For personal datasets, experiments, conda envs |
| `/mnt/nas0/proj/<project>` | Project data (some projects) | geonosis (nas0) | Per-project | *TODO: clarify which projects are on nas0 vs nas1* |
| `/mnt/nas1/proj/<project>` | Project data (most projects) | mustafar (nas1) | Per-project (typically 14-25 TB) | Shared with all project members |
| `/mnt/nas1/db/` | Shared databases and reference genomes | mustafar (nas1) | ~32 TB total | Read-only for users. See Databases |
Example current usage:
```
/mnt/home/<user>         ~12G quota   (personal configs, scripts)
/mnt/nas0/user/<user>    ~100G quota  (personal workspace)
/mnt/nas1/proj/armwgs    ~25 TB       (Armenian WGS project)
/mnt/nas1/proj/cfdna     ~14 TB       (cfDNA project)
/mnt/nas1/db/            ~32 TB       (reference genomes, indexes)
```
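To check your own usage against these quotas, plain `du` and `df` work over NFS. A sketch, assuming the mount paths from the table above (they exist only on ABI machines, so errors are silenced):

```shell
# Check usage against the quotas above; the /mnt paths come from the
# storage table and exist only on ABI machines, hence the 2>/dev/null.
du -sh "/mnt/nas0/user/$USER" 2>/dev/null   # workspace vs ~100G quota
df -h /mnt/nas1/proj 2>/dev/null            # free space on the project share
du -sh "$HOME"                              # home directory vs ~12G quota
```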
Use `/mnt/nas0/user/<user>` for personal data or `/mnt/nas1/proj/<project>` for project data.

The path a job takes through the system:

```
You (laptop) --SSH--> Login Node --sbatch--> Slurm Scheduler --> Compute Node(s)
```

Jobs are submitted with `sbatch`. Important rules:

- Do not run heavy computation on the login node; submit it as a job.
- For interactive work, use `srun` or `salloc` (see Interactive Sessions).

| Command | Purpose |
|---|---|
| `sbatch script.sh` | Submit a batch job |
| `squeue` | View all jobs in the queue (see recommended format below) |
| `squeue --me` | View only your jobs |
| `scancel <jobid>` | Cancel a job |
| `sinfo` | View partition and node status |
| `sacct -j <jobid>` | View job accounting info after completion |
| `srun --pty bash` | Start an interactive session |
The default squeue output is hard to read. We recommend this format:
```shell
squeue -o "%.6i %.10P %.10j %.15u %.10t %.10M %.10D %.20R %.3C %.10m"
```
Example output:
```
 JOBID  PARTITION        NAME     USER  ST     TIME  NODES  NODELIST(REASON)  CPU  MIN_MEMORY
  2313    compute    computel   anahit   R  1:53:18      1  thin-01            20         35G
  2293    compute   kneaddata    nelli   R 11:12:15      1  thin-01            20         30G
  2299    compute   glasso_j1   davith   R 11:12:15      1  thin-01             8         60G
  2282    compute  run_som.sh   melina   R 11:12:16      1  thin-01             8         50G
  2309    compute  plot_cover    mherk  PD     0:00      1  (Resources)         1          0
  2121      thick       pilon     nate  PD     0:00      1  (Nodes requi..      4        512G
```
You can add this as an alias in your ~/.bashrc for convenience:
```shell
alias sq='squeue -o "%.6i %.10P %.10j %.15u %.10t %.10M %.10D %.20R %.3C %.10m"'
```
For a full guide, see Using Slurm.
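Putting the commands together, a typical submit-and-monitor loop might look like this. The script name, job name, and resource values are placeholders; only the overall sequence follows the command table above:

```shell
# Typical batch workflow; script name, job name, and resources are
# placeholders, not site defaults.
cat > my_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
echo "hello from $(hostname)"
EOF

# On the cluster you would then run (shown commented out here):
# sbatch my_job.sh        # prints "Submitted batch job <jobid>"
# squeue --me             # watch the job in the queue
# sacct -j <jobid>        # accounting details after it finishes
echo "job script ready: my_job.sh"
```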
All commonly used bioinformatics tools are installed globally on the cluster. There is no module system – tools are available directly by name:
```shell
# Check if a tool is available
which bwa
bwa --version
which samtools
samtools --version
```
See Software for a list of available tools.
If you need software that is not installed globally, you can install it locally using Conda.
Important: When using Conda, do not let it add itself to your `~/.bashrc`. This slows down every login and can cause issues on login nodes. Instead, activate Conda manually when you need it. See the Conda Guide for details.
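Manual activation can look like the following sketch; the Miniconda prefix `~/miniconda3` and the environment name `myenv` are assumptions, so adjust them to your own install:

```shell
# Activate Conda by hand only when needed, instead of from ~/.bashrc.
# The prefix ~/miniconda3 and env name 'myenv' are placeholders.
CONDA_SH="$HOME/miniconda3/etc/profile.d/conda.sh"
if [ -f "$CONDA_SH" ]; then
    . "$CONDA_SH"
    conda activate myenv
else
    echo "no conda install found at $CONDA_SH (adjust the prefix)"
fi
```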
You can SSH to `ssh.abi.am` from anywhere on the internet.