====== Infrastructure Overview ======

This page provides a high-level overview of ABI's computing and network infrastructure. For user-facing documentation, see [[getting_started:cluster_basics|Cluster Basics]].

===== Architecture =====

ABI's infrastructure runs on **FreeBSD physical hosts**, with compute workloads running in **bhyve VMs**. Services are isolated using **FreeBSD jails**.

<code>
                                ┌──────────────┐
                                │   Internet   │
                                └──────┬───────┘
                                       │
                                ┌──────┴───────┐
                                │   Firewall   │
                                │   / Router   │
                                └──────┬───────┘
                                       │
             ┌─────────────────────────┼───────────────────────────┐
             │                         │                           │
┌────────────┴───────────┐ ┌───────────┴────────────┐ ┌────────────┴───────────┐
│    GEONOSIS (nas0)     │ │          BANE          │ │    MUSTAFAR (nas1)     │
│        FreeBSD         │ │        FreeBSD         │ │        FreeBSD         │
│                        │ │                        │ │                        │
│ ZFS: genomic data      │ │ Jails:                 │ │ ZFS: home, proj, db    │
│  /mnt/nas0/proj        │ │  - nginx               │ │  /mnt/home             │
│  /mnt/nas0/user        │ │  - LDAP                │ │  /mnt/nas1/proj        │
│                        │ │  - Forgejo (git)       │ │  /mnt/nas1/db          │
│ bhyve VMs:             │ │  - PostgreSQL          │ └────────────────────────┘
│  - ssh-01   (2C/4G)    │ │  - MySQL               │
│  - ssh-02   (4C/8G)    │ │  - DNS                 │ ┌────────────────────────┐
│  - thin-01  (64C/384G) │ │  - DHCP                │ │      HOTH (bak1)       │
│  - thin-02  (64C/384G) │ │                        │ │        FreeBSD         │
│  - thick-01 (64C/768G) │ │ Devuan VM on bhyve:    │ │                        │
│  - dl-01    (2C/8G)    │ │  - slurmctld           │ │ ZFS backups via        │
│  - dl-02    (2C/8G)    │ └────────────────────────┘ │  zelta (send/recv)     │
│  - rshiny0  (4C/16G)   │                            │ homes + some projects  │
│  - rshiny1  (4C/16G)   │                            └────────────────────────┘
│  - (+ legacy VMs)      │
└────────────────────────┘
</code>

----

===== Physical Hosts =====

ABI runs on **4 physical FreeBSD servers**:

^ Hostname ^ Alias ^ Role ^ OS ^ Key Function ^
| **geonosis** | nas0, genomic | HPC hypervisor + NAS | FreeBSD | 2x AMD EPYC 7702 (128C/256T). Runs all bhyve VMs. Hosts genomic data ZFS pool. |
| **mustafar** | nas1 | Primary NAS | FreeBSD | ZFS storage for /mnt/home, /mnt/nas1/proj, /mnt/nas1/db. Serves NFS to all VMs. |
| **bane** | -- | IT services server | FreeBSD | Runs all infrastructure services in jails (LDAP, nginx, git, DNS, DHCP, databases). Hosts slurmctld in a Devuan VM. |
| **hoth** | bak1 | Backup server | FreeBSD | ZFS backup target using [[https://zelta.space|zelta]] (ZFS send/recv). Backs up homes and selected projects. |

----

===== Virtual Machines =====

All HPC VMs run on **geonosis** using **bhyve**. The Slurm controller runs in a Devuan VM on **bane**.

==== User-facing VMs (on geonosis) ====

^ VM ^ vCPUs ^ RAM ^ OS ^ Slurm Partition ^ Purpose ^
| ssh-01 | 2 | 4G | *TODO* | N/A | Login node (''ssh.abi.am'') |
| ssh-02 | 4 | 8G | *TODO* | N/A | Login node (''ssh.abi.am'') |
| thin-01 | 64 | 384G | *TODO: Devuan/Ubuntu?* | compute (default), thin | General computation |
| thin-02 | 64 | 384G | *TODO* | compute (default), thin | General computation |
| thick-01 | 64 | 768G | *TODO* | compute (default), thick | High-memory computation |
| dl-01 | 2 | 8G | *TODO* | download | Data downloads only |
| dl-02 | 2 | 8G | *TODO* | download | Data downloads only |
| rshiny0 | 4 | 16G | *TODO* | N/A | R Shiny server |
| rshiny1 | 4 | 16G | *TODO* | N/A | R Shiny server |

==== Infrastructure VMs ====

^ VM ^ Host ^ vCPUs ^ RAM ^ OS ^ Purpose ^
| slurmctld VM | bane | *TODO* | *TODO* | Devuan | Slurm controller daemon (slurmctld) |

==== Legacy / testing VMs (on geonosis) ====

These VMs are for internal IT use. *TODO: Document when needed.*

^ VM ^ vCPUs ^ RAM ^ Notes ^
| comp0 | 32 | 32G | Stopped -- legacy compute node |
| dna0 | 8 | 32G | Stopped -- *TODO* |
| mitte-dev-01 | 2 | 4G | Stopped -- development/testing |

----

===== Services (on bane) =====

All infrastructure services run in **FreeBSD jails** on bane (managed via ''jailer''), providing isolation and easy management.
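As a minimal sketch of day-to-day inspection, the jails can be examined with the standard FreeBSD base tools. The jail name below is taken from the tables on this page; the exact ''jailer'' workflow is not documented here and may replace the ''service jail'' commands shown:

```shell
# List all running jails (JID, IP address, hostname, root path).
jls

# Show selected parameters for a single jail by name.
jls -j www-01 jid host.hostname path

# Open a shell inside a jail for troubleshooting (run as root on bane).
jexec www-01 /bin/sh

# Start/stop a jail via the base rc system; jails managed by jailer
# may use jailer's own commands instead (assumption -- TODO: confirm).
service jail start www-01
service jail stop www-01
```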
=== Public-facing jails ===

^ Jail ^ Hostname ^ IPv4 ^ Service ^
| www-01 | www-01.abi.am | 37.26.174.181 | nginx (web server, reverse proxy) |
| git-01 | git-01.abi.am | 37.26.174.182 | Forgejo (Git hosting) |
| mx-01 | mx-01.abi.am | 37.26.174.190 | Mail server |

=== Internal jails ===

^ Jail ^ Hostname ^ Network ^ Service ^
| ldap-01 | ldap-01.local.abi.am | 172.20.42.0/24 | LDAP (slapd). See [[infra:ldap|LDAP]] |
| dns-01 | dns-01.local.abi.am | 172.20.42.0/24, 172.20.200.0/24 | DNS |
| dhcp-01 | dhcp-01.local.abi.am | 172.20.42.0/24, 172.20.200.0/24 | DHCP |
| psql-01 | psql-01.local.abi.am | 172.20.200.0/24 | PostgreSQL |
| mysql-01 | mysql-01.local.abi.am | 172.20.200.0/24 | MySQL |
| adg-01 | adg-01.local.abi.am | 172.20.200.0/24 | AdGuard (DNS filtering) |
| nms-01 | nms-01.local.abi.am | 172.20.200.0/24, 172.20.42.0/24 | Network monitoring |
| unifi-01 | unifi-01.local.abi.am | 172.20.200.0/24 | UniFi controller (WiFi) |
| wiki-01 | wiki-01.local.abi.am | 172.20.200.0/24 | DokuWiki |

=== Infrastructure VM ===

^ VM ^ Host ^ OS ^ Service ^
| slurmctld | bane | Devuan | Slurm controller (port 6817) |

For the full jail list including stopped jails, see [[infra:servers#bane|Server Inventory: bane]].

----

===== Network =====

See [[infra:network|Network]] for detailed network documentation. Summary:

  * **Public subnet:** ''37.26.174.176/28'' -- internet-facing services (www-01, git-01, mx-01, ssh.abi.am)
  * **Internal subnet 1:** ''172.20.42.0/24'' -- core infrastructure services (LDAP, DNS, DHCP)
  * **Internal subnet 2:** ''172.20.200.0/24'' -- internal services (databases, monitoring, wiki, UniFi, AdGuard)
  * **DNS:** internal DNS served by the ''dns-01'' jail on bane
  * **DHCP:** served by the ''dhcp-01'' jail on bane
  * **WiFi management:** UniFi controller in the ''unifi-01'' jail
  * *TODO: VLAN segmentation details, switch configuration, inter-subnet routing.*

----

===== Storage Architecture =====

ABI uses **two ZFS-based NAS servers** with storage served over NFS.
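On the compute VMs these exports would typically be consumed as NFS mounts. A hypothetical ''/etc/fstab'' fragment, using the server names and dataset paths documented on this page (the mount options are illustrative assumptions, not the actual configuration):

```shell
# /etc/fstab fragment on a Linux compute VM (illustrative; options are assumptions)
nas1:/znas1/abi/home            /mnt/home       nfs  rw,hard,nosuid  0  0
nas1:/znas1/abi/proj            /mnt/nas1/proj  nfs  rw,hard,nosuid  0  0
nas1:/znas1/abi/collections/db  /mnt/nas1/db    nfs  ro,hard,nosuid  0  0
```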
ZFS provides transparent compression, snapshots, and data integrity.

==== mustafar (nas1) -- Primary user storage ====

^ Mount Point ^ ZFS Dataset ^ Capacity ^ Purpose ^ Quota ^
| ''/mnt/home'' | ''nas1:/znas1/abi/home'' | ~32 TB pool | User home directories | ~12G per user |
| ''/mnt/nas1/proj'' | ''nas1:/znas1/abi/proj'' | ~32 TB pool | Shared project data | Per-project |
| ''/mnt/nas1/db'' | ''nas1:/znas1/abi/collections/db'' | ~32 TB pool | Reference genomes & databases | Read-only for users |

==== geonosis (nas0) -- Genomic data + user workspaces ====

^ Mount Point ^ ZFS Dataset ^ Capacity ^ Purpose ^ Quota ^
| ''/mnt/nas0/user'' | ''nas0:/znas0/abi/user'' *TODO: confirm dataset path* | *TODO* | Personal user workspaces | ~100G per user |
| ''/mnt/nas0/proj'' | *TODO: dataset path* | *TODO* | Additional project storage | Per-project |

==== hoth (bak1) -- Backups ====

^ What ^ Backed up? ^ Method ^
| User home directories (''/mnt/home'') | Yes | ZFS send/recv via [[https://zelta.space|zelta]] |
| Some project directories | Selected projects only | ZFS send/recv via zelta |
| Reference databases (''/mnt/nas1/db'') | *TODO* | *TODO* |

----

===== Slurm Architecture =====

^ Component ^ Where ^ Notes ^
| **slurmctld** (controller) | Devuan VM on **bane** | Manages the job queue and scheduling |
| **slurmd** (compute daemon) | thin-01, thin-02, thick-01, dl-01, dl-02 | Runs on each compute/download VM |
| **Configuration** | *TODO: path to slurm.conf* | *TODO* |
| **Slurm version** | *TODO* | |

----

===== Operating Systems =====

^ Role ^ Machine ^ OS ^
| Physical hosts | geonosis, mustafar, bane, hoth | FreeBSD *TODO: version* |
| Login VMs | ssh-01, ssh-02 | *TODO: Devuan? Ubuntu?* |
| Compute VMs | thin-01, thin-02, thick-01 | *TODO: Devuan? Ubuntu?* |
| Download VMs | dl-01, dl-02 | *TODO* |
| R Shiny VMs | rshiny0, rshiny1 | *TODO* |
| Slurm controller VM | slurmctld (on bane) | Devuan |

----

===== Key Design Decisions =====

  * **FreeBSD everywhere** for physical hosts -- provides ZFS, bhyve, and jails natively.
  * **bhyve** for virtualization -- all HPC workloads run in VMs on geonosis, providing isolation and resource control.
  * **FreeBSD jails** for services on bane -- lightweight isolation for LDAP, web, git, databases, DNS, DHCP.
  * **ZFS** for all storage -- transparent compression, snapshots, send/recv for backups.
  * **zelta** for backups -- ZFS replication from mustafar/geonosis to hoth.
  * **No environment modules** -- all bioinformatics software installed globally on compute VMs.

----

===== Related Pages =====

  * [[infra:servers|Server Inventory]] -- Detailed specs per machine
  * [[infra:changelog|Changelog]] -- Infrastructure change log
  * [[infra:network|Network]] -- Network topology and configuration
  * [[infra:ldap|LDAP Configuration]] -- User directory setup
  * [[infra:monitoring|Monitoring]] -- System health monitoring
  * [[infra:automation|Automation]] -- Configuration management
  * [[infra:tips_and_tricks|Admin Tips & Tricks]] -- Quick admin commands