Cluster Specifications

BlueMoon

A CPU cluster of 72 nodes totaling 9,512 CPU cores, the majority connected with InfiniBand. The cluster supports large-scale computation, with low-latency networking for MPI workloads, large-memory systems, and high-performance parallel filesystems.
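
To illustrate the kind of MPI workload the low-latency InfiniBand fabric is intended for, here is a minimal sketch using mpi4py (an assumed, commonly available Python MPI binding; launch it with srun inside a Slurm job allocation):

    # Minimal MPI example: a sketch that assumes the mpi4py package is installed.
    # Launch with e.g. `srun python hello_mpi.py` inside a Slurm allocation.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD              # communicator spanning all launched ranks
    rank = comm.Get_rank()             # this process's rank (0 .. size-1)
    size = comm.Get_size()             # total number of ranks in the job
    node = MPI.Get_processor_name()    # hostname of the node running this rank

    # Each rank reports where it runs; messages between nodes travel over the
    # low-latency InfiniBand fabric described above.
    print(f"rank {rank} of {size} on {node}")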

Hardware

  • The node400 class comprises 21 nodes, each equipped with two AMD EPYC 9654 CPUs (192 cores per node), 1.5 TB of RAM, 25 Gb Ethernet, and InfiniBand connectivity.  

  • The node300s class comprises 39 nodes, each equipped with two AMD EPYC 7763 CPUs (128 cores per node), 1 TB of RAM, 25 Gb Ethernet, and InfiniBand connectivity.  

  • The node200 class comprises 9 nodes, each with two Intel Xeon Gold 6230 CPUs (40 cores per node), 96 GB of RAM, 10 Gb Ethernet, and no InfiniBand.  

  • The Large memory node is a single node equipped with two AMD EPYC 7543 CPUs (64 cores total), 4 TB of RAM, and 25 Gb Ethernet.  

  • The high clock class consists of 2 nodes with two AMD EPYC 7F52 CPUs (32 cores per node), 256 GB of RAM, 25 Gb Ethernet, and InfiniBand connectivity.  

  • The Data Mountain class comprises 8 nodes, each with two Intel Xeon 6348 CPUs (56 cores per node), 8 TB of RAM, and 25 Gb Ethernet.  
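
The node classes above account for the headline figures quoted at the top of this section; a quick cross-check in Python (the class names are just labels taken from the list above) shows that the 8 Data Mountain nodes are not included in the 72-node, 9,512-core total:

    # Cross-check of the BlueMoon summary figures: (nodes, cores per node).
    classes = {
        "node400":      (21, 192),
        "node300s":     (39, 128),
        "node200":      ( 9,  40),
        "large memory": ( 1,  64),
        "high clock":   ( 2,  32),
    }
    nodes = sum(n for n, _ in classes.values())
    cores = sum(n * c for n, c in classes.values())
    print(nodes, cores)   # 72 nodes and 9512 cores, matching the summary;
                          # the 8 Data Mountain nodes are not in this total.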

Software

  • Operating System: RHEL 9.4, with GNU compilers (gcc, gfortran)
  • Resource Manager: Slurm version 24.05.4
  • Module System: Lmod
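
For orientation, here is a hedged sketch of submitting a batch job to this cluster from Python; the partition name, module name, and resource requests are illustrative assumptions, not confirmed BlueMoon settings (check sinfo and module avail on a login node):

    # Sketch: submit a Slurm batch job by piping a script to sbatch (Slurm 24.05.x).
    # The partition "bluemoon" and the gcc module are placeholders for illustration.
    import subprocess, textwrap

    script = textwrap.dedent("""\
        #!/bin/bash
        #SBATCH --job-name=example
        #SBATCH --partition=bluemoon
        #SBATCH --nodes=2
        #SBATCH --ntasks-per-node=64
        #SBATCH --time=01:00:00
        module load gcc            # Lmod supplies the compiler toolchain
        srun ./my_mpi_program      # srun launches the MPI ranks
        """)

    result = subprocess.run(["sbatch"], input=script, text=True,
                            capture_output=True, check=True)
    print(result.stdout.strip())   # e.g. "Submitted batch job <jobid>"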

IceCore

A GPU cluster comprising 80 NVIDIA Tesla V100 GPUs across 10 nodes, plus two nodes with four NVIDIA H100 GPUs each (eight in total) and two nodes with two NVIDIA A100 GPUs each (four in total). In Fall 2025 the cluster will be upgraded with 64 NVIDIA H200 GPUs across 16 nodes, each with four NVIDIA H200 141 GB GPUs, Intel Xeon 6740E CPUs (192 cores per node), 1 TB of RAM, 100 Gb Ethernet, and InfiniBand connectivity.

Hardware

  • The H100 class comprises 2 nodes, each with four NVIDIA H100 80 GB GPUs, two Intel Xeon Platinum 8462Y+ CPUs (64 cores per node), 1 TB of RAM, 100 Gb Ethernet, and InfiniBand connectivity.  

  • The A100 class comprises 2 nodes, each with two NVIDIA A100 40 GB GPUs, two AMD EPYC 7763 CPUs (128 cores per node), 1 TB of RAM, 25 Gb Ethernet, and InfiniBand connectivity.  

  • The IceCore class comprises 10 nodes, each with eight NVIDIA V100 32 GB GPUs, two Intel Xeon Gold 6130 CPUs (32 cores per node), 768 GB of RAM, 10 Gb Ethernet, and no InfiniBand connectivity.  

Software

  • Operating System: RHEL 9.4, with GNU compilers (gcc, gfortran)
  • Resource Manager: Slurm version 24.05.4
  • Module System: Lmod
  • CUDA 11.4
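
As a quick orientation for GPU jobs, here is a sketch that enumerates the devices visible inside an allocation; it assumes a CUDA-enabled PyTorch build is available (for example through an Lmod module or a user environment, neither of which is a confirmed IceCore package):

    # Sketch: list the GPUs visible to this job (assumes PyTorch with CUDA support).
    import torch

    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            # The V100 nodes expose up to 8 GPUs per node and the H100 nodes up
            # to 4, depending on what the job requested (e.g. --gres=gpu:2).
            print(i, torch.cuda.get_device_name(i))
    else:
        print("No CUDA devices visible; request GPUs in the job, e.g. --gres=gpu:1")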

DataMountain

A large-memory, sharded MongoDB deployment that supports near real-time access to very large data files. It serves projects that need this speed to analyze, describe, and explain rapidly growing datasets effectively.  
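
As an illustration of how such a deployment is typically accessed, here is a hedged sketch using pymongo; the router hostname, database, and collection names are placeholders, not actual DataMountain endpoints:

    # Sketch: query a sharded MongoDB through its mongos router (pymongo assumed).
    # "mongos.example.edu", "projectdb", and "measurements" are hypothetical names.
    from pymongo import MongoClient

    client = MongoClient("mongodb://mongos.example.edu:27017")
    coll = client["projectdb"]["measurements"]

    # The router sends each query to the shard(s) holding the matching chunks,
    # which is what keeps reads on very large collections near real time.
    for doc in coll.find({"station": "A01"}).limit(5):
        print(doc)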

Secure Cluster

  • Three GPU nodes, each equipped with an NVIDIA L40S GPU, an AMD EPYC 9354P CPU, and 192 GB of RAM.  

  • One large-memory node with an AMD EPYC 9554P CPU and 1.5 TB of RAM.  

  • Two Proxmox nodes, each equipped with an AMD EPYC 9354P CPU and 384 GB of RAM.  

Available end of 2025.