Cluster Hardware
Queue | Nodes | CPUs | RAM | Accelerators (if any) | Notes |
---|---|---|---|---|---|
astro2_long astro2_short astro2_devel |
70 | 2x 24-core Xeon 6248R @ 3.0GHz | 192 GB - 4 GB / core. DDR4-2933 MHz |
- L2: 1MB/core, L3: 35.75MB/socket, newest supported extension: AVX-512. |
|
astro_verylong astro_long astro_short astro_devel |
139 | 2x 10-core Xeon E5-2680v2 @ 2.8GHz | 64 GB - 3.2 GB / core. DDR3-1866 MHz |
100 nodes: none 39 nodes: 2x Xeon Phi 5110P |
- L2: 256KB/core, L3: 25MB total, newest supported extension: AVX. - Xeon Phi cards have 60 cores at ~1 GHz, 8GB RAM, 30MB L2 cache. |
astro_gpu |
30 | 2x 4-core Xeon E5640 @ 2.67GHz | 72 GB - 9 GB / core. DDR3-1333 MHz |
20 nodes: 4x Tesla C2050 |
- L2: 256KB/core, L3: 12MB total, newest supported extension: SSE4.2. - Tesla C2050: 3072MB RAM, 448 thread procs., CUDA compute capability v2.0 (Fermi). |
astro2_gpu |
1 | 2x 16-core AMD 7302 @ 3.0 GHz | 512 GB - 16 GB / core. DDR4-3200 MHz |
20 nodes: 4x Tesla Ampere A100 |
- L2: 512KB / core, L3: 128MB / socket, newest supported extension: AVX2. - Tesla A100: 40 GB RAM, 3456 DP cores, 108 SMs, CUDA compute capability v8.0. |
astro_fe | 4 | 4x 8-core Xeon E5-4650 @ 2.70GHz | 768 GB - 24 GB / core | Quadro K2000 (but they are not set up for GPU computing) |
- L2: 256KB/core, L3: 20MB total, newest supported extension: AVX. - provides queue access to frontend/analysis nodes (astro06 - astro09). |
Notes
- See here for information on the different queues.
- Individual queues are interconnected with inifiniband network.
- The 48 core astro2 nodes have EDR (100 Gbit/s) with a 2:1 blocking factor and 24 nodes per switch (3 uplink switches, 1 core switch).
- The 20 core astro nodes have FDR (56 Gbit/s) with a 2:1 blocking factor and 24 nodes per switch (6 uplink switches, 2 core switches).
- The astro_gpu nodes have QDR (Quad Data Rate) infiniband.
- The astro_gpu nodes are based on a Supermicro SuperServer SYS-7046GT-TRF-TC4 node with dual quad-core Westmere CPUs.
- The astro_verylong|long|short|devel nodes are based on a Dell shoe-box design with a dual 10-core ivy-bridge CPU. The plain nodes are Dell C6220II, while the nodes with Xeon-Phi cards are Dell C8220x.
- The astro2_long|short|devel nodes are Huawei Fusion server pro X6000. A shoe-box design with a dual 24-core cascade-lake CPU.
Analysis Hardware
Name | CPUs | RAM | GPUs | Local Scratch Space (if any) | Notes |
---|---|---|---|---|---|
astro01.hpc.ku.dk | 48 cores: 2x AMD Epyc Rome 7F72 @ 3.2GHz | 1 TB DDR4-3200 MHz - 21 GB / core | 4x Ampere A100 with 40GB memory per GPU |
- L2: 512KB/core, L3: 192MB / socket, newest supported extension: AVX2. |
|
astro06.hpc.ku.dk | 32 cores: 4x Xeon E5-4650 @ 2.70GHz | 768 GB - 24 GB / core |
Quadro K2000 | /scratch: 2.7TB RAID0 SSD | - GPU for remote HW rendering through VirtualGL. |
astro07.hpc.ku.dk astro08.hpc.ku.dk astro09.hpc.ku.dk |
32 cores: 4x Xeon E5-4650 @ 2.70GHz | 768 GB - 24 GB / core | Quadro K2000 | - GPU for remote HW rendering through VirtualGL. |
Notes
Storage
- The home directory (/groups/astro) is a fully backed up lustre filesystem. We have a shared 6TB quota and individual quotas of 50 GB per user.
- The scratch directory (/lustre/astro) is a ZFS based high performance lustre filesystem with dedicated hardware for our group. No quotas are enforced and the total space (disregarding the transparent compression) is 670 TB.
- Archive (/users/astro/archive) is a symbolic link to the two ZFS filesystems exported as NFS volumes from a storage server connected to the clusters with a 10 Gbit/s network connection. Each archive system can be found under /users/astro/archive0 and /users/astro/archive1. These filesystems are old and new users will not get directories on them.
- archive0 has 215TB spread over 90 3TB disks. It uses RAIDZ2 for redundancy in stripes of 16+2 disks.
- archive1 has 172TB spread over 37 6TB disks. It uses RAIDZ2 for redundancy in stripes of 16+2 disks with 1 hot spare.
- External connection: The local HPC center is a Tier-1 CERN node and has a dual 100 gbit/s connection. In practice we easily reach 100 MB/s for transfer of larger files, with higher speeds possible by doing parallel transfers
- Internal connections: all frontends and cluster nodes are inter-connected with infiniband networking running at QDR, FDR or EDR speeds.