Computing nodes
Computing nodes can only be accessed through the job manager (SLURM). SSH to a node is only possible while you have a job on that node.
To see the state of all nodes, use the check-cluster command.
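To find out which nodes you can currently ssh to, you can list the nodes of your own running jobs. The following is a minimal Python sketch, assuming the standard SLURM `squeue` command is available on the login node; the helper names (`squeue_command`, `parse_jobs`, `my_running_nodes`) are illustrative and not part of the cluster tooling.

```python
import getpass
import subprocess


def squeue_command(user):
    """Build a squeue call printing 'jobid nodelist' for running jobs."""
    return ["squeue", "--noheader", "--user", user,
            "--states", "RUNNING", "--format", "%i %N"]


def parse_jobs(output):
    """Map job id -> node list from 'jobid nodelist' lines."""
    jobs = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            jobs[parts[0]] = parts[1].strip()
    return jobs


def my_running_nodes():
    """Run squeue for the current user (only works where SLURM is installed)."""
    result = subprocess.run(squeue_command(getpass.getuser()),
                            capture_output=True, text=True, check=True)
    return parse_jobs(result.stdout)
```

Calling `my_running_nodes()` on a login node returns a dictionary of job id to node list. Node lists are returned as SLURM prints them, so a multi-node job may appear in compressed form (for example `spirit64-[01-02]`).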
Spirit cluster (2023/04/01)
Core counts are given without HyperThreading (these are physical cores).
- 2 node types:
  - zen4: 64 cores, 256GB, 4GB/core (2 AMD EPYC 7452 32-Core)
    - 240GB max allocatable for jobs, and 4000M (not 4096M) per core
    - there are also 4 nodes with 96 cores and 256GB (2 AMD EPYC 7552 48-Core)
  - zen16: 32 cores, 512GB, 16GB/core (2 AMD EPYC 7302 16-Core)
    - 496GB max allocatable for jobs, and 16000M (not 16384M) per core
- 20 computing nodes:
  - 12 zen4
  - 8 zen16
Spiritx cluster (2023/01/20)
- 2 node types:
  - zen4: 64 cores, 256GB, 4GB/core (2 AMD EPYC 7452 32-Core)
    - 240GB max allocatable for jobs, and 4000M (not 4096M) per core
    - there are also 4 nodes with 96 cores and 256GB (2 AMD EPYC 7552 48-Core)
  - zen16: 32 cores, 512GB, 16GB/core (2 AMD EPYC 7302 16-Core)
    - 496GB max allocatable for jobs, and 16000M (not 16384M) per core
- 14 computing nodes:
  - 10 zen4
  - 4 zen16
check-cluster sample
user@spirit1:~$ check-cluster
NODE         STATE  FREE-CPU  FREE-MEM    LOAD   PARTITION  GRES    REASON
spirit64-01  mix    42/64     162/240 Gb  23.23  zen4*      (null)
spirit64-02  mix    49/64     189/240 Gb  6.64   zen4*      (null)
spirit64-03  idle   64/64     240/240 Gb  0.90   zen4*      (null)
spirit64-04  idle   64/64     240/240 Gb  0.01   zen4*      (null)
spirit64-05  idle   64/64     240/240 Gb  0.01   zen4*      (null)
spirit64-06  mix    7/64      23/240 Gb   24.97  zen4*      (null)
spirit64-07  idle   64/64     240/240 Gb  0.14   zen4*      (null)
spirit64-08  idle   64/64     240/240 Gb  0.00   zen4*      (null)
spirit64-09  idle   96/96     240/240 Gb  0.00   zen4*      (null)
spirit64-10  idle   96/96     240/240 Gb  0.00   zen4*      (null)
spirit64-11  idle   96/96     240/240 Gb  0.00   zen4*      (null)
spirit64-12  mix    32/96     140/240 Gb  64.14  zen4*      (null)
spirit32-01  mix    14/32     273/496 Gb  4.90   zen16      (null)
spirit32-02  alloc  0/32      256/496 Gb  4.35   zen16      (null)
spirit32-03  idle   48/48     496/496 Gb  0.00   zen16      (null)
spirit32-04  idle   48/48     496/496 Gb  0.01   zen16      (null)
active jobs : 38 , pending jobs: 0
active cpu : 208 of 1056 (20%)
active nodes: 16 of 16 (100%)
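The "active cpu" summary line can be recomputed from the FREE-CPU column of the sample above. The Python sketch below does this; it assumes the whitespace-separated layout shown in the sample (FREE-CPU as `free/total` in the third field), and the function name `cpu_usage` is illustrative, not part of check-cluster.

```python
# Node lines copied from the check-cluster sample above.
SAMPLE = """\
spirit64-01 mix 42/64 162/240 Gb 23.23 zen4* (null)
spirit64-02 mix 49/64 189/240 Gb 6.64 zen4* (null)
spirit64-03 idle 64/64 240/240 Gb 0.90 zen4* (null)
spirit64-04 idle 64/64 240/240 Gb 0.01 zen4* (null)
spirit64-05 idle 64/64 240/240 Gb 0.01 zen4* (null)
spirit64-06 mix 7/64 23/240 Gb 24.97 zen4* (null)
spirit64-07 idle 64/64 240/240 Gb 0.14 zen4* (null)
spirit64-08 idle 64/64 240/240 Gb 0.00 zen4* (null)
spirit64-09 idle 96/96 240/240 Gb 0.00 zen4* (null)
spirit64-10 idle 96/96 240/240 Gb 0.00 zen4* (null)
spirit64-11 idle 96/96 240/240 Gb 0.00 zen4* (null)
spirit64-12 mix 32/96 140/240 Gb 64.14 zen4* (null)
spirit32-01 mix 14/32 273/496 Gb 4.90 zen16 (null)
spirit32-02 alloc 0/32 256/496 Gb 4.35 zen16 (null)
spirit32-03 idle 48/48 496/496 Gb 0.00 zen16 (null)
spirit32-04 idle 48/48 496/496 Gb 0.01 zen16 (null)
"""


def cpu_usage(text):
    """Return (used, total) CPU counts summed over all node lines."""
    free = total = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip header, prompt and summary lines: no "free/total" in field 3.
        if len(fields) < 3 or "/" not in fields[2]:
            continue
        node_free, node_total = fields[2].split("/")
        if not (node_free.isdigit() and node_total.isdigit()):
            continue
        free += int(node_free)
        total += int(node_total)
    return total - free, total


used, total = cpu_usage(SAMPLE)
print(f"active cpu : {used} of {total} ({round(100 * used / total)}%)")
# prints: active cpu : 208 of 1056 (20%)
```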
Limits per user are:
- MaxTRESPerUser=cpu=96,mem=262544 MAXJOBS=64 MaxSubmitJobs=1000
Limits for external users are:
- MaxTRESPerUser=cpu=32,mem=131272 MAXJOBS=32 MaxSubmitJobs=500
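Before submitting, a planned total of CPUs and memory can be compared against these limits. The sketch below is illustrative only: the `LIMITS` table, the `"standard"` label for non-external users, and the `fits_limits` helper are hypothetical names, the values are copied from the lines above, and `mem` is assumed to be expressed in megabytes. The totals are meant as the sum over all of a user's running jobs, not a single job.

```python
# Values copied from the MaxTRESPerUser lines; "standard" is an assumed label.
LIMITS = {
    "standard": {"cpu": 96, "mem": 262544},
    "external": {"cpu": 32, "mem": 131272},
}


def fits_limits(cpus, mem_mb, user_type="standard"):
    """Return True if the totals stay within the per-user limits."""
    limit = LIMITS[user_type]
    return cpus <= limit["cpu"] and mem_mb <= limit["mem"]


# 64 cores at 4000M per core (the zen4 per-core value).
print(fits_limits(64, 64 * 4000))              # True
print(fits_limits(64, 64 * 4000, "external"))  # False: more than 32 CPUs
```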