Running ASPECT on Stampede3

I am running ASPECT on Stampede3. The models are global mantle convection simulations with imposed kinematic boundary conditions, with ~393,216 active cells and ~16.4 million total degrees of freedom.

I have tested different parallel configurations on Stampede3 (112 cores per node) and observed the following scaling behaviour:

  • 5 nodes, 560 MPI ranks → memory exhaustion error
  • 5 nodes, 520 MPI ranks → memory exhaustion error
  • 5 nodes, 448 MPI ranks → stable and appears to be the optimal configuration so far.

I would like suggestions on whether this behaviour is expected for ASPECT at this problem size.

Thank you for posting your findings on the forum!

I have not tested your model configuration, but I can share my experience running large models on Stampede3. I was able to run instantaneous global mantle convection models with ~800 million DoFs using 32 nodes (2560 ranks) on the icx queue. The same model configuration gave me a memory exhaustion error when I used 64 nodes (3072 ranks) on the skx queue.

Looking at your node configuration, I am guessing you are using the spr nodes on Stampede3, which have less memory (128 GB) than the skx (192 GB) or icx (256 GB) nodes. You point out an interesting observation: the model does work when the memory available per rank is higher. I can’t tell you what exactly in the simulation would cause this behavior (for example, something that keeps a copy of a variable on every rank and so exhausts memory), but I would recommend trying a different queue or increasing the number of nodes in your submit script so that you have more memory available per rank.
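To put rough numbers on the memory-per-rank point for the three spr runs in the original post, here is a minimal sketch in Python. The node memory values are the ones quoted above; the function name is made up for illustration, and the estimate assumes that node memory is shared evenly across ranks with nothing reserved for the operating system, which real jobs will not quite match.

```python
# Nominal memory per node in GB, as quoted in this thread.
NODE_MEMORY_GB = {"spr": 128, "skx": 192, "icx": 256}


def memory_per_rank_gb(queue, nodes, ranks):
    """Rough memory per MPI rank in GB.

    Assumption: memory is split evenly across all ranks and
    no memory is reserved for the OS or other processes.
    """
    return NODE_MEMORY_GB[queue] * nodes / ranks


# The three 5-node spr configurations from the original post.
for ranks in (560, 520, 448):
    per_rank = memory_per_rank_gb("spr", 5, ranks)
    print(f"5 spr nodes, {ranks} ranks -> about {per_rank:.2f} GB per rank")
```

The same function can be pointed at the skx or icx entries to compare a planned job against these values before submitting it.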

I hope this helps. Please don’t hesitate to ask further questions.

Best,
Arushi

@DebanjanPal1995 - Following up on the suggestions from @arushi_saxena, can you tell us a bit more about your model or provide a PRM file?

Even when using nodes with less total memory, my sense is that your models should not hit the memory limits, as you still have fewer than 40,000 DoFs per core. Are you by chance tracking a large number of particles in your simulations?
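As a quick check of that figure, the sketch below divides the ~16.4 million DoFs from the original post by each of the three rank counts that were tried. It assumes one MPI rank per core and a perfectly even distribution of DoFs, which is only an approximation of what the partitioner actually does.

```python
# Approximate total degrees of freedom reported in the original post.
TOTAL_DOFS = 16.4e6


def dofs_per_rank(total_dofs, ranks):
    """Average DoFs per MPI rank, assuming a perfectly even partition."""
    return total_dofs / ranks


# Rank counts tried on 5 spr nodes (assumption: one rank per core).
for ranks in (560, 520, 448):
    print(f"{ranks} ranks -> about {dofs_per_rank(TOTAL_DOFS, ranks):,.0f} DoFs per rank")
```

All three configurations come out below 40,000 DoFs per rank.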

It will likely be helpful to use the memory statistics postprocessor to gain more insight into which part of the model is taking up large amounts of memory.

Thanks for your reply @arushi_saxena and @jbnaliboff.

@arushi_saxena I am using the spr nodes, and I will test with the icx or skx nodes.
@jbnaliboff I have shared the parameter file, and I am not tracking particles. The input file is very similar to https://academic.oup.com/gji/article/237/3/1251/7616936#446886573 (Dannberg et al. 2024).

input.prm (5.8 KB)

@DebanjanPal1995 - Thank you for sending over the PRM file. Would you mind sending over the log.txt file as well?