How to better monitor the memory usage of ASPECT?

Hi,

I would like to get a better idea of how much memory ASPECT uses for its individual tasks, mainly because my memory consumption grows much faster than I was expecting. I went back to the shell_simple_3d.prm cookbook and modified it to use an initial global refinement of 4 and no adaptive mesh refinement. I am using the matrix statistics and memory statistics postprocessors to try to get some insight.
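
For reference, here is a sketch of the relevant changes relative to the cookbook file (only a sketch: parameter names follow the ASPECT manual, and my actual input file contains more settings than shown here):

    subsection Mesh refinement
      set Initial global refinement          = 4
      # Turn off adaptive mesh refinement:
      set Initial adaptive refinement        = 0
      set Time steps between mesh refinement = 0
    end

    subsection Postprocess
      set List of postprocessors = memory statistics, matrix statistics
    end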

However, I do not quite understand what the individual output statistics tell me. The matrix statistics postprocessor reports the Total system matrix memory consumption and the Total system preconditioner matrix memory consumption, and the memory statistics postprocessor reports the System matrix memory consumption. From here I run into the following problems:

  1. What is the link between the System matrix memory consumption and the Total system matrix memory consumption?

  2. What else adds to the total memory used? Combining all of the output statistics does not get me anywhere near the amount of memory that monitoring with htop shows me.

  3. What is the expected behavior of the memory usage when I increase or decrease the number of CPUs dedicated to the MPI run?

  4. I also tried changing the GMRES solver restart length (roughly as sketched below), because this is necessary (for the solver to converge) in the simulation I eventually want to run. Increasing the restart length should increase the memory consumption, right? I do not observe this in the total memory consumption, though.
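
This is how I set the restart length (again just a sketch; I am assuming the parameter sits under Solver parameters / Stokes solver parameters as in my ASPECT version, and the value 200 is purely illustrative):

    subsection Solver parameters
      subsection Stokes solver parameters
        # Larger values trade memory for (hopefully) better convergence:
        set GMRES solver restart length = 200
      end
    end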

I hope you can help me out with some of these problems. I am also more than happy to receive suggestions on how to better monitor the memory usage.

Thanks in advance!

Sincerely,

Rens

Hi Rens,

I admit that we could do a better job with the memory statistics so I will try to explain things below. The question of memory consumption is also quite difficult for a couple of reasons:

  • Virtual memory and resident memory are sometimes quite far apart, and resident memory is typically what you care about (because if it runs out, you crash or start swapping).
  • Memory consumption differs between MPI ranks and can vary quite a bit; rank 0 in particular is typically larger. If you run with 100 MPI ranks, collecting memory statistics can be quite complicated!
  • We don't have perfect knowledge of the memory needed, especially for some of the third-party libraries used by deal.II.

Regarding your first question: I am not sure this is documented, but the System matrix memory consumption reported by the memory statistics postprocessor is from MPI rank 0 only. This is not ideal, and we should think about addressing it by reporting the maximum over all ranks instead.

The Total system matrix memory consumption from the matrix statistics postprocessor, on the other hand, is the sum over all MPI ranks.

Regarding your second question: yes, that is not surprising. We don't account for vectors (we might be able to, but it is somewhat complicated) or for the preconditioner itself (there is no easy way for us to get that information). The rest we either report, or it is typically small (well, except in some situations).

Regarding your third question: I would expect the resident memory on each rank to decrease roughly linearly with the number of MPI ranks. Virtual memory is a bit more complicated.

Regarding your fourth question: yes, a larger restart length should increase the memory consumption. It is just not visible in the currently reported statistics (except in VmPeak).
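
To make that concrete, here is a rough back-of-envelope estimate (my own sketch, not an exact accounting of what the solver allocates): GMRES keeps one Krylov basis vector per iteration up to the restart length, so increasing the restart length adds on the order of

    \Delta(\text{restart length}) \times (\text{locally owned Stokes DoFs}) \times 8~\text{bytes}

per rank, possibly more if the implementation stores additional temporary vectors. Since the reported statistics do not account for vectors, this extra memory only becomes visible in VmPeak.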

My postdoc Conrad has looked into memory statistics in more detail, but I typically only look at VmPeak (reported by the memory statistics postprocessor). As an example (see arXiv:1907.06696, "Comparison Between Algebraic and Matrix-free Geometric Multigrid for a Stokes Problem on Adaptive Meshes with Variable Viscosity", for details), let me point you to the memory breakdown table in that paper (look at the AMG column):

There you can see that, with a restart length of 50, the vectors account for about 35% of the memory, the matrices for around 50%, and the preconditioner for around 11%; everything else is unimportant. All of the items listed scale roughly with 1/(number of MPI ranks). You can also tell that GMG is quite attractive if you are able to use it. Finally, keep in mind that this is a simplified setting, as it is a stationary problem without temperature, compositional fields, etc.

I hope that helps. Let me know if you have further questions.

Best,
Timo