Hello all,

I recently installed ASPECT on a new institutional cluster:

-- This is ASPECT, the Advanced Solver for Problems in Earth's ConvecTion.

-- . version 2.5.0-pre (main, 4c20fe0)

-- . using deal.II 9.4.2

-- . with 32 bit indices and vectorization level 1 (128 bits)

-- . using Trilinos 13.2.0

-- . using p4est 2.3.2

-- . running in OPTIMIZED mode

-- . running with 128 MPI processes

The code (including the cookbooks) runs well on a single node with 16 to 56 cores, but as soon as I use more than one node, ASPECT hangs at seemingly random time steps, with or without adaptive mesh refinement. Both the OPTIMIZED and DEBUG builds exhibit the same behavior. For example, it hangs at:

Number of mesh deformation degrees of freedom: 451128

*** Timestep 56: t=2.79614e+06 years, dt=46144.5 years

Solving mesh displacement system… 13 iterations.

or at

Number of mesh deformation degrees of freedom: 451128

*** Timestep 13: t=650000 years, dt=50000 years

Solving mesh displacement system… 13 iterations.

Solving temperature system… 15 iterations.

Solving noninitial_plastic_strain system … 20 iterations.

Solving plastic_strain system … 19 iterations.

Solving crust_upper system … 16 iterations.

Solving crust_lower system … 17 iterations.

Solving mantle_lithosphere system … 15 iterations.

Solving asthenosphere system … 15 iterations.

or at

Number of mesh deformation degrees of freedom: 87039

Solving mesh displacement system… 0 iterations.

*** Timestep 0: t=0 years, dt=0 years

Solving mesh displacement system… 0 iterations.

Solving temperature system… 0 iterations.

Skipping noninitial_plastic_strain composition solve because RHS is zero.

Solving plastic_strain system … 0 iterations.

Solving crust_upper system … 0 iterations.

Solving crust_lower system … 0 iterations.

Solving mantle_lithosphere system … 0 iterations.

Solving asthenosphere system … 0 iterations.

Rebuilding Stokes preconditioner…

Solving Stokes system… 57+0 iterations.

Relative nonlinear residual (Stokes system) after nonlinear iteration 1: 1

Rebuilding Stokes preconditioner…

Solving Stokes system… 64+0 iterations.

Relative nonlinear residual (Stokes system) after nonlinear iteration 2: 0.0816698

Rebuilding Stokes preconditioner…

The cluster uses Open MPI 3.1.4 built with the GNU 8.3 compilers. I make sure to run on physical cores only and set OMP_NUM_THREADS=1. I also tried requesting exclusive nodes, to no avail. Any ideas?

Thanks,

Rob