Good afternoon all,
I have some simulations that previously ran successfully with ASPECT 2.5.0, and I am now trying to run them with ASPECT 3.1.0-pre. I am encountering an issue where the Stokes solver fails to converge on the first ‘expensive’ solver step, after the configured number of ‘cheap’ solver steps.
The simulations are 3D global mantle convection models, set up similarly to those in Goldberg & Holt 2024 (https://doi.org/10.1029/2023GC011134), though with different input .txt files for the initial temperature and composition. Adaptive refinement is based only on compositional fields (which I use to mark regions close to subduction zones for high refinement), not on temperature or viscosity. I use the ‘visco plastic’ material model, with a compositional field that defines plate boundaries so that a weak, low-viscosity layer can be imposed there through the material model.
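For context, the compositional-field-based refinement and the weak layer are set up roughly along these lines (a sketch only; the field names here are placeholders, and the actual parameter file is attached as simulation.prm):
```
subsection Compositional fields
  set Number of fields = 2
  set Names of fields  = plate_boundaries, subduction_region   # placeholder names
end

subsection Mesh refinement
  set Strategy = composition threshold
  subsection Composition threshold
    # refine where a compositional field exceeds its threshold,
    # i.e. in the regions close to subduction zones
    set Compositional field thresholds = 0.5, 0.5
  end
end

subsection Material model
  set Model name = visco plastic
  # the plate_boundaries field is used inside the material model
  # to impose the weak, low-viscosity layer
end
```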
Here is part of the error message I get (the full stack trace and solver history are attached).
```
An error occurred in line <2844> of file </work2/10752/ghobson/stampede3/software/aspect/source/utilities.cc> in function
void aspect::Utilities::throw_linear_solver_failure_exception(const std::string &, const std::string &, const std::vector<SolverControl> &, const std::exception &, const MPI_Comm, const std::string &)
The violated condition was:
false
Additional information:
The iterative Stokes solver in
StokesMatrixFreeHandlerLocalSmoothingImplementation::solve did not
converge.
The initial residual was: 1.094939e+20
The final residual is: 1.082211e+19
The required residual for convergence is: 1.094939e+17
See
/scratch/10752/ghobson/Subduction_Model_3D/lumi_run/solver_history.txt
for the full convergence history.
The solver reported the following error:
--------------------------------------------------------
An error occurred in line <2844> of file
</work2/10752/ghobson/stampede3/software/aspect/source/utilities.cc>
in function
void aspect::Utilities::throw_linear_solver_failure_exception(const
std::string &, const std::string &, const std::vector<SolverControl>
&, const std::exception &, const MPI_Comm, const std::string &)
The violated condition was:
false
Additional information:
The iterative (bottom right) solver in
BlockSchurGMGPreconditioner::vmult did not converge.
The initial residual was: 9.996222e-01
The final residual is: 2.212388e-04
The required residual for convergence is: 9.996222e-07
The solver reported the following error:
--------------------------------------------------------
An error occurred in line <1338> of file
</work2/10752/ghobson/stampede3/software/deal.II-v9.5.2/deal.II-v9.5.2/include/deal.II/lac/solver_cg.h>
in function
void
dealii::SolverCG<dealii::LinearAlgebra::distributed::Vector<double,
dealii::MemorySpace::Host>>::solve(const MatrixType &, VectorType &,
const VectorType &, const PreconditionerType &) [VectorType =
dealii::LinearAlgebra::distributed::Vector<double,
dealii::MemorySpace::Host>, MatrixType =
aspect::MatrixFreeStokesOperators::MassMatrixOperator<3, 1,
GMGNumberType>, PreconditionerType = dealii::PreconditionMG<3,
dealii::LinearAlgebra::distributed::Vector<double,
dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<3, double>>]
The violated condition was:
solver_state == SolverControl::success
Additional information:
Iterative method reported convergence failure in step 100. The
residual in the last step was 0.000221239.
This error message can indicate that you have simply not allowed a
sufficiently large number of iterations for your iterative solver to
converge. This often happens when you increase the size of your
problem. In such cases, the last residual will likely still be very
small, and you can make the error go away by increasing the allowed
number of iterations when setting up the SolverControl object that
determines the maximal number of iterations you allow.
The other situation where this error may occur is when your matrix is
not invertible (e.g., your matrix has a null-space), or if you try to
apply the wrong solver to a matrix (e.g., using CG for a matrix that
is not symmetric or not positive definite). In these cases, the
residual in the last iteration is likely going to be large.
```
And here is the part of the parameter file where I set up my solver parameters:
```
set Nonlinear solver scheme    = single Advection, iterated Stokes
set Nonlinear solver tolerance = 1.0e-2
set Max nonlinear iterations   = 30

subsection Solver parameters
  subsection Stokes solver parameters
    set Linear solver A block tolerance = 1e-1
    set Linear solver tolerance = 1e-3
    set Maximum number of expensive Stokes solver steps = 4000
    set Number of cheap Stokes solver steps = 300
    set GMRES solver restart length = 200
  end
  set Temperature solver tolerance = 1e-7
  set Composition solver tolerance = 1e-7
end
```
Reading the nested error messages, it looks like the outer Stokes solve fails because the inner CG solve on the mass matrix (the ‘bottom right’ block in BlockSchurGMGPreconditioner::vmult) stops at 100 iterations without reaching its tolerance (final residual 2.2e-4 versus a required ~1e-6). I am wondering if there is either (a) some difference between 2.5 and 3.1 in how I should set up my solver parameters, or (b) something wrong with my ASPECT installation.
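Regarding (a): the stack trace points at the matrix-free GMG handler (StokesMatrixFreeHandlerLocalSmoothingImplementation), but I do not set the Stokes solver type explicitly anywhere in my parameter file, so if the default solver selection changed between 2.5 and 3.1 that could matter. As a sketch of what I could try to narrow this down (parameter and value names as I understand them from the manual):
```
subsection Solver parameters
  subsection Stokes solver parameters
    # pin the solver explicitly instead of relying on the default;
    # "block AMG" selects the matrix-based AMG solver, "block GMG"
    # the matrix-free geometric multigrid solver the trace shows
    set Stokes solver type = block AMG
  end
end
```
If the AMG solver converges where the GMG one does not, that would at least point towards the preconditioner rather than the installation.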
I am running these simulations on Stampede3 using version 3.1.0-pre (main, 513c8574f), deal.II 9.5.2, Trilinos 15.0.0, p4est 2.8.5, and Geodynamic World Builder 1.0.0. I’m running in release mode on one node with 48 MPI processes, but I have also run a bigger problem (460 million DOFs on 8 nodes) and encountered the same issue. I have attached the steps I took to install ASPECT 3 on Stampede3, in case that is helpful.
If anyone has advice for how I should investigate this, I would really appreciate it.
Best regards,
Gabrielle Hobson
ASPECT_Installation_on_Stampede3.txt (9.2 KB)
solver_history.txt (4.6 KB)
lumi.e2254169.txt (15.5 KB)
lumi.o2254169.txt (2.5 KB)
simulation.prm (7.3 KB)