Stability problem when solving for temperature and composition

FELIPE · November 1, 2019, 2:35am

Hi all,

  I am running a 3D model with viscosity layering, that in addition has compositional fields, and a phase change.
  I successfully run a long simulation for days. Now in another model, I made the density contrast of the compositional field smaller, the phase transition width thinner, and the thermal viscosity exponent non-zero. This model crashes early on step 22.. I have tried not reducing the density contrast and the phase change width too much, and not to increase the T Visc exponent too much,. but it crashes anyway, always on step 22..

  *** Timestep 22:  t=2.61817e+14 seconds

Solving temperature system… 14 iterations.
Solving C_1 system … ---------------------------------------------------------
TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped to
avoid a possible deadlock.

I am using these:

subsection Mesh refinement

set Refinement fraction = 0.05

set Coarsening fraction = 0.07

set Initial global refinement = 4
set Initial adaptive refinement = 3
set Strategy = strain rate, temperature
set Refinement criteria merge operation = plus

I am also using the following tolerances, which I already changed a bit to try to make it to work:

  subsection Solver parameters

set Temperature solver tolerance = 1e-12

  subsection Stokes solver parameters
        set Linear solver tolerance                = 1e-7
  end

end

what should I do?, should I decrease these tolerances to make the solvers to achieve smaller errors and be able to handle the more stringent parameter combinations?

cheers,
Felipe

bangerth · November 2, 2019, 3:04am

Felipe – did the output contain any other information about the error? For
example, that the compositional solver did not converge?

I was fooled for a minute by thinking that the time of 2.7e14 seconds looks
suspiciously large. A year has about 3e7 seconds, so your time is really just
only around 1e7 years. You can avoid this sort of question by setting the
parameter in the input file that makes sure that times are shown in years
instead of seconds.

Best
W.

FELIPE · November 2, 2019, 4:50am

Hi Wolfgang !

 Thanks for replying.

Part 1) I have run many successful models of this kind. What seems to me is that the latent heat means a lot of effort for Aspect solver…phase change seems to be a complex ingredient (volume changes, heat transfers) for solving.
Yes, these models take a looooooong geologic time to respond (even without latent heat, e.g. viscosity is 10^22 Pa.s in the lower mantle in this model). The thermal instabilities (plume) trigger convection showing up at about 0.2 Giga years… very long time… I think the onset time of the instabilities, which may be physico-mathematically correct (Aspect works well), are unrealistic to Earth, too large probably due to homogeneity of the material. In the real Earth where heterogeneities and even vibrations might help out the triggering and onset of convection.

 I hope not to be misinterpreting things here.

Part 2) The error message is redundant, here some extracts:

*** Timestep 22: t=3.68481e+14 seconds
Solving temperature system… 24 iterations.
Solving C_1 system … ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped to
avoid a possible deadlock.

ERROR: Uncaught exception in MPI_InitFinalize on proc 937. Skipping MPI_Finalize() to avoid a deadlock.

An error occurred in line <493> of file </work/04020/unfelipe/software/candi/aspect/source/simulator/solver.cc> in function
double aspect::Simulator::solve_advection(const aspect::Simulator::AdvectionField&) [with int dim = 3]
The violated condition was:
false
Additional information:
The iterative advection solver did not converge. It reported the following error:

An error occurred in line <1055> of file </work/04020/unfelipe/stampede2/software/candi/install/deal.II-v9.0.0/include/deal.II/lac/solver_gmres.h> in function
void dealii::SolverGMRES::solve(const MatrixType&, VectorType&, const VectorType&, const PreconditionerType&) [with MatrixType = dealii::TrilinosWrappers::SparseMatrix; PreconditionerType = dealii::TrilinosWrappers::PreconditionILU; VectorType = dealii::TrilinosWrappers::MPI::Vector]
The violated condition was:
iteration_state == SolverControl::success
Additional information:
Iterative method reported convergence failure in step 1000. The residual in the last step was 3.68095e+12.

This error message can indicate that you have simply not allowed a sufficiently large number of iterations for your iterative solver to converge. This often happens when you increase the size of your problem. In such cases, the last residual will likely still be very small, and you can make the error go away by increasing the allowed number of iterations when setting up the SolverControl object that determines the maximal number of iterations you allow.
The other situation where this error may occur is when your matrix is not invertible (e.g., your matrix has a null-space), or if you try to apply the wrong solver to a matrix (e.g., using CG for a matrix that is not symmetric or not positive definite). In these cases, the residual in the last iteration is likely going to be large.

Aborting!

ERROR: Uncaught exception in MPI_InitFinalize on proc 868. Skipping MPI_Finalize() to avoid a deadlock.

cheers,
Felipe

FELIPE · November 2, 2019, 6:12pm

Hello Wolfgang, or to whom it may concern,

    Running more test simulations, I realize that it seems Aspect has trouble with the temperature-dependent viscosity (say, the exponent beta increasing away from 0 demands more from the solver). Additionally, making the width of my phase transition smaller than ~35 km also makes the model crash..

My adaptive criterion is plus, temperature and strain.

 What should I do with the solver tolerances?

 Should I initially refine the mesh more ?

Obviously these are universal questions whose answer is guessable…almost easy, but I would appreciate your opinion and experience.

cheers,
Felipe

bangerth · November 3, 2019, 11:06pm

Felipe,

What should I do with the solver tolerances? Should I initially refine the
mesh more ? |

At the end of the day, you have a linear system with a condition number so
large that the solvers can’t find a solution.

Obviously these are universal questions whose answer is guessable…almost easy,
but I would appreciate your opinion and experience.

Your options are:

Accept that you can’t find a solution as accurate as the tolerance you are
asking. Use a larger (less stringent) tolerance. This gives you some answer,
but it may not be physical of course.
Allow for more iterations. This might yield the solution you are looking
for, at the expense of more CPU time.
Modify the model you have: For example, limit the viscosity from above and
below so that the variation is not as large. Alternatively, make sure it
doesn’t vary as rapidly as it may be in your model.
Resolve the mesh better so that the variation you have doesn’t happen from
one cell to the next, but over a number of cells.

None of these options are good, of course. They reflect a balance between what
is possible and what compromises you’re willing to make.

Best
W.

jdannberg · November 5, 2019, 6:07pm

Hi Felipe,

the model crashes when you solve for the composition (not temperature or Stokes). That means you have a solution for the velocity that allows you to solve the temperature equation, but not the composition equation, which I find quite surprising. So I would have a look at the output right before the model crashes to see if anything looks weird with your compositional field (jumps/oscillations/boundary conditions/etc.)

A few more things to look out for:

if you have latent heat being released at your phase transitions, make sure the phase transition is well-resolved, at least 4 mesh cells across the phase transition width. Otherwise the temperature changes are not computed accurately (and results may be very wrong for very coarse resolutions, which might cause the solver to crash).
you probably want to refine your mesh where the phase transition is, which may not necessarily be where the strain rate/temperature jumps. So make sure you have the highest refinement level where your phase transition is, either by adding the viscosity refinement criteria (if you have a viscosity jump at the transition), or by using the minimum refinement function.

Best,
Juliane

Topic		Replies	Views
Issues with ASPECT model for subduction initiation ASPECT	22	171	April 30, 2025
Viscoplastic material model with compositional fields not switching ASPECT material-model	2	786	January 18, 2021
Re: Aspect question ASPECT	5	413	January 3, 2019
Convergence error during simulation of plate thickening process ASPECT	4	50	February 17, 2025
2D subduction model ASPECT 2.5.0 vs. 3.0.0 + convergence issues ASPECT	4	49	April 1, 2025

Stability problem when solving for temperature and composition

set Refinement fraction = 0.05

set Coarsening fraction = 0.07

*** Timestep 22: t=3.68481e+14 seconds Solving temperature system… 24 iterations. Solving C_1 system … ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TimerOutput objects finalize timed values printed to the screen by communicating over MPI in their destructors. Since an exception is currently uncaught, this synchronization (and subsequent output) will be skipped to avoid a possible deadlock.

Aborting!

Related topics

*** Timestep 22: t=3.68481e+14 seconds
Solving temperature system… 24 iterations.
Solving C_1 system … ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped to
avoid a possible deadlock.