Stokes subsystem not converged in 3D spherical shell model

optimux · March 5, 2026, 3:09am

Hello folks,

Hope you all good. I’m having trouble with the convergence of Stokes subsystem. The domain is a 3D spherical shell and the system has temperature (P1), Stokes (P2xP1), 5 compositions (P1). The model can go efficiently for the first 60 steps (each step requires a few nonlinear iterations), after that the residual of Stokes subsystem dominates the total residual and stalls near 1.0e-5 even the max nonlinear iteration =100. The following is the log:

*** Timestep 60:  t=3.35567e+09 years, dt=1e+08 years
   Solving temperature system... 30 iterations.
   Solving peridotite system ... 4 iterations.
   Solving U238 system ... 19 iterations.
   Solving U235 system ... 23 iterations.
   Solving Th232 system ... 16 iterations.
   Solving K40 system ... 22 iterations.
   Solving Stokes system (GMG)... 164+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.000187813, 1.55247e-10, 0.000243278, 0.010344, 2.46304e-05, 0.00319288, 3.682e-06
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.010344

   Solving temperature system... 27 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 14 iterations.
   Solving U235 system ... 20 iterations.
   Solving Th232 system ... 10 iterations.
   Solving K40 system ... 18 iterations.
   Solving Stokes system (GMG)... 60+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.9103e-06, 1.32454e-11, 2.49716e-06, 0.000647025, 8.11117e-08, 0.000114895, 1.40779e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000647025

   Solving temperature system... 22 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 9 iterations.
   Solving U235 system ... 17 iterations.
   Solving Th232 system ... 4 iterations.
   Solving K40 system ... 14 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.93589e-07, 5.94592e-13, 2.56315e-08, 4.04718e-05, 2.6778e-10, 4.13448e-06, 1.12013e-07
Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04718e-05

   Solving temperature system... 16 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 4 iterations.
   Solving U235 system ... 14 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 11 iterations.
   Solving Stokes system (GMG)... 50+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.0975e-08, 5.83545e-13, 2.63044e-10, 2.53153e-06, 1.87047e-12, 1.48778e-07, 1.12668e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53153e-06

   Solving temperature system... 11 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 11 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 7 iterations.
   Solving Stokes system (GMG)... 48+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.52306e-10, 5.84161e-13, 2.75681e-12, 1.58349e-07, 3.79287e-13, 5.35375e-09, 1.11534e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 1.58349e-07


   Postprocessing:
   ...

Number of active cells: 1,057,096 (on 8 levels)
Number of degrees of freedom: 39,398,632 (30,075,738+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842)

*** Timestep 61:  t=3.45567e+09 years, dt=1e+08 years
   Solving temperature system... 34 iterations.
   Solving peridotite system ... 7 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 203+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.00153151, 5.14947e-09, 0.00024315, 0.0103439, 2.45679e-05, 0.00319277, 0.000522223
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.0103439

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 157+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.9239e-06, 1.46586e-11, 2.49577e-06, 0.000647015, 8.08506e-08, 0.000114891, 1.06373e-05
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000647015

   Solving temperature system... 22 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.9215e-07, 6.67613e-13, 2.5615e-08, 4.04712e-05, 2.65196e-10, 4.13433e-06, 1.06359e-05
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04712e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.08477e-08, 6.53581e-13, 2.62766e-10, 2.5315e-06, 1.9697e-12, 1.48773e-07, 1.06343e-05
      Relative nonlinear residual (total system) after nonlinear iteration 4: 1.06343e-05

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.43424e-10, 6.54363e-13, 2.84011e-12, 1.58347e-07, 4.66064e-13, 5.35355e-09, 1.06349e-05
      Relative nonlinear residual (total system) after nonlinear iteration 5: 1.06349e-05

   Solving temperature system... 6 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 7 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 3 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.90573e-11, 6.54317e-13, 5.27704e-13, 9.90467e-09, 4.66409e-13, 1.92648e-10, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 6: 1.06347e-05

   Solving temperature system... 2 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 4 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 1 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 2.53068e-12, 6.5432e-13, 5.27697e-13, 6.19543e-10, 4.66384e-13, 6.96174e-12, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 7: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 2 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.84763e-13, 6.54319e-13, 5.27697e-13, 3.87611e-11, 4.66388e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 8: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 1 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83705e-13, 6.54319e-13, 5.27697e-13, 2.5126e-12, 4.66386e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 9: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 0 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83795e-13, 6.54319e-13, 5.27697e-13, 3.44719e-13, 4.66387e-13, 9.92789e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 10: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 0 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83795e-13, 6.54319e-13, 5.27697e-13, 3.44719e-13, 4.66386e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 11: 1.06347e-05

   ... THIS KEEPS GOING, SAME TOTAL RESIDUAL ...

At step 41, a refinement happens, which gives 820,000 active cells, 31 million DoF; At step 61, another refinement gives 1 million active cells, 39 million DoF. So the mesh becomes better, intuitively the Stokes should converge as before.

Part of the prm is also attached in case it’s useful:

set Additional shared libraries                               = ./src/libLaneuville_Material_Model.so, ./src/libLaneuville_Heating_Model.so, ./src/libLaneuville_Temperature_Boundary.so

set Dimension                                                 = 3
set Use years in output instead of seconds                    = true
set End time                                                  = 4.5e9
set Maximum time step                                         = 1.0e8
set Maximum first time step                                   = 1.0e3
set CFL number                                                = 0.5
set Maximum relative increase in time step                    = 50
set Output directory                                          = C0LB_3D_test1
set Resume computation                                        = false

set Nonlinear solver scheme                                   = iterated Advection and Stokes
set Max nonlinear iterations                                  = 20
set Nonlinear solver tolerance                                = 1e-6

subsection Solver parameters
  subsection Stokes solver parameters
    set Stokes solver type                                    = block GMG
    set Linear solver tolerance                               = 1e-9
    set Number of cheap Stokes solver steps                   = 300
    set GMRES solver restart length                           = 200
  end
end

subsection Discretization
  set Composition polynomial degree                           = 1
  set Temperature polynomial degree                           = 1
  set Use discontinuous composition discretization            = false

  subsection Stabilization parameters
    set Stabilization method                                  = entropy viscosity # SUPG
    set beta                                                  = 0.117 #0.078 for 2D, 0.117 for 3D
    set cR                                                    = 0.33
  end
end

subsection Formulation
  set Formulation = Boussinesq approximation
end

subsection Melt settings
  set Include melt transport                                  = false
end

subsection Compositional fields
  set Number of fields                                        = 5
  set Names of fields                                         = peridotite, U238, U235, Th232, K40
  set Types of fields                                         = chemical composition
  set Compositional field methods                             = field
end

subsection Initial composition model
  set Model name = function

  subsection Function
    set Coordinate system                                     = spherical
    set Variable names                                        = r, phi, theta
    set Function constants                                    = ...
    set Function expression                                   = ...
  end
end

subsection Boundary composition model
  set List of model names                                     = initial composition
end

subsection Material model
  set Material averaging                                      = geometric average only viscosity
  set Model name                                              = laneuville

  subsection Laneuville model
    set Chemical density difference                           = 60.0
    set Reference density                                     = 3400.0
    set Reference specific heat                               = 1000.0
    set Reference temperature                                 = 1600
    set Mantle thermal conductivity                           = 3.0
    set Crust thermal conductivity                            = 1.5
    set Reference thermal diffusivity                         = 1.0e-6
    set Thermal expansivity                                   = 2.0e-5
    set Activation energy                                     = 3.0e5
    set Reference viscosity                                   = 1.0e30 # so that we have a large constant viscosity
    set Max viscosity                                         = 1.0e28
    set Max melt fraction                                     = 0.3
    set Core density                                          = 7.4e3
    set Core specific heat                                    = 8.0e2
    
    set Physical radius of thermal conductivity interface     = 1.7e6
  end
end

subsection Geometry model
  set Model name                                              = spherical shell

  subsection Spherical shell
    set Inner radius                                          = 390000.0
    set Outer radius                                          = 1740000.0
    #set Cells along circumference                             = 64
    set Opening angle                                         = 360
  end
end

subsection Boundary velocity model
  set Zero velocity boundary indicators                       = 
  set Tangential velocity boundary indicators                 = top, bottom # top and bottom are slip-free
end

subsection Boundary temperature model
  set Fixed temperature boundary indicators                   = inner, outer #Dirichlet BC, prescribed values
  
  set Model name                                              = laneuville

  subsection Laneuville model
    set Initial bottom temperature                            = 2000.0
    set Initial top temperature                               = 250.0
  end
end

subsection Heating model
  set List of model names                                     = laneuville

  subsection Laneuville model
    set Latent heat                                           = 6.0e5
    set Phase change smooth factor                            = 1.0e-3
    set Partition coefficient of U238 (melt/mantle)           = 1.0
    set Partition coefficient of U235 (melt/mantle)           = 1.0
    set Partition coefficient of Th232 (melt/mantle)          = 1.0
    set Partition coefficient of K40 (melt/mantle)            = 1.0
  end
end

subsection Initial temperature model
  set Model name                                              = function

  subsection Function
    set Coordinate system                                     = spherical
    set Variable names                                        = r, phi, theta
    
    set Function constants  = ...

    set Function expression = ...
  end
end

subsection Gravity model
  set Model name                                              = radial constant
  
  subsection Radial constant
    set Magnitude                                             = 1.62
  end
end

subsection Mesh refinement
  set Initial global refinement                               = 5
  set Initial adaptive refinement                             = 2
  set Strategy                                                = temperature#, viscosity
  set Refinement fraction                                     = 0.2
  set Coarsening fraction                                     = 0.1
  set Time steps between mesh refinement                      = 20
end

subsection Postprocess
... 
end

subsection Nullspace removal
  set Remove nullspace                                        = net rotation
end

subsection Checkpointing
  set Steps between checkpoint                                = 50
end

I also tried AMG which requires more memory. When initial global refinement=5 and initial adaptive refinement=2, 512 GB of memory is used up. AMG also struggles after some 20 steps due to the stagnation of Stokes residual. So my question is: is 1.0e-5 acceptable for such a 3D model? How to improve the convergence rate? is AMG still a backup? Please note that the viscosity over the domain is almost a large constant for this test (Reference viscosity = 1.0e30), I guess the problem would becomes much worse if large contrast of viscosity is present (by setting Reference viscosity = 1.0e21 as I intend to do so).

Any ideas or comments are appreciated, thanks!

MM

optimux · March 5, 2026, 4:50am

Update to this thread:

I set steps between checkpoint=50, so when I resume from step 50, the convergence after Timestep = 61 (including 61) are almost the same as preceding steps:

*** Timestep 61:  t=3.45567e+09 years, dt=1e+08 years
   Solving temperature system... 35 iterations.
   Solving peridotite system ... 8 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 134+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 0.00215462, 6.24459e-09, 0.000243117, 0.0103438, 2.45509e-05, 0.00319275,
 0.00073863
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.
0103438

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 59+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 4.70636e-06, 1.64717e-11, 2.49542e-06, 0.000647014, 8.07835e-08, 0.000114
89, 8.10991e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.
000647014

   Solving temperature system... 23 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 2.30474e-07, 7.56633e-13, 2.56106e-08, 4.04711e-05, 2.64653e-10, 4.1343e-
06, 8.05186e-07
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04711e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 53+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.30114e-08, 7.38509e-13, 2.62679e-10, 2.53149e-06, 2.11534e-12, 1.48772e-07, 8.05087e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53149e-06

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 52+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 7.7176e-10, 7.39519e-13, 2.83527e-12, 1.58346e-07, 4.85132e-13, 5.3535e-09, 8.04331e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 8.04331e-07


   Postprocessing:
     Writing graphical output:           C0LB_3D_test0/solution/solution-00045
     RMS, max velocity:                  1.69e-09 m/year, 5.55e-09 m/year
     Heat fluxes through boundary parts: -5.621e+09 W, 3.638e+11 W
     CMB Heat flow nearside/farside:     -2762035879.009386/-2858901060.727083
     Writing depth average:              C0LB_3D_test0/depth_average
     Compositions min/max/mass:          -0.5036/-0.08768/-3.315e+18 // 7.284e-09/2.004e-06/5.423e+11 // 1.423e-10/3.918e-08/1.06e+10 // 2.423e-08/6.669e-06/1.804e+12 // 3.458e-09/9.518e-07/2.575e+11

*** Timestep 62:  t=3.55567e+09 years, dt=1e+08 years
   Solving temperature system... 34 iterations.
   Solving peridotite system ... 7 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 108+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.00190894, 2.1589e-09, 0.000243203, 0.0103439, 2.45778e-05, 0.00319284, 0.000746974
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.0103439

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 59+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 4.42256e-06, 1.59122e-11, 2.49631e-06, 0.00064702, 8.08653e-08, 0.000114894, 8.10349e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.00064702

   Solving temperature system... 23 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 2.14078e-07, 7.18689e-13, 2.562e-08, 4.04715e-05, 2.64552e-10, 4.13442e-06, 8.05264e-07
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04715e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 53+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.20324e-08, 7.05529e-13, 2.6279e-10, 2.53151e-06, 2.00663e-12, 1.48776e-07, 8.05113e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53151e-06

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 52+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 7.12177e-10, 7.06256e-13, 2.83693e-12, 1.58348e-07, 4.65912e-13, 5.35366e-09, 8.04349e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 8.04349e-07


   Postprocessing:
...

I’m pretty happy that it’s converged, but I’m really concerned about why. Maybe this is the way to improve the convergence rate?

jbnaliboff · March 9, 2026, 9:54pm

@optimux - Thank you for posting the question to the forum and apologies that it took a bit longer than normal to reply.

Yes, that is a bit concerning. To confirm, what you are seeing is that the convergence behavior is different for the same time steps (or points in time) depending on whether or not you restart from a check point midway through the solution?

If this indeed the case, can you identify from the solution itself how the models are diverging?

I looked through the PRM file, but hesitant to do too much analysis as the model is using a custom material (i.e., it would be helpful to know more about this first).

However, here are a few more general comments and questions about the model setup:

What was the motivation for reducing Composition polynomial degree and Temperature polynomial degree from 2 to 1?
Likewise what was the motivation for changing the SUPG stabilization parameters (beta, cR) from there defaults?
Are you doing averaging of properties between fields in the material model?
Does convergence behavior improve if you reduce the CFL value?

Cheers,

John

optimux · March 10, 2026, 11:16am

Hi @jbnaliboff ,
Thanks for your quick reply. Since I have moved forward a bit, the “60 step convergence” case has been deleted and I’m afraid that is not a reliable way to achieve convergence.

To answer your questions. 1. They used to be 2 but it slows down the code a bit (maybe uses more memory as well, I forgot to see it). So basically I just want to speedup. 2. They are actually default values. 3. In my plugin, I guess not; in prm, I use “geometric average only viscosity”.

For question 4, after many attempts, I found the code is sensitive to the mesh. If mesh refinement happens, the code can hardly resolve whatever small dt or large tolerance used. For example, before refinement I have 98 million DoFs, 2 more levels of refinement would decrease it to 24 million DoFs (active cells drops a lot too). Now keep the mesh unchanged, large tolerance (less strict) would of course allow large dt, but dt also has an upper limit. What I’m struggling with is to find a relative large dt with tolerance as small as possible (i.e., to speedup but better solution). The machine is 4 years old, 64 cores (some 2.2 GHz), memory bandwidth is not large I guess. If dt=1.0e6 years, the total time 4.56 billion years requires 4600 steps, each step has hundreds of iterations. This model cannot be done in a couple of days. The model is actually from Laneuville2013 paper(Asymmetric thermal evolution of the Moon), they used GAIA, a finite volume code. I cannot believe that using the most advanced FEM library, solver backends and hardware, I still cannot model something that has been published 14 years ago. There must be something missing in my prm or something I don’t understand in tuning ASPECT. I can paste all plugins and prm if you’d like to take a look. Some comments in the plugins are outdated and could be misleading! Really appreciate your help!

Regards,
MM
laneuville_heat_flow.cc.txt (5.7 KB)
laneuville_heat_flow.h.txt (26.1 KB)
laneuville_heating_model.cc.txt (8.4 KB)
laneuville_heating_model.h.txt (2.8 KB)
laneuville_material_model.cc.txt (12.8 KB)
laneuville_material_model.h.txt (4.7 KB)
laneuville_shared_state.cc.txt (284 Bytes)
laneuville_shared_state.h.txt (511 Bytes)
laneuville_temperature_boundary.cc.txt (3.6 KB)
laneuville_temperature_boundary.h.txt (2.7 KB)
laneuville2013.prm (12.9 KB)

optimux · March 11, 2026, 8:23am

Hi @jbnaliboff ,
Thanks for your quick reply. Since I have moved forward a bit, the “60 step convergence” case has been deleted and I’m afraid that is not a reliable way to achieve convergence.

To answer your questions. 1. They used to be 2 but it slows down the code a bit (maybe uses more memory as well, I forgot to see it). So basically I just want to speedup. 2. They are actually default values. 3. In my plugin, I guess not; in prm, I use “geometric average only viscosity”.

For question 4, after many attempts, I found the code is sensitive to the mesh. If mesh refinement happens, the code can hardly resolve whatever small dt or large tolerance used. For example, before refinement I have 98 million DoFs, 2 more levels of refinement would decrease it to 24 million DoFs (active cells drops a lot too). Now keep the mesh unchanged, large tolerance (less strict) would of course allow large dt, but dt also has an upper limit. What I’m struggling with is to find a relative large dt with tolerance as small as possible (i.e., to speedup but better solution). The machine is 4 years old, 64 cores (some 2.2 GHz), memory bandwidth is not large I guess. If dt=1.0e6 years, the total time 4.56 billion years requires 4600 steps, each step has hundreds of iterations. This model cannot be done in a couple of days. The model is actually from Laneuville2013 paper(Asymmetric thermal evolution of the Moon), they used GAIA, a finite volume code. I cannot believe that using the most advanced FEM library, solver backends and hardware, I still cannot model something that has been published 14 years ago. There must be something missing in my prm or something I don’t understand in tuning ASPECT. I can paste all plugins and prm if you’d like to take a look. Some comments in the plugins are outdated and could be misleading! Really appreciate your help!

Regards,
MM
laneuville_heat_flow.cc.txt (5.7 KB)
laneuville_heat_flow.h.txt (26.1 KB)
laneuville_heating_model.cc.txt (8.4 KB)
laneuville_heating_model.h.txt (2.8 KB)
laneuville_material_model.cc.txt (12.8 KB)
laneuville_material_model.h.txt (4.7 KB)
laneuville_shared_state.cc.txt (284 Bytes)
laneuville_shared_state.h.txt (511 Bytes)
laneuville_temperature_boundary.cc.txt (3.6 KB)
laneuville_temperature_boundary.h.txt (2.7 KB)
laneuville2013.prm (12.9 KB)

One missing file (cmake):
CMakeLists.txt (783 Bytes)

Additional info found. Given Initial global refinement = 5, if the solver fails to converge (due to large dt or small convergence tolerance or whatsoever), it always complains:

<xx/dealii-9.6.2/include/deal.II/lac/solver_cg.h>
in function
void dealii::SolverCG::solve(const MatrixType&,
VectorType&, const VectorType&, const PreconditionerType&) [with
MatrixType = aspect::MatrixFreeStokesOperators::MassMatrixOperator<3,
1, double>; PreconditionerType = dealii::PreconditionMG<3,
dealii::LinearAlgebra::distributed::Vector,
dealii::MGTransferMF<3, double> >; VectorType =
dealii::LinearAlgebra::distributed::Vector]
The violated condition was:
solver_state == SolverControl::success
Additional information:
Iterative method reported convergence failure in step 100. The
residual in the last step was 1.36073e-06.

I found that 100 steps is hardcoded in constructor
explicit SolverControl(const unsigned int n = 100,
const double tol = 1.e-10,
const bool log_history = false,
const bool log_result = true);
in directory deal.ii/include/deal.II/lac/solver_control.h. This means I cannot change it, otherwise I have to recompile ASPECT. I don’t understand why the max number of iteration is set 100, is 100 iterations large enough? I have seen this 100-step-failure many time in my tests. I’m still unaware of what to do to improve the convergence and speed. Any ideas?

Thanks,
MM

maxrudolph · March 11, 2026, 3:45pm

I’ve been following this topic because we have had some convergence problems with ASPECT on adaptively refined meshes as well. I can offer a few suggestions.

Using higher order elements in general should give you much improved accuracy for a given cell size. The use of non-standard elements also means that you are working with features of the code that are less extensively tested and in my experience this means that you’re more likely to encounter problems.
In Laneuville et al. (2013), the rheology is Newtonian. Is your model non-Newtonian? If so, this is a major difference that will significantly increase the computational cost. I’m not familiar with the solver design in GAIA, but it may very well not have any nonlinear iteration if the Stokes and energy equations are solved alternately at each timestep, as is done in many other geodynamic modeling codes. This is something that you can try in ASPECT as well (it is the default nonlinear solver scheme - ‘single Advection, single Stokes’).
If you are dealing with very strongly temperature dependent viscosity and seeing convergence problems, you can try: (a) do harmonic averaging of viscosity and (b) increase the GMRES solver restart length.
Using particles for the compositional field in ASPECT might be a good idea because they preserve sharp interfaces and are not susceptible to numerical diffusion or over/under-shoots associated with solving an advection equation for composition.
I’m not sure how many cores were used in Laneuville et al. 2013, but most 3D models of that era were run on clusters, not on a single node. For 3D regional models, we typically use ~500 cores as a starting point, keeping around 100,000-200,000 degrees of freedom per core.
Default Arguments in C++ - GeeksforGeeks

optimux · March 12, 2026, 1:42am

Hi Max,
Thanks for your suggestions.

Yes absolutely. I used polynomial of degree 1 for compositions and temperature because during the test the memory exploded if the polynomial of degree is 2 and initial global refinement=5, initial adaptive refinement=2. So I was trying to reduce the memory usage by using polynomial of degree 1. Later I found that mesh refinement (whatever initial adaptive or runtime refinement) would decrease the number of DoFs and active cells dramatically (a couple of time less) which leads to solver failure easily. So now I’m using no initial adaptive refinement or runtime refinement, so far using polynomial of degree 2 looks good, memory is under control. I did find some entries for tuning mesh refinement (e.g., refinement fraction), but that requires some time and experience to test.
I followed their EXACT equation, \eta(T) = \exp(\frac{E}{T+T_{surf}} - \frac{E}{T+T_0}) where E is activation energy, T is temperature, T_{surf} and T_0 are surface temperature and reference temperature respectively. For the temperature range in their paper, the viscosity covers (3 \times 10^{19}) \sim 10^{28}, some 8 orders of magnitude, and this large span almost happens near mantle-crust interface, some 40 km below the surface. Additionally I also tried “single Advection, single Stokes” but it did not help much.
Yes, harmonic/geometric averaging of viscosity and GMRES solver length have been tested. I even tried " project to Q1 only viscosity", but not helpful.
The authors used particles for compositions. However, the convergence problem always comes from Stokes subsystem, compositions converge pretty fast. I also tried to unify all compositions to one nominal composition, but later I verified that all compositions should be solved because of their independency.
That’s a bit concerning. It would be a nightmare to have thousands of cores running for months! The problem is that when you increase the size of DoFs, you use a lot of memory, the memory bandwidth and infinite network bandwidth are too slow to feed CPUs, so yes you have a large problem and you would produce fine internal structure, but each core is running slowly. This morning following your suggestions, I’m using polynomial of degree 2 for composition and temperature, the memory usage is pretty stable, some 370 GB, 230 million of DoFs on 64 cores.

Let me finish the tests of polynomial of degree 2 first and I’ll get back to you.

Cheers,
MM

optimux · March 12, 2026, 1:12pm

Hello,

It looks like polynomial degree of 2 didn’t help. The solver SolverCG failed to converge in step 100. I’m running out of ideas Any ideas?
Thanks,
MM

tjhei · March 12, 2026, 4:07pm

Did you post a screen output? Sorry, if I missed that.

Also, could you turn on “output details” Solver parameters — ASPECT 3.1.0-pre of the Stokes solver? That would print information to the screen, that might help diagnose the problem.

jbnaliboff · March 12, 2026, 4:28pm

@optimux - A brief follow-up.

Fist, I second all of the suggestions from @maxrudolph regarding models adjustments and avenues for testing.

In one of your last posts, you noted that:

For the temperature range in their paper, the viscosity covers (3 \times 10^{19}) \sim 10^{28}, some 8 orders of magnitude

I think this is likely the crux of the issue in that 8 orders of magnitude viscosity contrast is right at the very limit of what ASPECT (or most other codes can handle). In the prior publication, did the authors mention what the min/max viscosity bounds are?

Regardless, I would try systematically reducing the max viscosity to 1e27 and 1e26 Pa s to see how both solver convergence and the model results vary. I have found that both GMG and AMG perform quite well for problems with a nonlinear VEP rheology and up to six order of magnitude viscosity contrast and only using the cheap stokes solves.

I have additional thoughts on ways to proceed, but I think the issue above should be discussed first before moving onto other steps.

optimux · March 13, 2026, 10:42am

Hi @jbnaliboff, @tjhei

Thanks for your quick reply! I have tried polynomial degree of 2 for temperature and compositions, as @maxrudolph suggested. There is little difference between degree 1 and 2. This time I also turned on “output details” as @tjhei suggested. The full log and solver_history are uploaded for detailed analysis. For cases similar to this one, the CG solver usually failed around 13, 14 time step.

Laneuville’s paper did set an upper limit of viscosity, 1.0e28 Pa s, but no lower limit is declared. As you can see in log file, the viscosity range covers 3e19-1.0e28 Pa s. I have tested constant viscosity, 1.0e28, and small viscosity change (100 times), they look good.

I got an idea which is borrowed from other dynamic modeling, that is, start with small viscosity contrast, after some time, restart the model with larger viscosity contrast, keep doing this until the mentioned viscosity contrast is reached. Does this sound like cheating or is there better way to do this?

Deeply appreciate your help!
MM
solver_history.txt (3.0 KB)
slurm-4456930.out.txt (46.4 KB)

optimux · March 16, 2026, 6:29am

Hello,

I found a pretty old paper on the test of large viscosity contrast up to 10^{14} using CITCOM: Numerical investigation of 2D convection with extremely large viscosity variations (Moresi and Solomatov 1995). The paper offered detailed description on the algorithm and preconditioning. I’m wondering if we could implement this in ASPECT?

Best,
MM

jbnaliboff · March 16, 2026, 7:54pm

@optimux - The viscosity contrasts that are permissible depend both on the accuracy you want to achieve, the type of solvers being used, etc.

I am not aware of any modern geodynamics code that can accurately go beyond 6-8 orders of magnitude viscosity contrast for a complex constitutive equation, sharp spatial variations in material properties, significant time-dependent deformation, and iterative linear solvers.

Again, I may be wrong on this (others, please chime in), but my sense is that to first order ASPECT’s limits in terms of viscosity contrasts are the limits of other similar numerical codes.

My suggestion is to start with a smaller viscosity range (4x, 6x) and then examine how this affects the model dynamics, solver convergence behavior, and outcomes relative to the prior study.

John

bangerth · March 16, 2026, 8:25pm

@optimux Perhaps also to augment what @jbnaliboff just said: To first order, those parts of the material with viscosity Y move slower by a factor of Y/X than material with a viscosity X. So if you think of geologic settings where the maximal velocities are on the order of, say, 10 cm/year, then if you have a viscosity range of six orders of magnitude, you should expect velocities on the order of 0.00001 cm/year where the viscosity is large. Of course, if you had a viscosity range of ten orders of magnitude, it would be 0.0000001 cm/year – and you get the idea: At these velocities, it really doesn’t matter which value you get: They will be geologically indistinguishable. As a consequence, it is rarely necessary in simulations to actually have viscosity ratios that are much larger than, say, six orders of magnitude.

(The argument isn’t quite right because the viscosity doesn’t affect the velocity, but instead the strain rate. But you get the general gist of the argument.)

Best
W.

optimux · March 17, 2026, 2:07am

Hi @bangerth, thanks for this useful and practical explanation. Now the problem becomes to which extent do you believe, or how do you choose that the large viscosity layer behaves like a layer of realistic large viscosity? I have always been struggling with the trade off between the math (theoretical constraints) and the practice (loose constraints but be more practical). Next I would try smaller upper limit viscosity and I’ll keep you posted. Thanks!

Cheers,
MM

optimux · March 17, 2026, 2:24am

Hello @jbnaliboff, thanks for your reply. What you described from your experience is REALLY important, which builds up my confidence to use smaller viscosity contrast. And I do agree that this is battle among mesh, dt, tolerance and material property variation(viscosity), based on my preliminary tests. Like @bangerth suggested, I would run some cases with smaller viscosity contrast. Let’s find out!

Cheers,
MM

bangerth · March 17, 2026, 7:53pm

@optimux What I was trying to say is basically this: If you have a weak layer with a viscosity of 10^18, then it will not really matter whether you give your crust a viscosity of 10^24 or 10^25 or 10^26. It is always useful to play around with different settings and observe what changes if only because that gives you experience of what each setting affects, but it’s not likely that there will be substantial differences between the three values – the crust will for all practical purposes look like a rigid one because the internal deformations will be so small.

Best
WB

optimux · March 20, 2026, 8:05am

Hi @bangerth @jbnaliboff thanks for being so patient. Now I can tell what practical solution means. The convergence is much better now. But I got another problem which requires deep understanding of ASPECT source code and framework. I guess I need to have another post.

Cheers,
MM

Topic		Replies	Views
Question on setting up Stokes solver parameters using Aspect 3.1.0-pre, solver fails to converge ASPECT	6	163	March 16, 2026
Re: Aspect question ASPECT	5	496	January 3, 2019
Question regarding buckling model ASPECT	11	398	January 23, 2024
Problems in session 3 of CIG ASPECT tutorials ASPECT	9	338	July 24, 2024
Problem running continental extension cookbook on recently installed version of ASPECT ASPECT	12	445	July 3, 2024

Stokes subsystem not converged in 3D spherical shell model

Related topics