Stokes subsystem not converged in 3D spherical shell model

Hello folks,

Hope you all good. I’m having trouble with the convergence of Stokes subsystem. The domain is a 3D spherical shell and the system has temperature (P1), Stokes (P2xP1), 5 compositions (P1). The model can go efficiently for the first 60 steps (each step requires a few nonlinear iterations), after that the residual of Stokes subsystem dominates the total residual and stalls near 1.0e-5 even the max nonlinear iteration =100. The following is the log:

*** Timestep 60:  t=3.35567e+09 years, dt=1e+08 years
   Solving temperature system... 30 iterations.
   Solving peridotite system ... 4 iterations.
   Solving U238 system ... 19 iterations.
   Solving U235 system ... 23 iterations.
   Solving Th232 system ... 16 iterations.
   Solving K40 system ... 22 iterations.
   Solving Stokes system (GMG)... 164+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.000187813, 1.55247e-10, 0.000243278, 0.010344, 2.46304e-05, 0.00319288, 3.682e-06
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.010344

   Solving temperature system... 27 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 14 iterations.
   Solving U235 system ... 20 iterations.
   Solving Th232 system ... 10 iterations.
   Solving K40 system ... 18 iterations.
   Solving Stokes system (GMG)... 60+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.9103e-06, 1.32454e-11, 2.49716e-06, 0.000647025, 8.11117e-08, 0.000114895, 1.40779e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000647025

   Solving temperature system... 22 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 9 iterations.
   Solving U235 system ... 17 iterations.
   Solving Th232 system ... 4 iterations.
   Solving K40 system ... 14 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.93589e-07, 5.94592e-13, 2.56315e-08, 4.04718e-05, 2.6778e-10, 4.13448e-06, 1.12013e-07
Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04718e-05

   Solving temperature system... 16 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 4 iterations.
   Solving U235 system ... 14 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 11 iterations.
   Solving Stokes system (GMG)... 50+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.0975e-08, 5.83545e-13, 2.63044e-10, 2.53153e-06, 1.87047e-12, 1.48778e-07, 1.12668e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53153e-06

   Solving temperature system... 11 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 11 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 7 iterations.
   Solving Stokes system (GMG)... 48+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.52306e-10, 5.84161e-13, 2.75681e-12, 1.58349e-07, 3.79287e-13, 5.35375e-09, 1.11534e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 1.58349e-07


   Postprocessing:
   ...

Number of active cells: 1,057,096 (on 8 levels)
Number of degrees of freedom: 39,398,632 (30,075,738+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842+1,331,842)

*** Timestep 61:  t=3.45567e+09 years, dt=1e+08 years
   Solving temperature system... 34 iterations.
   Solving peridotite system ... 7 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 203+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.00153151, 5.14947e-09, 0.00024315, 0.0103439, 2.45679e-05, 0.00319277, 0.000522223
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.0103439

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 157+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.9239e-06, 1.46586e-11, 2.49577e-06, 0.000647015, 8.08506e-08, 0.000114891, 1.06373e-05
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.000647015

   Solving temperature system... 22 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.9215e-07, 6.67613e-13, 2.5615e-08, 4.04712e-05, 2.65196e-10, 4.13433e-06, 1.06359e-05
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04712e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.08477e-08, 6.53581e-13, 2.62766e-10, 2.5315e-06, 1.9697e-12, 1.48773e-07, 1.06343e-05
      Relative nonlinear residual (total system) after nonlinear iteration 4: 1.06343e-05

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.43424e-10, 6.54363e-13, 2.84011e-12, 1.58347e-07, 4.66064e-13, 5.35355e-09, 1.06349e-05
      Relative nonlinear residual (total system) after nonlinear iteration 5: 1.06349e-05

   Solving temperature system... 6 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 7 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 3 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 3.90573e-11, 6.54317e-13, 5.27704e-13, 9.90467e-09, 4.66409e-13, 1.92648e-10, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 6: 1.06347e-05

   Solving temperature system... 2 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 4 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 1 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 2.53068e-12, 6.5432e-13, 5.27697e-13, 6.19543e-10, 4.66384e-13, 6.96174e-12, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 7: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 2 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.84763e-13, 6.54319e-13, 5.27697e-13, 3.87611e-11, 4.66388e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 8: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 1 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83705e-13, 6.54319e-13, 5.27697e-13, 2.5126e-12, 4.66386e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 9: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 0 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83795e-13, 6.54319e-13, 5.27697e-13, 3.44719e-13, 4.66387e-13, 9.92789e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 10: 1.06347e-05

   Solving temperature system... 0 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 0 iterations.
   Solving U235 system ... 0 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 0 iterations.
   Solving Stokes system (GMG)... 154+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 6.83795e-13, 6.54319e-13, 5.27697e-13, 3.44719e-13, 4.66386e-13, 9.9279e-13, 1.06347e-05
      Relative nonlinear residual (total system) after nonlinear iteration 11: 1.06347e-05

   ... THIS KEEPS GOING, SAME TOTAL RESIDUAL ...

At step 41, a refinement happens, which gives 820,000 active cells, 31 million DoF; At step 61, another refinement gives 1 million active cells, 39 million DoF. So the mesh becomes better, intuitively the Stokes should converge as before.

Part of the prm is also attached in case it’s useful:

set Additional shared libraries                               = ./src/libLaneuville_Material_Model.so, ./src/libLaneuville_Heating_Model.so, ./src/libLaneuville_Temperature_Boundary.so

set Dimension                                                 = 3
set Use years in output instead of seconds                    = true
set End time                                                  = 4.5e9
set Maximum time step                                         = 1.0e8
set Maximum first time step                                   = 1.0e3
set CFL number                                                = 0.5
set Maximum relative increase in time step                    = 50
set Output directory                                          = C0LB_3D_test1
set Resume computation                                        = false

set Nonlinear solver scheme                                   = iterated Advection and Stokes
set Max nonlinear iterations                                  = 20
set Nonlinear solver tolerance                                = 1e-6

subsection Solver parameters
  subsection Stokes solver parameters
    set Stokes solver type                                    = block GMG
    set Linear solver tolerance                               = 1e-9
    set Number of cheap Stokes solver steps                   = 300
    set GMRES solver restart length                           = 200
  end
end

subsection Discretization
  set Composition polynomial degree                           = 1
  set Temperature polynomial degree                           = 1
  set Use discontinuous composition discretization            = false

  subsection Stabilization parameters
    set Stabilization method                                  = entropy viscosity # SUPG
    set beta                                                  = 0.117 #0.078 for 2D, 0.117 for 3D
    set cR                                                    = 0.33
  end
end

subsection Formulation
  set Formulation = Boussinesq approximation
end

subsection Melt settings
  set Include melt transport                                  = false
end

subsection Compositional fields
  set Number of fields                                        = 5
  set Names of fields                                         = peridotite, U238, U235, Th232, K40
  set Types of fields                                         = chemical composition
  set Compositional field methods                             = field
end

subsection Initial composition model
  set Model name = function

  subsection Function
    set Coordinate system                                     = spherical
    set Variable names                                        = r, phi, theta
    set Function constants                                    = ...
    set Function expression                                   = ...
  end
end

subsection Boundary composition model
  set List of model names                                     = initial composition
end

subsection Material model
  set Material averaging                                      = geometric average only viscosity
  set Model name                                              = laneuville

  subsection Laneuville model
    set Chemical density difference                           = 60.0
    set Reference density                                     = 3400.0
    set Reference specific heat                               = 1000.0
    set Reference temperature                                 = 1600
    set Mantle thermal conductivity                           = 3.0
    set Crust thermal conductivity                            = 1.5
    set Reference thermal diffusivity                         = 1.0e-6
    set Thermal expansivity                                   = 2.0e-5
    set Activation energy                                     = 3.0e5
    set Reference viscosity                                   = 1.0e30 # so that we have a large constant viscosity
    set Max viscosity                                         = 1.0e28
    set Max melt fraction                                     = 0.3
    set Core density                                          = 7.4e3
    set Core specific heat                                    = 8.0e2
    
    set Physical radius of thermal conductivity interface     = 1.7e6
  end
end

subsection Geometry model
  set Model name                                              = spherical shell

  subsection Spherical shell
    set Inner radius                                          = 390000.0
    set Outer radius                                          = 1740000.0
    #set Cells along circumference                             = 64
    set Opening angle                                         = 360
  end
end

subsection Boundary velocity model
  set Zero velocity boundary indicators                       = 
  set Tangential velocity boundary indicators                 = top, bottom # top and bottom are slip-free
end

subsection Boundary temperature model
  set Fixed temperature boundary indicators                   = inner, outer #Dirichlet BC, prescribed values
  
  set Model name                                              = laneuville

  subsection Laneuville model
    set Initial bottom temperature                            = 2000.0
    set Initial top temperature                               = 250.0
  end
end

subsection Heating model
  set List of model names                                     = laneuville

  subsection Laneuville model
    set Latent heat                                           = 6.0e5
    set Phase change smooth factor                            = 1.0e-3
    set Partition coefficient of U238 (melt/mantle)           = 1.0
    set Partition coefficient of U235 (melt/mantle)           = 1.0
    set Partition coefficient of Th232 (melt/mantle)          = 1.0
    set Partition coefficient of K40 (melt/mantle)            = 1.0
  end
end

subsection Initial temperature model
  set Model name                                              = function

  subsection Function
    set Coordinate system                                     = spherical
    set Variable names                                        = r, phi, theta
    
    set Function constants  = ...

    set Function expression = ...
  end
end

subsection Gravity model
  set Model name                                              = radial constant
  
  subsection Radial constant
    set Magnitude                                             = 1.62
  end
end

subsection Mesh refinement
  set Initial global refinement                               = 5
  set Initial adaptive refinement                             = 2
  set Strategy                                                = temperature#, viscosity
  set Refinement fraction                                     = 0.2
  set Coarsening fraction                                     = 0.1
  set Time steps between mesh refinement                      = 20
end

subsection Postprocess
... 
end

subsection Nullspace removal
  set Remove nullspace                                        = net rotation
end

subsection Checkpointing
  set Steps between checkpoint                                = 50
end

I also tried AMG which requires more memory. When initial global refinement=5 and initial adaptive refinement=2, 512 GB of memory is used up. AMG also struggles after some 20 steps due to the stagnation of Stokes residual. So my question is: is 1.0e-5 acceptable for such a 3D model? How to improve the convergence rate? is AMG still a backup? Please note that the viscosity over the domain is almost a large constant for this test (Reference viscosity = 1.0e30), I guess the problem would becomes much worse if large contrast of viscosity is present (by setting Reference viscosity = 1.0e21 as I intend to do so).

Any ideas or comments are appreciated, thanks!

MM

Update to this thread:

I set steps between checkpoint=50, so when I resume from step 50, the convergence after Timestep = 61 (including 61) are almost the same as preceding steps:

*** Timestep 61:  t=3.45567e+09 years, dt=1e+08 years
   Solving temperature system... 35 iterations.
   Solving peridotite system ... 8 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 134+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 0.00215462, 6.24459e-09, 0.000243117, 0.0103438, 2.45509e-05, 0.00319275,
 0.00073863
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.
0103438

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 59+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 4.70636e-06, 1.64717e-11, 2.49542e-06, 0.000647014, 8.07835e-08, 0.000114
89, 8.10991e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.
000647014

   Solving temperature system... 23 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes sy
stem): 2.30474e-07, 7.56633e-13, 2.56106e-08, 4.04711e-05, 2.64653e-10, 4.1343e-
06, 8.05186e-07
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04711e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 53+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.30114e-08, 7.38509e-13, 2.62679e-10, 2.53149e-06, 2.11534e-12, 1.48772e-07, 8.05087e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53149e-06

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 52+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 7.7176e-10, 7.39519e-13, 2.83527e-12, 1.58346e-07, 4.85132e-13, 5.3535e-09, 8.04331e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 8.04331e-07


   Postprocessing:
     Writing graphical output:           C0LB_3D_test0/solution/solution-00045
     RMS, max velocity:                  1.69e-09 m/year, 5.55e-09 m/year
     Heat fluxes through boundary parts: -5.621e+09 W, 3.638e+11 W
     CMB Heat flow nearside/farside:     -2762035879.009386/-2858901060.727083
     Writing depth average:              C0LB_3D_test0/depth_average
     Compositions min/max/mass:          -0.5036/-0.08768/-3.315e+18 // 7.284e-09/2.004e-06/5.423e+11 // 1.423e-10/3.918e-08/1.06e+10 // 2.423e-08/6.669e-06/1.804e+12 // 3.458e-09/9.518e-07/2.575e+11

*** Timestep 62:  t=3.55567e+09 years, dt=1e+08 years
   Solving temperature system... 34 iterations.
   Solving peridotite system ... 7 iterations.
   Solving U238 system ... 18 iterations.
   Solving U235 system ... 22 iterations.
   Solving Th232 system ... 15 iterations.
   Solving K40 system ... 21 iterations.
   Solving Stokes system (GMG)... 108+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 0.00190894, 2.1589e-09, 0.000243203, 0.0103439, 2.45778e-05, 0.00319284, 0.000746974
      Relative nonlinear residual (total system) after nonlinear iteration 1: 0.0103439

   Solving temperature system... 28 iterations.
   Solving peridotite system ... 2 iterations.
   Solving U238 system ... 13 iterations.
   Solving U235 system ... 19 iterations.
   Solving Th232 system ... 9 iterations.
   Solving K40 system ... 17 iterations.
   Solving Stokes system (GMG)... 59+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 4.42256e-06, 1.59122e-11, 2.49631e-06, 0.00064702, 8.08653e-08, 0.000114894, 8.10349e-07
      Relative nonlinear residual (total system) after nonlinear iteration 2: 0.00064702

   Solving temperature system... 23 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 8 iterations.
   Solving U235 system ... 16 iterations.
   Solving Th232 system ... 3 iterations.
   Solving K40 system ... 13 iterations.
   Solving Stokes system (GMG)... 55+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 2.14078e-07, 7.18689e-13, 2.562e-08, 4.04715e-05, 2.64552e-10, 4.13442e-06, 8.05264e-07
      Relative nonlinear residual (total system) after nonlinear iteration 3: 4.04715e-05

   Solving temperature system... 17 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 3 iterations.
   Solving U235 system ... 13 iterations.
   Solving Th232 system ... 1 iterations.
   Solving K40 system ... 10 iterations.
   Solving Stokes system (GMG)... 53+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 1.20324e-08, 7.05529e-13, 2.6279e-10, 2.53151e-06, 2.00663e-12, 1.48776e-07, 8.05113e-07
      Relative nonlinear residual (total system) after nonlinear iteration 4: 2.53151e-06

   Solving temperature system... 12 iterations.
   Solving peridotite system ... 0 iterations.
   Solving U238 system ... 1 iterations.
   Solving U235 system ... 10 iterations.
   Solving Th232 system ... 0 iterations.
   Solving K40 system ... 6 iterations.
   Solving Stokes system (GMG)... 52+0 iterations.
      Relative nonlinear residuals (temperature, compositional fields, Stokes system): 7.12177e-10, 7.06256e-13, 2.83693e-12, 1.58348e-07, 4.65912e-13, 5.35366e-09, 8.04349e-07
      Relative nonlinear residual (total system) after nonlinear iteration 5: 8.04349e-07


   Postprocessing:
...

I’m pretty happy that it’s converged, but I’m really concerned about why. Maybe this is the way to improve the convergence rate?

@optimux - Thank you for posting the question to the forum and apologies that it took a bit longer than normal to reply.

Yes, that is a bit concerning. To confirm, what you are seeing is that the convergence behavior is different for the same time steps (or points in time) depending on whether or not you restart from a check point midway through the solution?

If this indeed the case, can you identify from the solution itself how the models are diverging?

I looked through the PRM file, but hesitant to do too much analysis as the model is using a custom material (i.e., it would be helpful to know more about this first).

However, here are a few more general comments and questions about the model setup:

  1. What was the motivation for reducing Composition polynomial degree and Temperature polynomial degree from 2 to 1?
  2. Likewise what was the motivation for changing the SUPG stabilization parameters (beta, cR) from there defaults?
  3. Are you doing averaging of properties between fields in the material model?
  4. Does convergence behavior improve if you reduce the CFL value?

Cheers,

John

Hi @jbnaliboff ,
Thanks for your quick reply. Since I have moved forward a bit, the “60 step convergence” case has been deleted and I’m afraid that is not a reliable way to achieve convergence.

To answer your questions. 1. They used to be 2 but it slows down the code a bit (maybe uses more memory as well, I forgot to see it). So basically I just want to speedup. 2. They are actually default values. 3. In my plugin, I guess not; in prm, I use “geometric average only viscosity”.

For question 4, after many attempts, I found the code is sensitive to the mesh. If mesh refinement happens, the code can hardly resolve whatever small dt or large tolerance used. For example, before refinement I have 98 million DoFs, 2 more levels of refinement would decrease it to 24 million DoFs (active cells drops a lot too). Now keep the mesh unchanged, large tolerance (less strict) would of course allow large dt, but dt also has an upper limit. What I’m struggling with is to find a relative large dt with tolerance as small as possible (i.e., to speedup but better solution). The machine is 4 years old, 64 cores (some 2.2 GHz), memory bandwidth is not large I guess. If dt=1.0e6 years, the total time 4.56 billion years requires 4600 steps, each step has hundreds of iterations. This model cannot be done in a couple of days. The model is actually from Laneuville2013 paper(Asymmetric thermal evolution of the Moon), they used GAIA, a finite volume code. I cannot believe that using the most advanced FEM library, solver backends and hardware, I still cannot model something that has been published 14 years ago. There must be something missing in my prm or something I don’t understand in tuning ASPECT. I can paste all plugins and prm if you’d like to take a look. Some comments in the plugins are outdated and could be misleading! Really appreciate your help!

Regards,
MM
laneuville_heat_flow.cc.txt (5.7 KB)
laneuville_heat_flow.h.txt (26.1 KB)
laneuville_heating_model.cc.txt (8.4 KB)
laneuville_heating_model.h.txt (2.8 KB)
laneuville_material_model.cc.txt (12.8 KB)
laneuville_material_model.h.txt (4.7 KB)
laneuville_shared_state.cc.txt (284 Bytes)
laneuville_shared_state.h.txt (511 Bytes)
laneuville_temperature_boundary.cc.txt (3.6 KB)
laneuville_temperature_boundary.h.txt (2.7 KB)
laneuville2013.prm (12.9 KB)

Hi @jbnaliboff ,
Thanks for your quick reply. Since I have moved forward a bit, the “60 step convergence” case has been deleted and I’m afraid that is not a reliable way to achieve convergence.

To answer your questions. 1. They used to be 2 but it slows down the code a bit (maybe uses more memory as well, I forgot to see it). So basically I just want to speedup. 2. They are actually default values. 3. In my plugin, I guess not; in prm, I use “geometric average only viscosity”.

For question 4, after many attempts, I found the code is sensitive to the mesh. If mesh refinement happens, the code can hardly resolve whatever small dt or large tolerance used. For example, before refinement I have 98 million DoFs, 2 more levels of refinement would decrease it to 24 million DoFs (active cells drops a lot too). Now keep the mesh unchanged, large tolerance (less strict) would of course allow large dt, but dt also has an upper limit. What I’m struggling with is to find a relative large dt with tolerance as small as possible (i.e., to speedup but better solution). The machine is 4 years old, 64 cores (some 2.2 GHz), memory bandwidth is not large I guess. If dt=1.0e6 years, the total time 4.56 billion years requires 4600 steps, each step has hundreds of iterations. This model cannot be done in a couple of days. The model is actually from Laneuville2013 paper(Asymmetric thermal evolution of the Moon), they used GAIA, a finite volume code. I cannot believe that using the most advanced FEM library, solver backends and hardware, I still cannot model something that has been published 14 years ago. There must be something missing in my prm or something I don’t understand in tuning ASPECT. I can paste all plugins and prm if you’d like to take a look. Some comments in the plugins are outdated and could be misleading! Really appreciate your help!

Regards,
MM
laneuville_heat_flow.cc.txt (5.7 KB)
laneuville_heat_flow.h.txt (26.1 KB)
laneuville_heating_model.cc.txt (8.4 KB)
laneuville_heating_model.h.txt (2.8 KB)
laneuville_material_model.cc.txt (12.8 KB)
laneuville_material_model.h.txt (4.7 KB)
laneuville_shared_state.cc.txt (284 Bytes)
laneuville_shared_state.h.txt (511 Bytes)
laneuville_temperature_boundary.cc.txt (3.6 KB)
laneuville_temperature_boundary.h.txt (2.7 KB)
laneuville2013.prm (12.9 KB)

One missing file (cmake):
CMakeLists.txt (783 Bytes)

Additional info found. Given Initial global refinement = 5, if the solver fails to converge (due to large dt or small convergence tolerance or whatsoever), it always complains:

<xx/dealii-9.6.2/include/deal.II/lac/solver_cg.h>
in function
void dealii::SolverCG::solve(const MatrixType&,
VectorType&, const VectorType&, const PreconditionerType&) [with
MatrixType = aspect::MatrixFreeStokesOperators::MassMatrixOperator<3,
1, double>; PreconditionerType = dealii::PreconditionMG<3,
dealii::LinearAlgebra::distributed::Vector,
dealii::MGTransferMF<3, double> >; VectorType =
dealii::LinearAlgebra::distributed::Vector]
The violated condition was:
solver_state == SolverControl::success
Additional information:
Iterative method reported convergence failure in step 100. The
residual in the last step was 1.36073e-06.

I found that 100 steps is hardcoded in constructor
explicit SolverControl(const unsigned int n = 100,
const double tol = 1.e-10,
const bool log_history = false,
const bool log_result = true);

in directory deal.ii/include/deal.II/lac/solver_control.h. This means I cannot change it, otherwise I have to recompile ASPECT. I don’t understand why the max number of iteration is set 100, is 100 iterations large enough? I have seen this 100-step-failure many time in my tests. I’m still unaware of what to do to improve the convergence and speed. Any ideas?

Thanks,
MM

I’ve been following this topic because we have had some convergence problems with ASPECT on adaptively refined meshes as well. I can offer a few suggestions.

  1. Using higher order elements in general should give you much improved accuracy for a given cell size. The use of non-standard elements also means that you are working with features of the code that are less extensively tested and in my experience this means that you’re more likely to encounter problems.
  2. In Laneuville et al. (2013), the rheology is Newtonian. Is your model non-Newtonian? If so, this is a major difference that will significantly increase the computational cost. I’m not familiar with the solver design in GAIA, but it may very well not have any nonlinear iteration if the Stokes and energy equations are solved alternately at each timestep, as is done in many other geodynamic modeling codes. This is something that you can try in ASPECT as well (it is the default nonlinear solver scheme - ‘single Advection, single Stokes’).
  3. If you are dealing with very strongly temperature dependent viscosity and seeing convergence problems, you can try: (a) do harmonic averaging of viscosity and (b) increase the GMRES solver restart length.
  4. Using particles for the compositional field in ASPECT might be a good idea because they preserve sharp interfaces and are not susceptible to numerical diffusion or over/under-shoots associated with solving an advection equation for composition.
  5. I’m not sure how many cores were used in Laneuville et al. 2013, but most 3D models of that era were run on clusters, not on a single node. For 3D regional models, we typically use ~500 cores as a starting point, keeping around 100,000-200,000 degrees of freedom per core.
  6. Default Arguments in C++ - GeeksforGeeks

Hi Max,
Thanks for your suggestions.

  1. Yes absolutely. I used polynomial of degree 1 for compositions and temperature because during the test the memory exploded if the polynomial of degree is 2 and initial global refinement=5, initial adaptive refinement=2. So I was trying to reduce the memory usage by using polynomial of degree 1. Later I found that mesh refinement (whatever initial adaptive or runtime refinement) would decrease the number of DoFs and active cells dramatically (a couple of time less) which leads to solver failure easily. So now I’m using no initial adaptive refinement or runtime refinement, so far using polynomial of degree 2 looks good, memory is under control. I did find some entries for tuning mesh refinement (e.g., refinement fraction), but that requires some time and experience to test.
  2. I followed their EXACT equation, \eta(T) = \exp(\frac{E}{T+T_{surf}} - \frac{E}{T+T_0}) where E is activation energy, T is temperature, T_{surf} and T_0 are surface temperature and reference temperature respectively. For the temperature range in their paper, the viscosity covers (3 \times 10^{19}) \sim 10^{28}, some 8 orders of magnitude, and this large span almost happens near mantle-crust interface, some 40 km below the surface. Additionally I also tried “single Advection, single Stokes” but it did not help much.
  3. Yes, harmonic/geometric averaging of viscosity and GMRES solver length have been tested. I even tried " project to Q1 only viscosity", but not helpful.
  4. The authors used particles for compositions. However, the convergence problem always comes from Stokes subsystem, compositions converge pretty fast. I also tried to unify all compositions to one nominal composition, but later I verified that all compositions should be solved because of their independency.
  5. That’s a bit concerning. It would be a nightmare to have thousands of cores running for months! The problem is that when you increase the size of DoFs, you use a lot of memory, the memory bandwidth and infinite network bandwidth are too slow to feed CPUs, so yes you have a large problem and you would produce fine internal structure, but each core is running slowly. This morning following your suggestions, I’m using polynomial of degree 2 for composition and temperature, the memory usage is pretty stable, some 370 GB, 230 million of DoFs on 64 cores.

Let me finish the tests of polynomial of degree 2 first and I’ll get back to you.

Cheers,
MM