Quasistatic fault nonlinear convergence

Dear PyLith developers,

I'm modeling a horizontal planar fault with constant friction in a 2D (plane-strain) elastic block using FaultCohesiveDyn.
I prescribe initial shear and normal stresses in the domain and then load the top boundary with a shear stress rate of 0.5 MPa/year. I want the simulation to run for 4.6 years and output slip profiles at several time steps in between.
total_time = 4.6*year
max_dt = 0.2*year
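For context, here is roughly how those settings fit into a PyLith 2.x parameter file. This is a sketch, not the original files: the adaptive time stepper is an assumption (max_dt is a TimeStepAdapt parameter), and the Neumann loading uses a hypothetical nodeset name (edge_ypos) with value names taken from the 2.x manual.

[pylithapp.timedependent.implicit]
# max_dt belongs to TimeStepAdapt; TimeStepUniform uses dt instead.
time_step = pylith.problems.TimeStepAdapt

[pylithapp.timedependent.implicit.time_step]
total_time = 4.6*year
max_dt = 0.2*year

# Shear traction rate on the top boundary (sketch).
[pylithapp.timedependent]
bc = [y_pos]

[pylithapp.timedependent.bc]
y_pos = pylith.bc.Neumann

[pylithapp.timedependent.bc.y_pos]
label = edge_ypos
db_rate = spatialdata.spatialdb.UniformDB
db_rate.label = Traction rate on top boundary
db_rate.values = [traction-rate-shear, traction-rate-normal, rate-start-time]
db_rate.data = [0.5*MPa/year, 0.0*MPa/year, 0.0*year]
quadrature.cell = pylith.feassemble.FIATSimplex
quadrature.cell.dimension = 1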

The simulation diverges when slip on the fault is supposed to start.
Here are my settings. I can't attach the log file (it is too large), but I have copied the final error below. I have tried with and without the fault-specific preconditioner; with the preconditioner, SNES diverges at the first time step. If I decrease the time step, the simulation runs longer and I do get slip along the fault, but it still takes many iterations to converge and I run out of cluster time.

I thought this was supposed to be a fairly basic problem. Is there something wrong with my settings? Please advise on the best way to proceed.

# ----------------------------------------------------------------------
# faults
# ----------------------------------------------------------------------

[pylithapp.timedependent.interfaces]

# Change fault to dynamic fault interface.

fault = pylith.faults.FaultCohesiveDyn

[pylithapp.timedependent.interfaces.fault]
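# Slip values smaller than this tolerance are treated as zero (fault locked).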
zero_tolerance = 1.0e-8

# The label corresponds to the name of the nodeset in CUBIT.

label = fault
edge = fault_edge

# We must define the quadrature information for fault cells.
# The fault cells are 1D (surface).

quadrature.cell = pylith.feassemble.FIATSimplex
quadrature.cell.dimension = 1

# Use a contributed double slip-weakening friction model (no healing).

friction = pylith.friction.contrib.DoubleSlipWeakeningFrictionNoHeal
friction.label = DoubleSlipWeakeningFrictionNoHeal

friction.db_properties = spatialdata.spatialdb.SimpleDB
friction.db_properties.label = Double slip weakening friction properties
friction.db_properties.iohandler.filename = spatialdb/1_friction_DSW_single_nucl_1m_3m_fault.spatialdb
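For reference, the SimpleDB file above follows the standard spatialdata layout sketched below. The property names shown are those of PyLith's built-in SlipWeakening model and are placeholders only; a contributed model like DoubleSlipWeakeningFrictionNoHeal defines its own property names and expects those in the database. Values are illustrative.

#SPATIAL.ascii 1
SimpleDB {
  num-values = 4
  value-names = static-coefficient dynamic-coefficient slip-weakening-parameter cohesion
  value-units = none none m Pa
  num-locs = 1
  data-dim = 0
  space-dim = 2
  cs-data = cartesian {
    to-meters = 1.0
    space-dim = 2
  }
}
// Columns: x  y  static-coefficient  dynamic-coefficient  slip-weakening-parameter  cohesion
0.0  0.0  0.6  0.2  0.5  0.0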

# ----------------------------------------------------------------------
# PETSc settings
# ----------------------------------------------------------------------

# NOTE: There are additional settings specific to fault friction.

#[pylithapp.timedependent.formulation]
#split_fields = True
#use_custom_constraint_pc = True
#matrix_type = aij
[pylithapp.petsc]
#fs_pc_type = fieldsplit
#fs_pc_use_amat = true
#fs_pc_fieldsplit_type = multiplicative
#fs_fieldsplit_displacement_pc_type = ml
#fs_fieldsplit_lagrange_multiplier_pc_type = jacobi
#fs_fieldsplit_displacement_ksp_type = preonly
#fs_fieldsplit_lagrange_multiplier_ksp_type = preonly

malloc_dump =

# Preconditioner settings.

pc_type = asm
sub_pc_factor_shift_type = nonzero
#pc_type = lu
#pc_factor_mat_solver_type = umfpack

# Convergence parameters.

ksp_rtol = 1.0e-12
ksp_atol = 5.0e-9
ksp_max_it = 3000
ksp_gmres_restart = 50

# Linear solver monitoring options.

ksp_monitor = true
#ksp_view = true
ksp_converged_reason = true
ksp_error_if_not_converged = true

# Nonlinear convergence parameters.

snes_rtol = 1.0e-12
snes_atol = 1.0e-7
snes_max_it = 3000

# Nonlinear solver monitoring options.

snes_monitor = true
snes_linesearch_monitor = true
#snes_view = true
snes_converged_reason = true
snes_error_if_not_converged = true

# PETSc summary -- useful for performance information.

log_summary = true

# Uncomment to launch gdb when starting PyLith.
#start_in_debugger = true

# Friction sensitivity solve used to compute the increment in slip
# associated with changes in the Lagrange multiplier imposed by the
# fault constitutive model.

friction_pc_type = asm
friction_sub_pc_factor_shift_type = nonzero
friction_ksp_max_it = 50
friction_ksp_gmres_restart = 30

# Uncomment to view details of friction sensitivity solve.

friction_ksp_monitor = true
friction_ksp_view = true
friction_ksp_converged_reason = true

Error:
PC Object: (friction_) 1 MPI processes
type: asm
total subdomain blocks = 1, amount of overlap = 1
restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (friction_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (friction_sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift to prevent zero pivot [NONZERO]
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=1198, cols=1198
package used to perform factorization: petsc
total: nonzeros=7180, allocated nonzeros=7180
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 599 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=1198, cols=1198
total: nonzeros=7180, allocated nonzeros=7180
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 599 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=1198, cols=1198, bs=2
total: nonzeros=7180, allocated nonzeros=7180
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 599 nodes, limit used is 5
3000 SNES Function norm 1.831416565858e-07
Nonlinear solve did not converge due to DIVERGED_MAX_IT iterations 3000
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR:
[0]PETSC ERROR: SNESSolve has not converged
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.10.2, Jul, 01, 2019
[0]PETSC ERROR: /home/bolee/PYLITH_FROM_SOURCE/pylith/bin/mpinemesis on a named node127 by bolee Wed Nov 23 12:49:57 2022
[0]PETSC ERROR: Configure options --prefix=/home/bolee/PYLITH_FROM_SOURCE/pylith --with-c2html=0 --with-x=0 --with-clanguage=C --with-mpicompilers=1 --with-shared-libraries=1 --with-64-bit-points=1 --with-large-file-io=1 --download-chaco=1 --download-ml=1 --download-f2cblaslapack=1 --with-hdf5=1 --with-hdf5-dir=/home/bolee/PYLITH_FROM_SOURCE/pylith --with-zlib=1 --LIBS=-lz --with-debugging=0 --with-fc=0 CPPFLAGS="-I/home/bolee/PYLITH_FROM_SOURCE/pylith/include -I/home/bolee/PYLITH_FROM_SOURCE/pylith/include " LDFLAGS="-L/home/bolee/PYLITH_FROM_SOURCE/pylith/lib -L/home/bolee/PYLITH_FROM_SOURCE/pylith/lib64 -L/home/bolee/PYLITH_FROM_SOURCE/pylith/lib -L/home/bolee/PYLITH_FROM_SOURCE/pylith/lib64 " CFLAGS="-g -O2" CXXFLAGS="-g -O2 -DMPICH_IGNORE_CXX_SEEK" FCFLAGS= PETSC_DIR=/home/bolee/PYLITH_FROM_SOURCE/build/petsc-pylith PETSC_ARCH=arch-pylith
[0]PETSC ERROR: #1 SNESSolve() line 4408 in /home/bolee/PYLITH_FROM_SOURCE/build/petsc-pylith/src/snes/interface/snes.c
[0]PETSC ERROR: #2 void pylith::problems::SolverNonlinear::solve(pylith::topology::Field*, pylith::topology::Jacobian*, const pylith::topology::Field&)() line 152 in …/…/…/pylith-2.2.2/libsrc/pylith/problems/SolverNonlinear.cc
Fatal error. Calling MPI_Abort() to abort PyLith application.
Traceback (most recent call last):
  File "/home/bolee/PYLITH_FROM_SOURCE/pylith/lib/python2.7/site-packages/pylith/apps/PetscApplication.py", line 74, in onComputeNodes
    self.main(*args, **kwds)
  File "/home/bolee/PYLITH_FROM_SOURCE/pylith/lib/python2.7/site-packages/pylith/apps/PyLithApp.py", line 138, in main
    self.problem.run(self)
  File "/home/bolee/PYLITH_FROM_SOURCE/pylith/lib/python2.7/site-packages/pylith/problems/TimeDependent.py", line 203, in run
    self.formulation.step(t, dt)
  File "/home/bolee/PYLITH_FROM_SOURCE/pylith/lib/python2.7/site-packages/pylith/problems/Implicit.py", line 212, in step
    self.solver.solve(dispIncr, self.jacobian, residual)
  File "/home/bolee/PYLITH_FROM_SOURCE/pylith/lib/python2.7/site-packages/pylith/problems/problems.py", line 186, in solve
    def solve(self, *args): return _problems.SolverNonlinear_solve(self, *args)
RuntimeError: Error detected while in PETSc function.

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

[warn] Epoll MOD(1) on fd 28 failed. Old events were 6; read change was 0 (none); write change was 2 (del): Bad file descriptor
[warn] Epoll MOD(4) on fd 28 failed. Old events were 6; read change was 2 (del); write change was 0 (none): Bad file descriptor
/home/bolee/PYLITH_FROM_SOURCE/pylith/bin/nemesis: mpirun: exit 255
/home/bolee/PYLITH_FROM_SOURCE/pylith/bin/pylith: /home/bolee/PYLITH_FROM_SOURCE/pylith/bin/nemesis: exit 1

I would uncomment at least the first line here. ILU sounds like too weak a preconditioner for your problem.

Thanks,

 Matt

I recommend using the fault preconditioner settings that you have commented out:

[pylithapp.timedependent.formulation]
split_fields = True
use_custom_constraint_pc = True
matrix_type = aij

[pylithapp.petsc]
fs_pc_type = fieldsplit
fs_pc_use_amat = true
fs_pc_fieldsplit_type = multiplicative
fs_fieldsplit_displacement_pc_type = ml
fs_fieldsplit_lagrange_multiplier_pc_type = jacobi
fs_fieldsplit_displacement_ksp_type = preonly
fs_fieldsplit_lagrange_multiplier_ksp_type = preonly

I would not use the ASM preconditioner as it usually does a very poor job for simulations with a fault.

You also want to force the solver to converge based on the absolute tolerances, so make the KSP and SNES relative tolerances much smaller, for example 1.0e-20:

ksp_rtol = 1.0e-20
snes_rtol = 1.0e-20

Make sure the linear (KSP) and nonlinear (SNES) solvers converge every time based on the absolute tolerance. The reason this is important is that you want the fault slip to be less than the zero tolerance wherever the fault is locked. If you use loose solver tolerances, you can end up with fault slip in places where the friction criterion would lock the fault; this leads to divergence of the solution.
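In other words, the tolerances need to be mutually consistent: the KSP and SNES absolute tolerances should be somewhat smaller than the fault zero_tolerance, so that slip the solver resolves as "numerically zero" is still treated as zero by the friction criterion. A sketch with illustrative values (scale them to your problem):

[pylithapp.timedependent.interfaces.fault]
# Must be larger than the solver absolute tolerances below.
zero_tolerance = 1.0e-8

[pylithapp.petsc]
# Tiny relative tolerances force convergence on the absolute tolerances.
ksp_rtol = 1.0e-20
ksp_atol = 1.0e-10
snes_rtol = 1.0e-20
snes_atol = 1.0e-9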