Spontaneous slip events of faults driven by rate boundary conditions [pylith2.2.2]

Dear PyLith Team,

I hope this email finds you well. First, I would like to sincerely apologize for reaching out multiple times and for any inconvenience my questions may have caused. I deeply appreciate your time and efforts in addressing user inquiries.

The purpose of this email is to kindly request your guidance on a few challenges I am encountering in my numerical simulations with PyLith. Specifically, I am struggling with the following aspects: mesh quality, PETSc solver settings, the coordinate system, and runtime.

To provide more clarity, I have attached a compressed file to this email. Inside the archive, you will find a file named question.md, which contains detailed descriptions of the issues I am facing. Additionally, the archive includes my configuration files and the model setup, which may help in understanding and diagnosing the problems.

I would greatly appreciate it if you could provide insights or point me to relevant resources to address these issues. Your expertise would be invaluable in helping me move forward with my research.

Thank you very much for your time and support. Please let me know if you require any further information.

Best regards,
Fallow
question.zip (7.6 MB)

I am out of the office for the holidays and will try to respond to your questions sometime in the next week.

Thank you for letting me know. I truly appreciate your willingness to respond despite being on holiday. I will wait for your reply next week. Wishing you a joyful holiday season and a Merry Christmas!

Mesh quality

The maximum condition number for cells in your mesh is 2.4. This should be okay for a starting point for quasistatic simulations. We recommend examining the solution to verify it is smooth and resolves the fields you are interested in. For example, if you are interested in the stress field (which is uniform within a cell in PyLith v2), then you need a finer mesh than if you are only interested in the displacement field. Nevertheless, a smooth stress field gives you a good indication of how well resolved the solution is.
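For reference, here is a minimal sketch of how the condition-number metric can be reported from the same Python script used to build the mesh. The quality command syntax here is an assumption and may differ between Cubit/Trelis versions, so please check the documentation for your version:

# Report the condition-number quality metric for all cells and highlight
# the worst ones in the GUI (journal syntax assumed; verify for your Cubit version).
cubit.cmd("quality volume all condition no. global draw mesh list")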

Sometimes improving the mesh quality results in inverted cells. This may be the case for your mesh. Try dropping the “set tetmesher optimize sliver on” command and running the smoothing only once.

Coordinate system

Your coordinate system in pylithapp.cfg is specified incorrectly. You have

coordsys.projector.projection = tmerc
coordsys.projector.proj_options = +proj=utm +zone=48 +datum=WGS84 +units=m +ellps=WGS84 +north

For a UTM projection, the projection is utm, not tmerc. You should have

coordsys.projector.projection = utm
coordsys.projector.proj_options = +zone=48 +datum=WGS84 +units=m +north

Spatial databases

In most cases, you should not use a geographic coordinate system for points in a spatial database. The interpolation algorithm assumes “distance” has the same meaning for each coordinate. For a geographic coordinate system it will assume 1 degree longitude or latitude is the same distance as 1 m in elevation. Because this is not the case (1 degree is approximately 111 km), the interpolation will not be correct. The only case where it works is if there is a single point in the spatial database (equivalent to a UniformDB) or if the points are only a function of depth (vertical line) or only a function of longitude and latitude (horizontal line or plane). In your case, I recommend using UTM for the mesh and the spatial databases.
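If your spatial database points start out in longitude/latitude (for example, GPS velocities), a minimal sketch of converting them to UTM zone 48 before writing the spatial database might look like the following. This assumes the pyproj package is available; the station coordinates are placeholders:

# Convert a longitude/latitude point to UTM zone 48 north (WGS84) so the
# spatial database uses the same coordinate system as the mesh.
from pyproj import Proj

utm48 = Proj(proj="utm", zone=48, datum="WGS84")

lon, lat = 103.2, 33.4        # placeholder station coordinates in degrees
x, y = utm48(lon, lat)        # easting, northing in meters
print(f"{x:.2f}  {y:.2f}")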

Solver

We strongly recommend using the fieldsplit solver settings. The ASM preconditioner only works well for small problems.

Fault opening errors

Nonplanar faults will generate opening. The smoother the faults, the less opening there will be. With spontaneous rupture (fault friction), the only way to mitigate the fault opening is to increase the zero_tolerance_normal parameter. How large this needs to be depends on how discontinuous the fault orientation is and how large the slip is.
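As a rough sketch of where this parameter lives, you would set it on the fault interface in your .cfg file; the interface name here is a placeholder and the value is only an example, since the appropriate size depends on your problem:

[pylithapp.problem.interfaces.fault]
# Tolerance used when detecting fault opening in the friction solve.
# Increase it if nonplanar fault geometry generates spurious opening.
zero_tolerance_normal = 1.0e-9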

I truly appreciate your assistance and apologize for the delayed response.
At your suggestion, I turned off the ‘set tetmesher optimize sliver’ option and just ran the smoothing step:

cubit.cmd(f"volume 9 21 22 23 24 25 size {DX_FAULT}")
cubit.cmd(f"volume 12 13 size {DX_FAULT2}")
cubit.cmd("volume all scheme tetmesh")
cubit.cmd("mesh volume all")
cubit.cmd(f"volume all smooth scheme condition number beta 1.8 cpu 2")
cubit.cmd("smooth volume all")

But when I checked the .exo file, I found NaNs in the coordinates. In the Cubit GUI, I saw that most of these NaNs were located between two closely spaced faults. I then refined the mesh (increased the number of elements) in the upper crust, and the NaNs disappeared.
Following your suggestion, I modified the coordinate system declaration in the pylithapp.cfg file to:

coordsys = spatialdata.geocoords.CSGeoProj
coordsys.space_dim = 3
coordsys.datum_horiz = WGS84
coordsys.datum_vert = mean sea level
coordsys.projector.projection = utm
coordsys.projector.proj_options = +zone=48 +datum=WGS84 +units=m +north

And in the spatialdb file:

SimpleDB {
  num-values = 3
  value-names =  displacement-rate-x  displacement-rate-y  rate-start-time
  value-units =  mm/year  mm/year  year
  num-locs = 154
  data-dim = 2
  space-dim = 3
  cs-data = geo-projected {
    to-meters = 1.0
    ellipsoid = WGS84
    datum-horiz = WGS84
    datum-vert = mean sea level
    projector = projection {
      projection = utm
      units = m
      proj-options = +zone=48
    }
  }
}
516150.91204381164     3691155.75424592     0.0     8.167     -0.7557     0.0
516150.91204381164     3691155.75424592     -80000.0     8.167     -0.7557     0.0
...

As for the solver, I left my settings unchanged and used the following:

split_fields = True
matrix_type = aij
use_custom_constraint_pc = True

snes_view = true
ksp_monitor_true_residual = true
fs_pc_type = fieldsplit
fs_pc_use_amat = True
fs_pc_fieldsplit_type = multiplicative
fs_fieldsplit_displacement_pc_type = ml
fs_fieldsplit_lagrange_multiplier_pc_type = jacobi
fs_fieldsplit_displacement_ksp_type = preonly
fs_fieldsplit_lagrange_multiplier_ksp_type = preonly

The general changes are as described above. Here is the situation: I ran the simulation on 28 nodes on the server for roughly 10 hours, but it did not complete a single time step:

Preparing to advance solution from time t=0*s to t=3.15576e+07*s.
...
[II] fixing the coarse-level matrix dead dofs
Timestamp                     Simulation t   % complete   Est. completion
2025-01-10 19:51:25.413367       0.00*year            0   TBD

I don’t know if this simulation is running properly.

To understand what the solver is doing, I recommend turning on the monitoring options using the following PETSc options:

[pylithapp.petsc]
ksp_monitor_true_residual = true
ksp_converged_reason = true
ksp_error_if_not_converged = true
ksp_max_it = 2000

snes_monitor_true_residual = true
snes_converged_reason = true
snes_error_if_not_converged = true
snes_max_it = 500

The solver will show the true and preconditioned residuals after each iteration. It will also report the test associated with why it did or did not converge. These settings also limit the number of iterations to something reasonable; an error will be generated if the iteration limit is exceeded, so you will be notified more quickly if the solve is not converging. You may need to increase the iteration limits for your problem, but I would first verify that the residuals are decreasing.

Thank you for your reply. Regarding the monitoring options, I have always had them turned on.
Here are my monitoring options and iteration settings:

# ----------------------------------------------------------------------
# PETSc
# ----------------------------------------------------------------------
[pylithapp.problem.formulation]
split_fields = True
matrix_type = aij
use_custom_constraint_pc = True

[pylithapp.petsc]
snes_view = true
ksp_monitor_true_residual = true
fs_pc_type = fieldsplit
fs_pc_use_amat = True
fs_pc_fieldsplit_type = multiplicative
fs_fieldsplit_displacement_pc_type = ml
fs_fieldsplit_lagrange_multiplier_pc_type = jacobi
fs_fieldsplit_displacement_ksp_type = preonly
fs_fieldsplit_lagrange_multiplier_ksp_type = preonly

#[pylithapp.petsc]
malloc_dump =


ksp_rtol = 1.0e-20
ksp_atol = 1.0e-10
ksp_max_it = 10000
ksp_gmres_restart = 50

ksp_monitor = true
ksp_view = true
ksp_converged_reason = true 
ksp_error_if_not_converged = true

snes_rtol = 1.0e-20
snes_atol = 1.0e-7
snes_max_it = 10000
snes_monitor = true
snes_linesearch_monitor = true
snes_converged_reason = true
snes_error_if_not_converged = true

log_view = true

I run it with the following command:

pylith XXX.cfg --nodes=42 > log.txt

Command line feedback:

[II] fixing the coarse-level matrix dead dofs
[II] fixing the coarse-level matrix dead dofs
[II] fixing the coarse-level matrix dead dofs
...(A lot of the same feedback as above)

The SNES residual norms from the second iteration onward can be extracted from the log file, which I have attached here:

log.txt (6.9 MB)

The solution appears to be converging, just very slowly. When you have faults that extend through much of the domain, this can happen. It is a deficiency of the friction formulation in PyLith v2 and something we are trying to correct in our new friction formulation.

The faults are slipping at the first time step, so you are not starting with locked faults or a starting point in which the faults are just at the failure threshold. If the faults slip only a small fraction of the amount necessary to reach equilibrium in each SNES iteration, then it will take many SNES iterations to reach convergence. If your starting point is closer to equilibrium, it should take fewer SNES iterations.

It may not be affecting the solve, but I noticed the friction solve did not meet the convergence criteria (DIVERGED_ITS). I would increase the number of KSP iterations for the friction solve to 50.

I would also remove ksp_view=true and snes_view=true as they are generating a lot of output that isn’t very useful.
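For example, in the PETSc settings this could look something like the following; 50 is only a starting point and you may need a larger value:

[pylithapp.petsc]
# Allow more KSP iterations for the friction sensitivity solve.
friction_ksp_max_it = 50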

Thank you for reminding me that the friction solve did not meet the tolerance within the 25 iterations that were set.
I tried increasing the number of iterations for the friction solve, but 50 iterations were still not enough, so I increased the limit directly to 200:

friction_pc_type = asm
friction_sub_pc_factor_shift_type = nonzero
friction_ksp_max_it = 200
friction_ksp_gmres_restart = 30
# Uncomment to view details of friction sensitivity solve.
friction_ksp_monitor = true
friction_ksp_converged_reason = true
log_view = true

I am concerned that, after the simulation has been running for a long time, the solver may fail to converge within the specified number of iterations at some time step. At that point the sunk cost would be huge, because the run time is so long that it is difficult for me to adjust the parameter configuration in time. Please let me know if you have any recommendations or best practices I should consider.

As with most large numerical models, the key to efficiently running large, complex simulations is to start with small, simple simulations and slowly add complexity. This helps you build intuition, verify the problem setup and analysis, and gain confidence in your results. We recommend starting in 2D, because the models are much smaller, can usually run on a laptop, and make it easy to debug the problem setup and the general workflow.

Thanks again for your help. Your advice is very useful to me as a newcomer to numerical simulation.