PETSc TS options ignored when computing Green's functions & GMRES recursion formula far from computed residual norm

Hi,

I would like to calculate GFs on a fault that has a specific geometry, and a rigidity varying with depth. I started building the mesh using examples like Vertical Cross-Section of a Reverse Fault with Splay & the horizontal strike slip. All is going well when I keep the minimum cell size at 1000m for the mesh (mesh_tri_smtNTGEO_test.msh), and pylith (V3.0.2) works fine.

  1. The problem is, is that I need a much smaller dx size, thus I changed the cell size to 100m first to do a test (mesh_tri_smtNTGEO_testdx100.msh). This gave some errors with pylith (the Linear solve did not converge due to DIVERGED_ITS iterations 10000 that I have seen around on here already), so I saw that I should change the length scale of the problem, which sounds reasonable if I also have a smaller cell size (I changed it to 10m). However, the convergence problem is still there.

  2. To try to counter this, I switched back to the mesh that I was using before (mesh_tri_smtNTGEO_test.msh), kept the length scale at 10m, but now tried to increase the ksp_atol, ksp_rtol, snes_atol, and snes_rtol (since that should help converge the solver, right?), but it seems like pylith is ignoring these settings (Ignoring PETSc options (already set):).

I am not sure what to do, and if it maybe is an error due to gmsh? Even though everything is the same except for the cell size?
Please find attached all the scripts that I am using and the error output.

Thank you!

Problem.zip (891.0 KB)

A factor of 10 decrease in cell size is a huge difference. Do you have sufficient data to resolve fault slip on a 100 m length scale? What happens if you decrease the cell size by a factor of 2 to 500 m (assuming it is justified by your data resolution)?

Hi Brad,

Thanks for your quick response. I have made a little table of the following combinations of triangle size and length scale.
CONVERGES? TRIANGLE SIZE LENGTH SCALE
YES 1000 1000
YES 1000 500
NO 1000 10
NO 500 1000
NO 500 500
NO 500 10
My domain is 60km wide and 5 km deep, therefore a triangle size of 1000m is too big for me.

I think my main questions are:
→ Maybe I am understanding your message wrong, but by sufficient data, do you mean the material.spatialdb?
→ Why does pylith ignore the PETSc options I want to set?

By sufficient data, I am referring to the observations used to constrain your inversion with Green’s functions. See the strikeslip-2d Green’s function example. We use 200 observations (GPS stations) in the inversion for fault slip over a 30 km long rupture. Your input file only has 3 GPS stations, suggesting you have very few constraints on the fault slip. For 3 observations, you can probably only constrain a 1 or 2 parameters.

I looked at your mesh and the domain is much too small for this type of problem. You want the fault to be in the middle of the domain and the edges of the domain to be several fault lengths away. For a rectangular domain, the length is usually about 2 times the width (depth). So I recommend a model about 180-300 km wide and 90-150 km deep. This should also help resolve the issues you are having with the solver. The solver struggles when the fault cuts through the entire domain with a very fine grid.

Your PETSc solver settings are not being ignored. If you look in the output/fdra_greensfns-parameters.json file you will see your values are being used. The output shows that the following defaults are being overwritten:

 -- Ignoring PETSc options (already set):
ksp_atol = 1.0e-12
ksp_error_if_not_converged = true
ksp_rtol = 1.0e-12
snes_atol = 1.0e-9
snes_error_if_not_converged = true
snes_rtol = 1.0e-12

I will improve the message to read “Ignoring default PETSc options (already set):”.

I strongly advise against changing the PETSc solver settings in this situation. Relaxing the convergence tolerance to such large values (rtol = 1.0e-5) will reduce the accuracy of the solution.

1 Like

Thanks Brad, that is super helpful! I will start working on it! Aah so, If I read correctly, the smaller cell size should work with the larger domain, and then I just increase the bias so the cells are coarser closer to the sides?

Yes, the gps stations… I am actually only interested in the stresses on the fault (I will use them for our BEM code). Can I just leave them out? I haven’t tried it yet, since I thought it would be of no influence to the stresses and strain that are calculated on the fault.

A smaller cell size should work with the larger domain, but I doubt something smaller than about 500 m is justified.

You can select whatever output you want for a Green’s function (or other simulation) that meets your needs. If you don’t have a need for output at discrete points on the surface, then don’t include them.

Hi Brad,

just one more clarification: I am not looking to invert for slip. What I want from Pylith is just the stresses on the fault, so I do not need the GPS stations as described in the examples. Is it really not possible to go to a cell size of around 20 m for that?

You do not need the GPS stations as described in the example if you are not interested in output at those locations.

You should be able to generate a mesh at whatever resolution you want. I don’t think a cell size of less than several hundred meters is scientifically meaningful.

Note: PyLith v3 does not yet support output of the traction changes on the fault. It does support output of the stresses in the domain. We are actively working on getting output of fault tractions implemented.

Hi Brad,

Thank you for your answer. I have a couple of questions:

  • What exactly do you mean by traction changes on the fault?
  • Why doesn’t the GF calculation follow the mesh from downdip to updip, but rather does the calculations in a very random pattern?
  • Why does the triangle size along the fault vary? (it is smaller at some points and larger at others, but I would expect a constant triangle size)

I have managed to get the triangle sizes of the cells down to 10m’s, which is good, however there are still some problems. As you have seen in the mesh I sent earlier, I am trying to include a seamount in my geometry. To create this, I looked at the example of the 2D subduction zone, where you are using a bspline to create the oceanic crust, and a spline to create the geometry.

I tried both the bspline and spline to approximate the seamount geometry, but I keep getting errors. The error is the ‘‘Residual norm computed by GMRES recursion’’, which, in the error section of the manual, is explained as a mesh error. However, in my case I do not get this error at the start of the simulation, but at very random points along the fault line. Sometimes it happens halfway through the calculations of the GFs, other times it errors on the last source. Looking at my mesh, I could not find any weird cells (there is an exception here when using a spline, but it is not relevant for my main question), and also the condition number is around 0.9.

So, to find out what is going wrong, I made a very simple test case. In the model I define 3 points (p5, p6, p7), that I connect via a bspline/spline curve (c_faultline), after which I split it up again into 2 separate curves (c_faultext, and c_fault). If I run this mesh in pylith, with a small cell size, it stops calculating green’s functions after 387 out of 647 sources, but, constructing the fault the way as is described in the example on the reverse fault, it runs to the end.

  • Why is this error only showing up when I want to use a curve created using bspline/spline? And what does it imply if I cannot find anything wrong with the mesh? I have included my scripts, so that you can check them if you would be interested.

testGF.zip (445.6 KB)

Thank you!

Here are answers to the easy questions (it may take me a week or more to find time to try to run your example and have an idea of why the solver is having trouble).

  1. Tractions change on the fault refers to stress changes resolved onto the fault surface. Stress on a surface corresponds to tractions. So stress changes correspond to changes in tractions.

  2. The oder of the impulses in a Green’s function calculation follows the numbering of the degrees of freedom on the fault. These can jump around as a result of the topology of your mesh, whether you reordered and mesh, and if you run in parallel.

  3. The cell (triangle) size depends on the mesh generation algorithm and what constraints you provide to it. Unstructured mesh generation algorithms are quite sophisticated and adjust the cell size to balance mesh quality and constraints on the cell size provided by the user. Most mesh generation software allows you to specify a cell size on exterior and interior boundaries. However, they may not be met exactly. For example, if you specify a cell size of 1000 m on a boundary that is 8400 m long, an algorithm would likely need to decide whether to use a uniform cell size of 1050 m or keep most cells 1000 m and use a smaller or larger cell size for the remaining cells.

I found some minor issues in the mesh generation:

  1. I would turn off the default sizing options.
  2. You have a very fine cell size at evenly spaced intervals along the fault. That is, the cell size is not uniform along the fault but varies in periodic fashion. This is a result of the default value of Sampling for the distance field. See Distance under Gmsh 4.11.0. Increasing the Sampling value to 100 results in a uniform cell size along the fault.

Here is the code excerpt with these changes:

# We turn off the default sizing methods.
gmsh.option.set_number("Mesh.MeshSizeFromPoints", 0)
gmsh.option.set_number("Mesh.MeshSizeFromCurvature", 0)
gmsh.option.set_number("Mesh.MeshSizeExtendFromBoundary", 0)

# First, we setup a field `field_distance` with the distance from the fault.
fault_distance = gmsh.model.mesh.field.add("Distance")
gmsh.model.mesh.field.setNumbers(fault_distance, "CurvesList", [self.c_fault])
gmsh.model.mesh.field.setNumber(fault_distance, "Sampling", 100)

A possible workaround for the error Residual norm computed by GMRES recursion formula 2.13611e-12 is far from the computed residual norm 5.76337e-12 at restart, residual norm at start of cycle 1.56429e-11 is to increase the GMRES restart value so that it is about 0.5 - 1.0 times the number of KSP iterations. You can set this via the command line
--petsc.ksp_gmres_restart=500 or in a .cfg file:

[pylithapp.petsc]
ksp_gmres_restart = 500

With these changes, I was able to generate a mesh with a cell size of 100m along the fault and generate Green’s functions without any errors using the files your provided.

Hi Brad,
The changes you proposed worked very well indeed, thank you!
I am currently trying to expand the mesh to include a seamount. It is here that I am running into errors again and that I have a general gmsh question about. I am using V3.03.

I have included the script for the mesh below + the mesh, spatial database and .cfg files.
With a planar fault it works great, but as soon as I implement the seamount, I get an error about the GMRES recursion formula again after a few GF calculations (so not immediatelly). Next to this error message, it also shows another warning that I have not seen before. I copied it below:

– Component ‘problem’: Computing Green’s function 49 of 667.
0 SNES Function norm 7.591628426935e-01
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Residual norm computed by GMRES recursion formula 3.56289e-09 is far from the computed residual norm 1.10299e-07 at restart, residual norm at start of cycle 7.23156e-07
[0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc!
[0]PETSC ERROR: Option left: name:-ts_error_if_step_fails (no value)
[0]PETSC ERROR: Option left: name:-ts_monitor (no value)
[0]PETSC ERROR: Option left: name:-ts_type value: beuler

All .cfg files etc. are exactly the same as for the planar case, only the mesh is different.

Then, a general question I had about gmsh:
Is it possible to start a fault somewhere in the middle of the domain? Should I add a label for fault start then?

Thank you very much!

problem.zip (196.1 KB)

If you are computing Green’s functions you can ignore the warning about PETSc TS (time stepping) options not being used. We set some basic TS options by default. When the TS is not being used, as in the case of Green’s functions, you get a warning about the options not being used.

The error about the GMRES residual norm being far from the computed residual norm is an error we have started seeing more in the past 6-9 months. The workaround is simply to increase the gmres_restart to something larger. Try ksp_gmres_restart = 500.

Hi Brad,
Thanks for your quick reply! I am at ksp_gmres_restart = 3000 already… Should I just keep increasing it?

I would think that ksp_gmres_restart = 3000 would be sufficient, because the number of KSP iterations would be much less than 3000. How many KSP iterations are used in the solve?

Hmm, but with 3000 I get the error message after a few GF calculations.

ksp iterations show up as converged_rtol iterations? Then I am around 1200 every GF calculation.

Hi Brad,

I am checking this again, and I see that the CONVERGED_RTOL iterations are 5074. It seems that with a ksp_gmres_restart = 1000 the script calculates a few GFs before it errors, but if I set ksp_gmres_restart = 3000 or higher it errors immediately at the first calculation. Is there any other way I can prevent or work around the GMRES error?

Thanks!

A quick look at the mesh suggests that the problem is likely associated with the highly distorted triangles where the fault meets the free surface. These distorted triangles will result in ill-conditioning of the Jacobian matrix cause convergence issues. I suggest changing the fault geometry locally near the up-dip end so that it reaches the free surface at a larger dip angle. Here is a schematic:


Use the black geometry rather than the red geometry in the figure above.

Let me know if fixing the fault geometry works. I talked with Matt Knepley about your error message yesterday and we probably need to dig into the solver settings with your simulation if the above fix doesn’t work.

1 Like

Hi Brad,

It seems to be working now! I will keep this in mind, thank you!