PETSc Error in Poroelastic Simulation with PyLith 4.2.0

Dear Baagaard,

I hope this message finds you well.

I am currently running a poroelastic simulation using PyLith 4.2.0. The input model file is approximately 38 MB in size. However, the simulation fails at the beginning of the time stepping and produces the following PETSc error:
[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: 2147483818 is too big for PetscInt, you may need to ./configure using --with-64-bit-indices.
Could you please advise why this error occurs and how I can resolve it?
For reference, here is the full error message:

>> /public1/home/a8s000279/pylith-4.2.0-linux-x86_64/lib/python3.12/site-packages/pylith/problems/TimeDependent.py:132:run
 -- timedependent(info)
 -- Solving problem.
0 TS dt 1742. time 0.
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: 2147483818 is too big for PetscInt, you may need to ./configure using --with-64-bit-indices
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!

Fatal error. Calling MPI_Abort() to abort PyLith application.

RuntimeError: Error detected while in PETSc function.
Thank you very much for your time and assistance.
Best regards,
Xiaoyang

It would help if you could provide more information, including the entire output after Solving problem.

Are you running in serial or parallel? How many cells and vertices are in the input mesh? Does the same problem (all parameters exactly the same) run with a coarser mesh?

All the files I used are included in the attachments below. Because the model requires a relatively precise stress distribution on the fault, further simplification may not be feasible. The file named error.txt contains the error message I encountered; I would be very grateful if you could kindly take a look at it. The model file exceeds 10 MB, so the upload was unsuccessful.
file.zip (35.6 KB)

The PETSc stack trace indicates the error occurs during preallocation of the sparse matrix. This could mean that 64-bit indices are needed, or that there is some other problem.
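As a quick sanity check on the number in the error message (plain arithmetic, not PyLith-specific code), the reported value just exceeds the range of the default 32-bit signed PetscInt, which is why the message suggests --with-64-bit-indices:

```python
# The default PetscInt is a 32-bit signed integer.
INT32_MAX = 2**31 - 1   # 2147483647, largest value a 32-bit PetscInt can hold
reported = 2147483818   # value from the PETSc error message

print(reported > INT32_MAX)    # True: overflows the default PetscInt
print(reported - INT32_MAX)    # 171: amount by which it overflows
print(reported <= 2**63 - 1)   # True: fits in a 64-bit PetscInt
```

So the preallocation is requesting an index or count just past the 32-bit limit; rebuilding PETSc with 64-bit indices raises that limit to 2**63 - 1.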

Are you running in serial or parallel? How many cells and vertices are in the input mesh?

I am not suggesting that you simplify the mesh or problem for your final run, but checking whether a coarse mesh produces the same error can help identify the source of the error. For example, if a much coarser mesh produces the same error, then the error is probably related to the problem setup. If the coarse mesh does not produce the same error, then it could be related to the mesh or the size of the mesh.

At the moment, I am using single-core serial computation; my previous attempts at running the model in parallel produced no error messages, but the computation did not progress. The mesh contains a total of 249,899 vertices (nodes) and 1,369,327 cells (elements), distributed across 21 mesh blocks.

I noticed that this is a simulation with poroelasticity and faults. We worked on better solver settings for this case at the PyLith hackathon last month. For the solver settings to work, they require an updated version of PETSc. I will try to find time next week to finalize those updates and put together a bugfix release.

In the meantime, the only workarounds I know are to remove the faults or use a much coarser mesh.

Thank you very much for your kind assistance and prompt response. I sincerely look forward to your further updates.

Dear baagaard,

Over the past two weeks, I have attempted various adjustments, but unfortunately the same error remains. I was therefore wondering whether the forthcoming version of your program might help with simulations of poroelastic models that require large amounts of memory. My sincere apologies for troubling you once again, but may I kindly ask when you expect to release the updated version? We are very much looking forward to it.

Wishing you continued success, and thank you once again for your kind support.

Best regards,
Xiaoyang

I am finalizing the bugfix release and anticipate getting it out this week. If everything goes smoothly, I hope to have the release files posted on Wednesday.

Thank you very much for your reply! I look forward to the new release and wish you all the best!

We have fixed the solver specification issues for poroelasticity with a fault in PyLith v4.2.1.

Thank you very much for your update and reminder! I have run many further tests since then, and I found that when using parallel computation, the memory-related errors no longer occur, and for a model size of 13 MB, the calculation converges very quickly. However, when I use a finer mesh (e.g., a model size of 35 MB), the computation takes a very long time, with the output stuck at:

>> /public1/home/a8s000279/pylith-4.2.1-linux-x86_64/lib/python3.12/site-packages/pylith/problems/TimeDependent.py:132:run
 -- timedependent(info)
 -- Solving problem.
0 TS dt 1742. time 0.
    0 SNES Function norm 2.235523221572e-05

I am not sure whether the calculation has entered a deadlock bug or is still running slowly, so I would like to consult you about this situation. In addition, when I include fault surface analysis in the 13 MB model and change solution = pylith.problems.SolnDispPresTracStrainVelPdotTdot to solution = pylith.problems.SolnDispPresTracStrainVelPdotTdotLagrange, the same issue occurs. I am uncertain whether the computation is still running normally in these two cases.

There are several important notes regarding the solver settings:

When running in parallel, the current variable point block Jacobi preconditioner will perform very poorly if the fault is aligned with the split of the domain among processes. We try to avoid this by penalizing it in the partitioner, but that does not always prevent it. This will be resolved in the v5.0.0 release when we build the cohesive cells in parallel. The current workaround is to run in serial.

Start by running the simulation with the finer mesh in serial without state variables. Does the solution converge in a reasonable amount of time?

Thank you very much for your help. I just checked, and although the computation was very slow, it has already completed. I will continue to try simulations that include the fault surface. If run in serial, large models may directly cause segmentation faults or the errors I mentioned above.

Do the large models result in segmentation faults or other errors with PyLith v4.2.1?

Yes, I suspect that the error may be caused by the fact that a single CPU core in my system cannot handle such a large memory requirement, but I am not entirely sure since I currently do not have access to another computational server for comparison. Additionally, I would like to ask you about an issue in my configuration file: after adding

[pylithapp.petsc]
# Turn on TS, KSP, and SNES monitors
ts_monitor = True
ksp_monitor = True
snes_monitor = True
ksp_converged_reason = True
snes_converged_reason = True

# Trigger error if linear or nonlinear solvers fail to converge
ts_error_if_step_fails = True
ksp_error_if_not_converged = True
snes_error_if_not_converged = True

the simulation fails to run. Even if I only keep ts_monitor = True, the error still occurs. I would like to know how I can estimate the computation time for a single run. The error messages I receive include:

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Unsupported viewer True
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR: Option left: name:-fieldsplit_displacement_ksp_type value: gmres source: code
[0]PETSC ERROR: Option left: name:-fieldsplit_displacement_pc_type value: ml source: code
[0]PETSC ERROR: Option left: name:-fieldsplit_pressure_pc_type value: bjacobi source: code

The interface to the PETSc option ts_monitor has changed. Setting ts_monitor = (with an empty value instead of True) is the new way to enable it.

Note: All of these monitors should be turned on by default except ksp_monitor.

Thank you very much for your kind help. Would the other monitors take the same form, i.e., ksp_monitor = and snes_monitor =?

Yes, the new interface has been standardized across all three monitors (ts, snes, and ksp).
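Putting this together, a corrected version of the earlier [pylithapp.petsc] block would look like the sketch below. The converged_reason and error_if_* options are left in their original boolean form, since only the monitor interface is confirmed here to have changed.

```ini
[pylithapp.petsc]
# Monitors: the new interface enables these with an empty value (no True).
ts_monitor =
ksp_monitor =
snes_monitor =

# Left in their original boolean form; the empty-value change is only
# confirmed for the three monitors above.
ksp_converged_reason = True
snes_converged_reason = True
ts_error_if_step_fails = True
ksp_error_if_not_converged = True
snes_error_if_not_converged = True
```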

Thank you very much once again for your generous help!