PETSc ERROR when running on cluster

Hi all,
I have compiled PyLith successfully from source following the INSTALL instructions.
To verify the installation, I ran a dynamic rupture simulation that had already run successfully on another cluster. However, the run aborted with this message:

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
[0]PETSC ERROR: Petsc has generated inconsistent data
[0]PETSC ERROR: There is a size mismatch in the SF embedding, 48 != 744

I requested 48 cores on 3 nodes using a SLURM batch system.
What could be the reason here?

Thank you!

Thank you, Charles! I have attached the output files here.
errorfile.txt (3.1 KB)
outfile.txt (4.7 KB)

Ge

Dear Ge,

It would be helpful if we could see the entire output (stderr and stdout).

Cheers,

Charles

Charles Williams | Geodynamic Modeler
GNS Science | Te Pū Ao

Charles, thanks for reminding me! I have attached the file in the original post.

Dear Ge,

Sorry for the slow response. I’m still not sure what the problem might be. The error appears to occur during the partitioning process. I don’t think this is the cause, but there could be a memory issue. How large is this problem (e.g., mesh size), and how much memory does your cluster have per node? Also, are you running the same version of PyLith that was used for the successful run on the other cluster?
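
As a rough way to frame the memory question, the sketch below estimates how much memory each process gets when ranks are spread evenly over the nodes. The 48 ranks on 3 nodes come from the original post; the 64 GB per node is only a placeholder, and the helper names are hypothetical. It says nothing about how much memory the simulation actually needs.

```python
def ranks_per_node(total_ranks: int, nodes: int) -> int:
    """Ranks on the busiest node, assuming ranks are spread as evenly as possible."""
    return -(-total_ranks // nodes)  # ceiling division


def memory_per_rank_gb(mem_per_node_gb: float, total_ranks: int, nodes: int) -> float:
    """Approximate memory available to each rank on the busiest node."""
    return mem_per_node_gb / ranks_per_node(total_ranks, nodes)


if __name__ == "__main__":
    # 48 ranks on 3 nodes: from the original post.
    # 64 GB per node: placeholder, replace with the real value for your cluster.
    print(ranks_per_node(48, 3))            # ranks sharing one node
    print(memory_per_rank_gb(64.0, 48, 3))  # GB per rank under that assumption
```

Comparing this figure with the memory used on the cluster where the run succeeded would show whether memory is worth investigating further.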

Cheers,
Charles

This is probably a PETSc bug. You can report it to petsc-maint@mcs.anl.gov.