Error Encountered While Running fastscape_test_open_ghost_nodes.prm in ASPECT-Fastscape Coupling

Hello everyone,

I hope this message finds you well. I am currently working on coupling ASPECT with Fastscape and encountered an error while running the test file fastscape_test_open_ghost_nodes.prm. Could you kindly help me understand what might be causing this error?


*** Timestep 1:  t=20000 years, dt=20000 years
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = double; _Alloc = std::allocator<double>; reference = double&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
SIGABRT received
--------------------------------------------------------------------------
prterun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

   Process name: [prterun-fedora-4751@1,1] Exit code:    1

log.txt (11.6 KB)

Thank you very much for your time and assistance!

Best regards,

Yuan

@YUAN The error indicates that some part of the program is accessing an element of a vector that lies beyond the end of the vector. Since we don’t seem to get this error on any of our platforms, let me ask you a few questions:

  • Because you get the error multiple times, I assume that you are running the program in parallel, right? Do you still get the error if you run with just one process?
  • Are you running in debug mode? If not, what happens if you do?
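
For readers unfamiliar with this assertion: it comes from a bounds check inside std::vector::operator[], which libstdc++ performs when its assertions are enabled (for example via the _GLIBCXX_ASSERTIONS macro). The sketch below only illustrates this class of failure; it is not code from ASPECT or FastScape, and the helper name index_is_out_of_range is made up for the example. It uses at(), which always checks the index and throws an exception instead of aborting.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether reading v[i] would be out of bounds.
// at() performs the same "i < size()" check as the failing assertion,
// but signals a violation by throwing std::out_of_range.
bool index_is_out_of_range(const std::vector<double> &v, std::size_t i)
{
  try
    {
      (void)v.at(i);
      return false;
    }
  catch (const std::out_of_range &)
    {
      return true;
    }
}
```

With a vector of three entries, index 2 is the last valid one and index 3 is already past the end, which is exactly the situation the assertion '__n < this->size()' rejects.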

Best

W.

Thank you for your response. I followed your guidance and conducted the test:

1. When running on a single core, the following error occurred:

*** Timestep 1:  t=20000 years, dt=20000 years
   Initializing FastScape... 3 levels, cell size: 5000 m.
   Executing FastScape... 4 timesteps of 5000 years.
      Writing initial VTK...
[fedora:11762] *** Process received signal ***
[fedora:11762] Signal: Floating point exception (8)
[fedora:11762] Signal code: Invalid floating point operation (7)
[fedora:11762] Failing at address: 0x7fc3af9dfd16
[fedora:11762] [ 0] /lib64/libc.so.6(+0x1a050) [0x7fc3a0427050]
[fedora:11762] [ 1] /home/yuan/fem2/build/libfastscapelib_fortran.so(streampowerlaw_+0x1fac) [0x7fc3af9dfd16]
[fedora:11762] [ 2] /home/yuan/fem2/build/libfastscapelib_fortran.so(fastscape_execute_step_+0x137) [0x7fc3af9ba19d]
[fedora:11762] [ 3] /home/yuan/fem2/aspect/build/aspect(_ZNK6aspect15MeshDeformation9FastScapeILi2EE17execute_fastscapeERSt6vectorIdSaIdEES6_S6_S6_S6_S6_RKdRKj+0x255) [0x184dae7]
[fedora:11762] [ 4] /home/yuan/fem2/aspect/build/aspect(_ZNK6aspect15MeshDeformation9FastScapeILi2EE40compute_velocity_constraints_on_boundaryERKN6dealii10DoFHandlerILi2ELi2EEERNS3_17AffineConstraintsIdEERKSt3setIjSt4lessIjESaIjEE+0xf6f) [0x183e7ff]
[fedora:11762] [ 5] /home/yuan/fem2/aspect/build/aspect(_ZN6aspect15MeshDeformation22MeshDeformationHandlerILi2EE16make_constraintsEv+0x604) [0x186d008]
[fedora:11762] [ 6] /home/yuan/fem2/aspect/build/aspect(_ZN6aspect15MeshDeformation22MeshDeformationHandlerILi2EE7executeEv+0x12b) [0x18694e1]
[fedora:11762] [ 7] /home/yuan/fem2/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE14solve_timestepEv+0x57) [0x10ba7af]
[fedora:11762] [ 8] /home/yuan/fem2/aspect/build/aspect(_ZN6aspect9SimulatorILi2EE3runEv+0x677) [0x10bb329]
[fedora:11762] [ 9] /home/yuan/fem2/aspect/build/aspect() [0x200f856]
[fedora:11762] [10] /home/yuan/fem2/aspect/build/aspect(main+0x5e6) [0x1f34a5f]
[fedora:11762] [11] /lib64/libc.so.6(+0x3248) [0x7fc3a0410248]
[fedora:11762] [12] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc3a041030b]
[fedora:11762] [13] /home/yuan/fem2/aspect/build/aspect(_start+0x25) [0x409b25]
[fedora:11762] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 11762 on node fedora exited on
signal 8 (Floating point exception).

2. In aspect-release mode, it ran normally.

3. In aspect-debug mode, the following error occurred:

*** Timestep 1:  t=20000 years, dt=20000 years
   Initializing FastScape... 3 levels, cell size: 5000 m.
   Executing FastScape... 4 timesteps of 5000 years.
      Writing initial VTK...
[fedora:12042] *** Process received signal ***
[fedora:12042] Signal: Floating point exception (8)
[fedora:12042] Signal code: Invalid floating point operation (7)
[fedora:12042] Failing at address: 0x7f41629dfd16
[fedora:12042] [ 0] /lib64/libc.so.6(+0x1a050) [0x7f4153427050]
[fedora:12042] [ 1] /home/yuan/fem2/build/libfastscapelib_fortran.so(streampowerlaw_+0x1fac) [0x7f41629dfd16]
[fedora:12042] [ 2] /home/yuan/fem2/build/libfastscapelib_fortran.so(fastscape_execute_step_+0x137) [0x7f41629ba19d]
[fedora:12042] [ 3] /home/yuan/fem2/aspect/build/aspect-debug(_ZNK6aspect15MeshDeformation9FastScapeILi2EE17execute_fastscapeERSt6vectorIdSaIdEES6_S6_S6_S6_S6_RKdRKj+0x255) [0x184dae7]
[fedora:12042] [ 4] /home/yuan/fem2/aspect/build/aspect-debug(_ZNK6aspect15MeshDeformation9FastScapeILi2EE40compute_velocity_constraints_on_boundaryERKN6dealii10DoFHandlerILi2ELi2EEERNS3_17AffineConstraintsIdEERKSt3setIjSt4lessIjESaIjEE+0xf6f) [0x183e7ff]
[fedora:12042] [ 5] /home/yuan/fem2/aspect/build/aspect-debug(_ZN6aspect15MeshDeformation22MeshDeformationHandlerILi2EE16make_constraintsEv+0x604) [0x186d008]
[fedora:12042] [ 6] /home/yuan/fem2/aspect/build/aspect-debug(_ZN6aspect15MeshDeformation22MeshDeformationHandlerILi2EE7executeEv+0x12b) [0x18694e1]
[fedora:12042] [ 7] /home/yuan/fem2/aspect/build/aspect-debug(_ZN6aspect9SimulatorILi2EE14solve_timestepEv+0x57) [0x10ba7af]
[fedora:12042] [ 8] /home/yuan/fem2/aspect/build/aspect-debug(_ZN6aspect9SimulatorILi2EE3runEv+0x677) [0x10bb329]
[fedora:12042] [ 9] /home/yuan/fem2/aspect/build/aspect-debug() [0x200f856]
[fedora:12042] [10] /home/yuan/fem2/aspect/build/aspect-debug(main+0x5e6) [0x1f34a5f]
[fedora:12042] [11] /lib64/libc.so.6(+0x3248) [0x7f4153410248]
[fedora:12042] [12] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7f415341030b]
[fedora:12042] [13] /home/yuan/fem2/aspect/build/aspect-debug(_start+0x25) [0x409b25]
[fedora:12042] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 12042 on node fedora exited on
signal 8 (Floating point exception).

@YUAN Good detective work already! You now know that the problem is in the FastScape code, in the function streampowerlaw that appears in your backtrace. The next step is figuring out what the problem is. Can you try to run the single-process, debug mode program in a debugger such as gdb or lldb? It should stop at the line in question, and you can then look around at the specific code where the issue happens. Perhaps that helps us understand how the problem comes about.
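
As background on the signal itself: "Invalid floating point operation" refers to an operation with no defined real result, such as 0/0 or a negative base raised to a fractional power. In floating point arithmetic these normally yield NaN silently; the process receives a signal only when trapping of floating point exceptions has been switched on. The sketch below is a generic illustration that assumes nothing about what streampowerlaw actually computes, and the function name raises_invalid is hypothetical. It uses the standard <cfenv> status flags to detect the invalid operation without trapping.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: returns true if computing base^exponent raises the
// "invalid operation" floating point exception flag (FE_INVALID).
bool raises_invalid(double base, double exponent)
{
  // volatile keeps the compiler from evaluating pow() at compile time,
  // so the flag is really set (or not) at run time.
  volatile double b = base;
  volatile double e = exponent;

  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double result = std::pow(b, e);
  (void)result;

  return std::fetestexcept(FE_INVALID) != 0;
}
```

Here raises_invalid(-1.0, 0.5) reports an invalid operation, while raises_invalid(4.0, 0.5) does not. In a debugger, printing the operands on the line where the program stops should show which kind of input triggers the exception in this case.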

Best

W.