Problem running continental extension cookbook on recently installed version of ASPECT

Lukel · February 27, 2024, 2:57pm

Hi,

Recently I’ve been working with the ARCHER2 help desk and tried installing the latest version of ASPECT on the ARCHER2 supercomputer (https://www.archer2.ac.uk/), but I’m having a problem with the latest version of ASPECT (ver. 2.6-pre, f75d95afe).

Firstly, I tried installing ASPECT 2.5 (bafd9df3e) from GitHub - geodynamics/aspect at aspect-2.5. Long story short, after a lot of back and forth with the helpdesk I managed to get this version of ASPECT working, yay. Attached are the working instructions on how to install ASPECT (ver. 2.5, bafd9df3e) on ARCHER2 (without LAPACK or BLAS).

Subsequently, I tried updating to the main branch (2.6-pre, f75d95afe when I last tried), and after finding out how to install SUNDIALS, the installation went through with no error messages (the installation instructions I used are attached as well). However, when I started running a handful of cookbook models, it became noticeable that they were taking significantly longer than they should. To test what was wrong, I ran the continental extension cookbook model until 10 Myr on the working version of 2.5 and on this recently installed version of 2.6-pre. Overall, both models look relatively similar (images below), so nothing drastically unexpected happened, but ver. 2.5 took ~21 minutes while 2.6-pre took ~250 minutes on one node (128 cores). After looking at the timing summary, it became apparent that something wasn’t right (the key differences between the two models are highlighted in bold):

Version 2.5 - Continental extension cookbook model run until 10 Myr:
+---------------------------------+-----------+------------+------------+
| Section | no. calls | wall time | percent of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system | 502 | 9.24s | 0.71% |
| Assemble Stokes system Picard | 580 | 10.6s | 0.82% |
| Assemble Stokes system rhs | 78 | 1.23s | 0% |
| Assemble composition system | 2510 | 46.9s | 3.6% |
| Assemble temperature system | 502 | 12.4s | 0.95% |
| Build Stokes preconditioner | 580 | 21.7s | 1.7% |
| Build composition preconditioner | 2508 | 2.12s | 0.16% |
| Build temperature preconditioner | 502 | 0.436s | 0% |
| Initialization | 1 | 1.1s | 0% |
| Mesh deformation | 502 | 17.1s | 1.3% |
| Mesh deformation initialize | 2 | 1.58s | 0.12% |
| Postprocessing | 501 | 35.8s | 2.8% |
| Refine mesh structure, part 1 | 1 | 0.0429s | 0% |
| Refine mesh structure, part 2 | 1 | 0.056s | 0% |
| Setup dof systems | 2 | 1.98s | 0.15% |
| Setup initial conditions | 2 | 0.294s | 0% |
| Setup matrices | 502 | 64.5s | 5% |
| Solve Stokes system | 580 | 1.05e+03s | 81% |
| Solve composition system | 2508 | 15.5s | 1.2% |
| Solve temperature system | 502 | 2.99s | 0.23% |
+---------------------------------+-----------+------------+------------+

Version 2.6-pre - Continental extension cookbook model run until 10 Myr:
+---------------------------------+-----------+------------+------------+
| Section | no. calls | wall time | percent of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system | 502 | 8.91s | 0% |
| Assemble Stokes system Picard | 3998 | 71.9s | 0.48% |
| Assemble Stokes system rhs | 3496 | 59.1s | 0.39% |
| Assemble composition system | 2510 | 61.2s | 0.41% |
| Assemble temperature system | 502 | 12.5s | 0% |
| Build Stokes preconditioner | 3998 | 147s | 0.98% |
| Build composition preconditioner | 2508 | 2.03s | 0% |
| Build temperature preconditioner | 502 | 0.411s | 0% |
| Initialization | 1 | 1.07s | 0% |
| Mesh deformation | 502 | 17.8s | 0.12% |
| Mesh deformation initialize | 2 | 1.21s | 0% |
| Postprocessing | 501 | 43.5s | 0.29% |
| Refine mesh structure, part 1 | 1 | 0.0369s | 0% |
| Refine mesh structure, part 2 | 1 | 0.0441s | 0% |
| Setup dof systems | 2 | 1.58s | 0% |
| Setup initial conditions | 2 | 0.164s | 0% |
| Setup matrices | 2 | 0.264s | 0% |
| Solve Stokes system | 3998 | 1.46e+04s | 97% |
| Solve composition system | 2508 | 15.5s | 0.1% |
| Solve temperature system | 502 | 2.93s | 0% |
+---------------------------------+-----------+------------+------------+
Full outputs are attached as well.
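
For anyone wanting to compare two runs without reading the tables by eye, here is a minimal Python sketch that parses timing rows of the form shown above and reports how the call counts and wall times differ. The function names (`parse_timing_rows`, `compare`) are made up for this illustration, and only two rows per version are pasted in to keep it short; in practice the text would be read from the attached slurm output files.

```python
import re

# A timing row looks like: | Solve Stokes system | 3998 | 1.46e+04s | 97% |
ROW = re.compile(
    r"^\|\s*(?P<section>.+?)\s*\|\s*(?P<calls>\d+)\s*\|\s*(?P<wall>[0-9.eE+-]+)s\s*\|"
)

def parse_timing_rows(text):
    """Return {section: (calls, wall_seconds)} for every timing row in text."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group("section")] = (
                int(match.group("calls")),
                float(match.group("wall")),
            )
    return rows

def compare(old, new):
    """Yield (section, calls_old, calls_new, wall_time_ratio) for shared sections."""
    for section in sorted(old.keys() & new.keys()):
        calls_old, wall_old = old[section]
        calls_new, wall_new = new[section]
        yield section, calls_old, calls_new, wall_new / wall_old

# Two rows from each table in this post (values copied verbatim).
v25 = parse_timing_rows("""
| Build Stokes preconditioner | 580 | 21.7s | 1.7% |
| Solve Stokes system | 580 | 1.05e+03s | 81% |
""")
v26 = parse_timing_rows("""
| Build Stokes preconditioner | 3998 | 147s | 0.98% |
| Solve Stokes system | 3998 | 1.46e+04s | 97% |
""")

for section, calls_old, calls_new, ratio in compare(v25, v26):
    print(f"{section}: calls {calls_old} -> {calls_new}, wall time x{ratio:.1f}")
```

Header rows and the "Total wallclock time" row are skipped automatically because their second column is not a plain integer.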

My main question here is what’s causing this to happen? Also, is there any fix to this? I noticed that the main branch lists SUNDIALS as a requirement now so is it setup correctly? Since ARCHER2 is a Cray system we’re avoiding installing and configuring LAPACK and BLAS, could not having LAPACK and BLAS be another potential issue here?

Any help is much appreciated.

Thanks,

Luke

installing_aspect_archer2.pdf (90.9 KB)
slurm_ver.2.6.txt (2.6 MB)
slurm_ver.2.5.txt (1.7 MB)

bangerth · February 28, 2024, 9:11pm

Luke:
The difference between the two logs is that in the 2.5 version, ASPECT only ever runs a single nonlinear Stokes iteration per time step after the first time step, whereas in the 2.6 version it is on average ~8. Because the Stokes assembly+solver is the most expensive part of the overall run time, this explains the ~10x slow down.
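
As a rough cross-check of those iteration counts, the sketch below divides the number of "Solve Stokes system" calls by the number of time steps, using the figures from the timing tables in the first post. It assumes the 502 calls to "Assemble temperature system" correspond to one call per time step; that is an inference from the tables, not something stated in the logs quoted here.

```python
# Values copied from the timing tables in the first post.
TIME_STEPS = 502  # assumed: one "Assemble temperature system" call per time step

runs = {
    "2.5": {"stokes_solves": 580, "total_minutes": 21},
    "2.6-pre": {"stokes_solves": 3998, "total_minutes": 250},
}

def solves_per_step(run, steps=TIME_STEPS):
    """Average number of Stokes solves per time step."""
    return run["stokes_solves"] / steps

for name, run in runs.items():
    print(f"{name}: {solves_per_step(run):.2f} Stokes solves per time step")

slowdown = runs["2.6-pre"]["total_minutes"] / runs["2.5"]["total_minutes"]
print(f"overall slowdown: about {slowdown:.0f}x")
```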

I don’t know whether the issue with the number of nonlinear iterations is because of a difference in input file, because we fixed a bug, because we introduced a bug, because we changed the default for a parameter you do not explicitly list in the input file, or something else. But that is the starting point for where you need to look.

Best
W.

Lukel · March 1, 2024, 1:23pm

Hi,

Thanks a lot for the quick response! I figured there was a problem with the number of Stokes iterations, so thanks for the info and for confirming it.

I ran a diff on the input files I used and there’s no difference between the two models I ran, just the version of ASPECT that was used. Likewise, the only differences between the models I ran and the continental extension cookbook model were the name of the output folder and the end time.

I’ve also attached the .prm files.

I’m not sure what else I should look to try so if you have any ideas or troubleshooting tips let me know.

Thanks again,

Luke

FYI: continental_extension.prm = continental extension cookbook model
cookbook_extension_model_2 = continental extension cookbook model run until 10 Myr on ver. 2.6
cookbook_extension_model_3 = continental extension cookbook model run until 10 Myr on ver. 2.5

cookbook_extension_model_2.prm (13.5 KB)
continental_extension.prm (13.5 KB)
cookbook_extension_model_3.prm (13.5 KB)

jbnaliboff · March 1, 2024, 9:55pm

Hi Luke,

That is really strange behavior, and I’m not sure what could be causing the difference.

Here are some options for how to proceed with testing to diagnose the issue:

Modify the PRM file to have an end time of 0 and only allow for a maximum of 1 nonlinear iteration. This should run quickly and then you can see if there is any noticeable difference in the wall clock time for the Stokes solver.
Instead of the defect correction Stokes, try using just regular picard iterations for the nonlinear solver: set Nonlinear solver scheme = single Advection, iterated Stokes. This test should pinpoint if there is a difference with the defect correction Stokes solver between the two versions.
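
One possible way to generate these test inputs without editing the cookbook file by hand is sketched below in Python. It appends `set` lines to a copy of the parameter file text, which relies on the assumption that a later `set` for the same top-level parameter overrides an earlier one; please verify that against your ASPECT version. The helper name `make_variant` and the two-line stand-in for the cookbook file are hypothetical.

```python
def make_variant(base_prm_text, overrides):
    """Return base_prm_text with one 'set name = value' line appended per override."""
    lines = [base_prm_text.rstrip("\n"), "", "# Overrides added for the diagnostic run"]
    lines += [f"set {name} = {value}" for name, value in overrides.items()]
    return "\n".join(lines) + "\n"

# Test 1: a single time step with a single nonlinear iteration.
single_step = {"End time": "0", "Max nonlinear iterations": "1"}

# Test 2: Picard iterations instead of defect correction.
picard = {"Nonlinear solver scheme": "single Advection, iterated Stokes"}

# Stand-in for the contents of continental_extension.prm (hypothetical).
base = "set Dimension = 2\nset End time = 5e6\n"

print(make_variant(base, single_step))
print(make_variant(base, picard))
```

Writing each variant to its own file, with its own output directory, keeps the runs from overwriting each other.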

Do you by chance also have access to a separate computer on which you can run these and the prior tests with the same version of ASPECT, to confirm whether this finding is reproducible across different OS/machines?

Cheers,
John

Lukel · March 6, 2024, 4:06pm

Hi,

Thanks for the advice! I’ve had a bit of a play with the parameters you suggested to see if I can find out what’s causing the problem:

I first ran a model (on ver. 2.5 and 2.6) where the end time = 0 and then another model where the end time = 0 and Max nonlinear iterations = 1. The end time = 0 model had a very similar wallclock time on both 2.5 (62s total) and 2.6 (63.6s total) with an identical number of calls. Likewise, the end time = 0 and Max nonlinear iterations = 1 model had a very similar wallclock time on 2.5 (18.9s total) and 2.6 (10s total) with an identical number of calls. Attached are also the full slurm outputs for all the models I talk about in this post.
I next changed the nonlinear solver scheme from single Advection, iterated defect correction Stokes to single Advection, iterated Stokes. The iterated Stokes model took much less time to run than the iterated defect correction Stokes model, with somewhat similar runtimes on both versions (966s on 2.6 and 1025s on 2.5), and gave very similar looking outputs. So the main problem seems to be iterated defect correction Stokes, as I’ve seen no major differences with the other scheme.

Version 2.6 - Continental extension cookbook model changed to using single Advection, iterated Stokes:
+----------------------------------------------+------------+------------+
| Total wallclock time elapsed since start | 966s | |
| | | |
| Section | no. calls | wall time | percent of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system | 575 | 10.3s | 1.1% |
| Assemble composition system | 1260 | 31s | 3.2% |
| Assemble temperature system | 252 | 6.66s | 0.69% |
| Build Stokes preconditioner | 575 | 20.6s | 2.1% |
| Build composition preconditioner | 1258 | 1.05s | 0.11% |
| Build temperature preconditioner | 252 | 0.256s | 0% |
| Initialization | 1 | 1.11s | 0.11% |
| Mesh deformation | 252 | 9.15s | 0.95% |
| Mesh deformation initialize | 2 | 4.8s | 0.5% |
| Postprocessing | 251 | 71.2s | 7.4% |
| Refine mesh structure, part 1 | 1 | 0.166s | 0% |
| Refine mesh structure, part 2 | 1 | 0.134s | 0% |
| Setup dof systems | 2 | 5.69s | 0.59% |
| Setup initial conditions | 2 | 0.389s | 0% |
| Setup matrices | 2 | 0.315s | 0% |
| Solve Stokes system | 575 | 795s | 82% |
| Solve composition system | 1258 | 7.58s | 0.78% |
| Solve temperature system | 252 | 1.47s | 0.15% |
+---------------------------------+-----------+------------+------------+

Version 2.5 - Continental extension cookbook model changed to using single Advection, iterated Stokes:
+----------------------------------------------+------------+------------+
| Total wallclock time elapsed since start | 1.03e+03s | |
| | | |
| Section | no. calls | wall time | percent of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system | 548 | 9.97s | 0.97% |
| Assemble composition system | 1260 | 22.9s | 2.2% |
| Assemble temperature system | 252 | 4.88s | 0.48% |
| Build Stokes preconditioner | 548 | 18.5s | 1.8% |
| Build composition preconditioner | 1258 | 1s | 0% |
| Build temperature preconditioner | 252 | 0.265s | 0% |
| Initialization | 1 | 1.47s | 0.14% |
| Mesh deformation | 252 | 8.92s | 0.87% |
| Mesh deformation initialize | 2 | 2.49s | 0.24% |
| Postprocessing | 251 | 79.6s | 7.8% |
| Refine mesh structure, part 1 | 1 | 0.224s | 0% |
| Refine mesh structure, part 2 | 1 | 0.0688s | 0% |
| Setup dof systems | 2 | 3.11s | 0.3% |
| Setup initial conditions | 2 | 0.336s | 0% |
| Setup matrices | 2 | 0.302s | 0% |
| Solve Stokes system | 548 | 860s | 84% |
| Solve composition system | 1258 | 7.78s | 0.76% |
| Solve temperature system | 252 | 1.48s | 0.14% |
+---------------------------------+-----------+------------+------------+

Unfortunately, and it’s a bit of a pain, I don’t have a separate computer that I can readily install ASPECT on. My original post did include information on how I installed ASPECT as well as the system I installed it on, if that’s of any help.

For now I’m still working on my project as normal; I’m just avoiding single Advection, iterated defect correction Stokes while I work on adding/changing parameters. If you want me to run any more tests, or if you have any idea what’s causing ASPECT to act like this, let me know.

Thanks,

Luke

slurm-2.6_End time = 0.txt (30.4 KB)
slurm-2.5_End time = 0.txt (30.1 KB)
slurm-2.6_End time = 0 - Max nonlinear iterations = 1.txt (8.3 KB)
slurm-2.5_End time = 0 - Max nonlinear iterations = 1.txt (8.1 KB)
slurm-2.6_single Advection, iterated Stokes.txt (880.2 KB)
slurm-2.5_single Advection, iterated Stokes.txt (875.1 KB)

jbnaliboff · March 6, 2024, 5:52pm

Hi Luke,

Thanks for running these additional tests. As a side note, 128 processors is a bit too many for a problem of this size (100-200K DOF), and the models will probably run faster using 8 or 16 processors. My recollection is that for 2D you never want to go below 10K DOF per processor (or something along those lines).
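
To put numbers on that rule of thumb, here is a small Python sketch. The 10K DOF per processor threshold is the recollection quoted above rather than a documented limit, and the function name is made up for this example.

```python
MIN_DOFS_PER_PROCESS = 10_000  # rule of thumb for 2D recalled above, not a hard limit

def max_useful_processes(total_dofs, min_dofs_per_process=MIN_DOFS_PER_PROCESS):
    """Largest process count that keeps every process at or above the threshold."""
    return max(1, total_dofs // min_dofs_per_process)

for dofs in (100_000, 200_000):
    print(f"{dofs} DOF -> at most {max_useful_processes(dofs)} MPI processes")
```

For the 100-200K DOF range mentioned here this gives 10 to 20 processes, well below the 128 used for these runs.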

Interestingly, aside from minor variations I am not seeing any significant difference in the model outputs between the two versions (2.5 versus 2.6) when using iterated defect correction Stokes or iterated Stokes.

For example, the runs logged in slurm-2.6_End time = 0.txt and slurm-2.5_End time = 0.txt take the same number of nonlinear iterations.

Am I recalling correctly that it is only after the first time step that the number of nonlinear iterations diverges between v2.5 and 2.6 when using defect correction Stokes?

The iterated Stokes model took much less time to run compared to using the iterated defect correction Stokes with somewhat similar runtimes to ASPECT

If you want me to try anymore tests I can or if you have any idea what’s causing ASPECT to act like this let me know.

Is this comparing to the tests you ran in the previous post to the full end time? I confess I am a bit confused now, as the tests you just posted for a single time step (End time = 0) actually don’t show much variation between versions 2.5 and 2.6 for defect correction Stokes. Would you be willing to run the models using iterated Stokes in combination with an end time of 0 so we can do another round of comparisons?

For now I’m still working on my project as normal, I’m just avoiding using the single Advection, iterated defect correction Stokes while I work on adding/chaning parameters

I think that is a good plan for now. There is an open issue for the use of defect correction Stokes with the GMG solver, but you are using AMG here (is that correct?).

Are you by chance free next Monday from 11 am - 12 pm Pacific for the regular ASPECT user meeting? That would be a good opportunity to discuss these results live.

@MFraters - This may be of interest.

Thanks!

Cheers,
John

Lukel · March 7, 2024, 3:06pm

Hi again,

Now that I have a better understanding of what the problem is, and to avoid any confusion, I’ve created a little table outlining each test I ran, and attached all the output slurm files from the continental extension cookbook runs.

Interesting how the issue you linked didn’t have any problems with the AMG solver and single Advection, iterated defect correction Stokes as that’s the problem I seem to be having. Since I don’t have LAPACK and BLAS installed I don’t believe I can use the GMG solver so have been using the AMG solver.

Hopefully this summarises the problems I’m having a little better. I’ll also have a look at pushing this issue at the next ASPECT meeting. I assume joining is just opening the zoom link on the Regular User Meeting pinned topic?

Big thanks again,

Luke

slurm-2.5.txt (876.4 KB)
slurm-2.5_End time = 0 - Max nonlinear iterations = 1.txt (8.1 KB)
slurm-2.5_End time = 0.txt (30.1 KB)
slurm-2.5_End time = 10e6.txt (1.7 MB)
slurm-2.5_single Advection, iterated Stokes - End time = 0.txt (21.2 KB)
slurm-2.5_single Advection, iterated Stokes.txt (875.1 KB)
slurm-2.6.txt (378.8 KB)
slurm-2.6_End time = 0 - Max nonlinear iterations = 1.txt (8.3 KB)
slurm-2.6_End time = 0.txt (30.4 KB)
slurm-2.6_End time = 10e6.txt (2.6 MB)
slurm-2.6_single Advection, iterated Stokes - End time = 0.txt (21.4 KB)
slurm-2.6_single Advection, iterated Stokes.txt (880.2 KB)

jbnaliboff · March 7, 2024, 8:31pm

Hi Luke,

Thanks for the summary of all the tests and results via the table, that is incredibly helpful.

Very odd that the issue between two versions so far only occurs after the first time step with iterated defect correction Stokes.

If you have a chance, can you add a post to the aforementioned issue summarizing your findings? The underlying problems may or may not be related, but I think it makes sense to add to that issue initially.

On my end, I will try the continental extension cookbooks with the two different versions and iterated defect correction Stokes on a local computer sometime in the coming days, to see if I can reproduce the issue.

Indeed, to join just open the zoom link from that pinned topic.

Thanks again for pointing out this issue and conducting the additional tests.

Cheers,
John

jbnaliboff · March 19, 2024, 8:39pm

Hi Luke,

I finally had a chance to run a few tests, and can confirm similar differences in the iterated defect correction Stokes solver behavior between the two versions (2.5, 2.6-pre) when building/running ASPECT on a standard linux workstation. In detail, in version 2.5 only 1 nonlinear iteration is required after the first time step, while in version 2.6-pre quite a few nonlinear iterations are required (I capped it at 10).
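
For anyone who wants to reproduce this comparison from their own output, a minimal Python sketch for counting nonlinear iterations per time step is given below. The log format is an assumption, since no log lines are quoted in this thread: it expects each time step to open with a line starting `*** Timestep N:` and each nonlinear iteration to print a line containing `nonlinear iteration K`. The sample log is invented purely to exercise the function; adjust the patterns to match the real slurm output.

```python
import re

# Assumed log format (not taken from this thread; adjust to the real output).
STEP = re.compile(r"^\*\*\* Timestep (\d+):")
ITERATION = re.compile(r"nonlinear iteration (\d+)", re.IGNORECASE)

def iterations_per_step(log_text):
    """Return {time_step: number_of_nonlinear_iteration_lines}."""
    counts = {}
    current = None
    for line in log_text.splitlines():
        step = STEP.match(line.strip())
        if step:
            current = int(step.group(1))
            counts[current] = 0
        elif current is not None and ITERATION.search(line):
            counts[current] += 1
    return counts

# Hypothetical log excerpt, for illustration only.
sample = """\
*** Timestep 0:  t=0 years, dt=0 years
   Relative nonlinear residual of nonlinear iteration 1: 1
   Relative nonlinear residual of nonlinear iteration 2: 0.01
*** Timestep 1:  t=20000 years, dt=20000 years
   Relative nonlinear residual of nonlinear iteration 1: 1e-05
"""

print(iterations_per_step(sample))
```

Running this over the logs from both versions and comparing the two dictionaries should show at which time step the counts start to differ.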

Similarly, using iterated Stokes only produces very minor variations between the two versions (exact nonlinear residuals, etc)

I propose we proceed as follows:

For now, continue to use iterated Stokes instead of iterated defect correction Stokes (I recommend others do this as well for similar classes of models).
We move this discussion over to a new github issue, where I will summarize the issue and my new test results.
We come back to this forum post after discussion on the github issue.

Does this sound like a reasonable path forward?

Cheers,
John

Lukel · March 20, 2024, 10:23am

Hi John,

Thanks for running the tests!

When I first came across this issue I assumed it was going to be another problem with my ASPECT installation, so from my point of view this is good news as I don’t have to rebuild ASPECT for what feels like the 100th time trying to find out what’s wrong.

Yes, if you could update this post when the bug is fixed that would be great, thanks.

I’ve also attached a slightly updated guide to installing ASPECT on ARCHER2 if you want to add it to the ASPECT wiki.

Big thanks again for all the help, you’ve been ace,

Luke

installing_aspect_archer2.pdf (92.3 KB)

jbnaliboff · June 4, 2024, 6:22pm

@Lukel - FYI, this issue is being discussed here. To summarize, @tjhei ran git bisect to identify what pull request resulted in the change, but so far I don’t see any obvious bugs that were introduced in the identified PR. I will update this post when there is additional relevant information to report.

cgw0814 · July 3, 2024, 7:23am

Hi~

I just ran this cookbook after updating ASPECT from the repository.

I also tested it with the newest version of the Docker container.

There seems to be a problem with solver convergence.

The program is forced to stop with this error message:

The linear solver tolerance is set to 0.01.
Solving Stokes system... 0+---------------------------------------------------------
TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped
to avoid a possible deadlock.
Exception ‘ExcMessage (exception_message.str())’ on rank 0 on processing:
--------------------------------------------------------
An error occurred in line <3035> of file </home/dealii/aspect/source/utilities.cc> in function
void aspect::Utilities::throw_linear_solver_failure_exception(const string&, const string&, const std::vector<dealii::SolverControl>&, const std::exception&, MPI_Comm, const string&)
The violated condition was:
false
Additional information:
The iterative Stokes solver in
StokesMatrixFreeHandlerImplementation::solve did not converge.

    The initial residual was: 9.063574e+12
    The final residual is: 9.063574e+12
    The required residual for convergence is: 9.063574e+10
    See output-continental_extension/solver_history.txt for the full
    convergence history.

    The solver reported the following error:

    --------------------------------------------------------
    An error occurred in line <3035> of file
    </home/dealii/aspect/source/utilities.cc> in function
    void aspect::Utilities::throw_linear_solver_failure_exception(const
    string&, const string&, const std::vector<dealii::SolverControl>&,
    const std::exception&, MPI_Comm, const string&)
    The violated condition was:
    false
    Additional information:
    The iterative (bottom right) solver in
    BlockSchurGMGPreconditioner::vmult did not converge.

    The initial residual was: 1.182574e-02
    The final residual is: 2.005318e-06
    The required residual for convergence is: 1.182574e-08

    The solver reported the following error:

    --------------------------------------------------------
    An error occurred in line <1337> of file
    </usr/include/deal.II/lac/solver_cg.h> in function
    void dealii::SolverCG<VectorType>::solve(const MatrixType&,
    VectorType&, const VectorType&, const PreconditionerType&) [with
    MatrixType = aspect::MatrixFreeStokesOperators::MassMatrixOperator<2,
    1, double>; PreconditionerType = dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double>,
    dealii::MGTransferMatrixFree<2, double> >; VectorType =
    dealii::LinearAlgebra::distributed::Vector<double>]
    The violated condition was:
    solver_state == SolverControl::success
    Additional information:
    Iterative method reported convergence failure in step 100. The
    residual in the last step was 2.00532e-06.

    This error message can indicate that you have simply not allowed a
    sufficiently large number of iterations for your iterative solver to
    converge. This often happens when you increase the size of your
    problem. In such cases, the last residual will likely still be very
    small, and you can make the error go away by increasing the allowed
    number of iterations when setting up the SolverControl object that
    determines the maximal number of iterations you allow.

    The other situation where this error may occur is when your matrix is
    not invertible (e.g., your matrix has a null-space), or if you try to
    apply the wrong solver to a matrix (e.g., using CG for a matrix that
    is not symmetric or not positive definite). In these cases, the
    residual in the last iteration is likely going to be large.

    Stacktrace:
    -----------
    #0  aspect-release: void
    dealii::SolverCG<dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>
    >::solve<aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    >(aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1, double>
    const&, dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>&,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host> const&, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    const&)
    #1  aspect-release:
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    >::vmult(dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&) const
    #2  aspect-release: void
    dealii::SolverFGMRES<dealii::LinearAlgebra::distributed::BlockVector<double>
    >::solve<aspect::MatrixFreeStokesOperators::StokesOperator<2, 2,
    double>,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > >(aspect::MatrixFreeStokesOperators::StokesOperator<2, 2, double>
    const&, dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > const&)
    #3  aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2,
    2>::solve()
    #4  aspect-release: aspect::Simulator<2>::solve_stokes()
    #5  aspect-release:
    aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&,
    bool)
    #6  aspect-release:
    aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
    #7  aspect-release: aspect::Simulator<2>::solve_timestep()
    #8  aspect-release: aspect::Simulator<2>::run()
    #9  aspect-release: void
    run_simulator<2>(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, bool, bool, bool, bool)
    #10  aspect-release: main
    --------------------------------------------------------


    Stacktrace:
    -----------
    #0  aspect-release:
    aspect::Utilities::throw_linear_solver_failure_exception(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, std::vector<dealii::SolverControl,
    std::allocator<dealii::SolverControl> > const&, std::exception const&,
    ompi_communicator_t*, std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&)
    #1  aspect-release:
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    >::vmult(dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&) const
    #2  aspect-release: void
    dealii::SolverFGMRES<dealii::LinearAlgebra::distributed::BlockVector<double>
    >::solve<aspect::MatrixFreeStokesOperators::StokesOperator<2, 2,
    double>,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > >(aspect::MatrixFreeStokesOperators::StokesOperator<2, 2, double>
    const&, dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > const&)
    #3  aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2,
    2>::solve()
    #4  aspect-release: aspect::Simulator<2>::solve_stokes()
    #5  aspect-release:
    aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&,
    bool)
    #6  aspect-release:
    aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
    #7  aspect-release: aspect::Simulator<2>::solve_timestep()
    #8  aspect-release: aspect::Simulator<2>::run()
    #9  aspect-release: void
    run_simulator<2>(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, bool, bool, bool, bool)
    #10  aspect-release: main
    --------------------------------------------------------


Stacktrace:
-----------
#0  aspect-release: aspect::Utilities::throw_linear_solver_failure_exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<dealii::SolverControl, std::allocator<dealii::SolverControl> > const&, std::exception const&, ompi_communicator_t*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#1  aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2, 2>::solve()
#2  aspect-release: aspect::Simulator<2>::solve_stokes()
#3  aspect-release: aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&, bool)
#4  aspect-release: aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
#5  aspect-release: aspect::Simulator<2>::solve_timestep()
#6  aspect-release: aspect::Simulator<2>::run()
#7  aspect-release: void run_simulator<2>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, bool, bool)
#8  aspect-release: main
--------------------------------------------------------

Aborting!
----------------------------------------------------
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
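
As an aid to reading the log above, the residual figures can be checked with a few lines of Python. This is only arithmetic on the numbers printed in the error message: the outer tolerance of 0.01 is stated in the log, while the inner relative tolerance of 1e-6 is inferred here from the ratio of the required to the initial residual.

```python
def required_residual(initial, relative_tolerance):
    """Residual the solver must reach: initial residual times relative tolerance."""
    return initial * relative_tolerance

def converged(final, initial, relative_tolerance):
    return final <= required_residual(initial, relative_tolerance)

# Outer Stokes solve: tolerance 0.01, and the residual did not decrease at all.
outer = {"initial": 9.063574e12, "final": 9.063574e12, "tol": 1e-2}

# Inner CG solve on the bottom right block: stopped after 100 steps.
# Required 1.182574e-08 from initial 1.182574e-02 implies a relative tolerance of 1e-6.
inner = {"initial": 1.182574e-02, "final": 2.005318e-06, "tol": 1e-6}

for name, r in (("outer", outer), ("inner", inner)):
    target = required_residual(r["initial"], r["tol"])
    reduction = r["final"] / r["initial"]
    print(f"{name}: target {target:.6e}, reduction {reduction:.2e}, "
          f"converged: {converged(r['final'], r['initial'], r['tol'])}")
```

The inner solve had reduced its residual by roughly four orders of magnitude when it hit the 100 step limit, so it was still short of its target, and the outer solve made no progress at all.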

Is there any solution?

Regards

Grace

cgw0814 · July 3, 2024, 8:09am

Hi~all,
I just noticed that the subsection Solver parameters has been updated.
So I modified the subsection Solver parameters to follow crustal_deformation_2D.prm, as shown below:

subsection Solver parameters
  subsection Stokes solver parameters
    set Stokes solver type = block AMG
    set Number of cheap Stokes solver steps = 0
    set Maximum number of expensive Stokes solver steps = 5000
    set GMRES solver restart length = 200
  end

  subsection Newton solver parameters
    set Maximum linear Stokes solver tolerance   = 1e-2
    set Use Eisenstat Walker method for Picard iterations = true
  end
end

This cookbook can run normally now.

I’m sorry to disturb you

Regards

Grace
