Hi,
Recently I’ve been working with the ARCHER2 help desk to install ASPECT on the ARCHER2 supercomputer (https://www.archer2.ac.uk/), but I’m having a problem with the latest version (ver. 2.6-pre, f75d95afe).
First, I tried installing ASPECT 2.5 (bafd9df3e) from the aspect-2.5 branch of geodynamics/aspect on GitHub. Long story short, after a lot of back and forth with the help desk I managed to get this version of ASPECT working, yay. Attached are the working instructions for installing ASPECT (ver. 2.5, bafd9df3e) on ARCHER2 (without LAPACK or BLAS).
Subsequently, I tried updating to the main branch (2.6-pre, f75d95afe when I last tried), and after finding out how to install SUNDIALS, the installation went through with no error messages (the installation instructions I used are again attached). However, when I started running a handful of cookbook models, it became noticeable that they were taking significantly longer than they should. To test what was wrong, I ran the continental extension cookbook model until 10 Myr on both the working 2.5 build and the newly installed 2.6-pre build. Overall, both models look relatively similar (images below), so nothing drastically unexpected happened, but ver. 2.5 took ~21 minutes while 2.6-pre took ~250 minutes on one node (128 cores). Looking at the timing summaries at the end of each run made it apparent that something wasn’t right (the key differences between the two models are in the Stokes-related rows):
Using ASPECT 2.5:
+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |   1.3e+03s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system          |       502 |      9.24s |      0.71% |
| Assemble Stokes system Picard   |       580 |      10.6s |      0.82% |
| Assemble Stokes system rhs      |        78 |      1.23s |         0% |
| Assemble composition system     |      2510 |      46.9s |       3.6% |
| Assemble temperature system     |       502 |      12.4s |      0.95% |
| Build Stokes preconditioner     |       580 |      21.7s |       1.7% |
| Build composition preconditioner|      2508 |      2.12s |      0.16% |
| Build temperature preconditioner|       502 |     0.436s |         0% |
| Initialization                  |         1 |       1.1s |         0% |
| Mesh deformation                |       502 |      17.1s |       1.3% |
| Mesh deformation initialize     |         2 |      1.58s |      0.12% |
| Postprocessing                  |       501 |      35.8s |       2.8% |
| Refine mesh structure, part 1   |         1 |    0.0429s |         0% |
| Refine mesh structure, part 2   |         1 |     0.056s |         0% |
| Setup dof systems               |         2 |      1.98s |      0.15% |
| Setup initial conditions        |         2 |     0.294s |         0% |
| Setup matrices                  |       502 |      64.5s |         5% |
| Solve Stokes system             |       580 |  1.05e+03s |        81% |
| Solve composition system        |      2508 |      15.5s |       1.2% |
| Solve temperature system        |       502 |      2.99s |      0.23% |
+---------------------------------+-----------+------------+------------+
Using ASPECT 2.6-pre:
+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |  1.51e+04s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system          |       502 |      8.91s |         0% |
| Assemble Stokes system Picard   |      3998 |      71.9s |      0.48% |
| Assemble Stokes system rhs      |      3496 |      59.1s |      0.39% |
| Assemble composition system     |      2510 |      61.2s |      0.41% |
| Assemble temperature system     |       502 |      12.5s |         0% |
| Build Stokes preconditioner     |      3998 |       147s |      0.98% |
| Build composition preconditioner|      2508 |      2.03s |         0% |
| Build temperature preconditioner|       502 |     0.411s |         0% |
| Initialization                  |         1 |      1.07s |         0% |
| Mesh deformation                |       502 |      17.8s |      0.12% |
| Mesh deformation initialize     |         2 |      1.21s |         0% |
| Postprocessing                  |       501 |      43.5s |      0.29% |
| Refine mesh structure, part 1   |         1 |    0.0369s |         0% |
| Refine mesh structure, part 2   |         1 |    0.0441s |         0% |
| Setup dof systems               |         2 |      1.58s |         0% |
| Setup initial conditions        |         2 |     0.164s |         0% |
| Setup matrices                  |         2 |     0.264s |         0% |
| Solve Stokes system             |      3998 |  1.46e+04s |        97% |
| Solve composition system        |      2508 |      15.5s |       0.1% |
| Solve temperature system        |       502 |      2.93s |         0% |
+---------------------------------+-----------+------------+------------+
Full outputs are attached as well.
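To put rough numbers on the difference, here is a minimal Python sketch that reads rows in the timing-table format above and works out how many Stokes solves happen per time step and how long each solve takes on average. It is not part of ASPECT: the helper names `parse_timing` and `stokes_summary` are made up for illustration, and it assumes that "Assemble temperature system" is called once per time step, which I have not verified. The wall times in the tables are rounded, so the results are approximate.

```python
import re

# Matches one data row of the timing summary, for example:
# "| Solve Stokes system | 580 | 1.05e+03s | 81% |"
# Header and border lines do not match and are skipped.
ROW = re.compile(
    r"^\|\s*(.+?)\s*\|\s*(\d+)\s*\|\s*([0-9.eE+-]+)s\s*\|\s*[0-9.eE+-]+%\s*\|\s*$"
)

def parse_timing(text):
    """Return {section name: (number of calls, wall time in seconds)}."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group(1)] = (int(match.group(2)), float(match.group(3)))
    return rows

def stokes_summary(rows):
    """Return (Stokes solves per time step, mean seconds per Stokes solve)."""
    calls, wall = rows["Solve Stokes system"]
    # Assumption: the temperature system is assembled once per time step.
    steps, _ = rows["Assemble temperature system"]
    return calls / steps, wall / calls

# Rows copied from the two tables above.
v25 = parse_timing("""
| Assemble temperature system | 502 | 12.4s | 0.95% |
| Solve Stokes system | 580 | 1.05e+03s | 81% |
""")
v26 = parse_timing("""
| Assemble temperature system | 502 | 12.5s | 0% |
| Solve Stokes system | 3998 | 1.46e+04s | 97% |
""")

for label, rows in (("2.5", v25), ("2.6-pre", v26)):
    per_step, per_solve = stokes_summary(rows)
    print(f"ASPECT {label}: {per_step:.2f} Stokes solves per step, "
          f"{per_solve:.2f} s per solve")
```

Under that assumption, 2.5 does about 1.16 Stokes solves per time step at about 1.81 s each, while 2.6-pre does about 7.96 per time step at about 3.65 s each. So the slowdown seems to come from both many more nonlinear iterations per step and slower individual solves.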
My main question is: what is causing this, and is there a fix? I noticed that the main branch now lists SUNDIALS as a requirement, so could it be that it isn’t set up correctly? Since ARCHER2 is a Cray system, we are avoiding installing and configuring LAPACK and BLAS; could the lack of LAPACK and BLAS be another potential issue here?
Any help is much appreciated.
Thanks,
Luke
installing_aspect_archer2.pdf (90.9 KB)
slurm_ver.2.6.txt (2.6 MB)
slurm_ver.2.5.txt (1.7 MB)