{2025.06-001}[2025b] netCDF 4.9.3 (LAMMPS dependency)#72

Merged
bedroge merged 13 commits into EESSI:main from julianmorillo:netCDF
Mar 13, 2026

@julianmorillo
Contributor

No description provided.

@julianmorillo
Contributor Author

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Feb 27, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.02/pr_72/300277

date job status comment
Feb 27 16:43:08 UTC 2026 submitted job id 300277 awaits release by job manager
Feb 27 16:44:04 UTC 2026 released job awaits launch by Slurm scheduler
Feb 27 16:45:13 UTC 2026 running job 300277 is running
Feb 27 17:03:17 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-300277.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1772211733.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
no module files in tarball
software under 2025.06-001/software/linux/riscv64/generic/software
no software packages in tarball
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Feb 27 17:03:17 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job300277.test does not exist in job directory, or parsing it failed.
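
The checklist above amounts to searching the job output for a handful of patterns. A rough Python sketch of that idea follows. The bot's real implementation is not shown in this thread, so the function name check_job_output and the exact regular expressions are assumptions derived from the checklist text.

```python
import re

# (pattern, must_be_present) pairs mirroring the checklist in the bot report.
# These regular expressions are assumptions, not the bot's actual configuration.
CHECKS = [
    (r"FATAL:", False),
    (r"ERROR:", False),
    (r"FAILED:", False),
    (r"required modules missing:", False),
    (r"No missing installations", True),
    (r"\.tar\..* created!", True),
]


def check_job_output(text):
    """Return (overall_ok, [(pattern, passed), ...]) for a job output string."""
    results = []
    for pattern, must_be_present in CHECKS:
        found = re.search(pattern, text) is not None
        # A check passes when the pattern's presence matches the expectation.
        results.append((pattern, found == must_be_present))
    return all(passed for _, passed in results), results


if __name__ == "__main__":
    sample = "ERROR: build failed\n/tmp/example.tar.zst created!\n"
    ok, details = check_job_output(sample)
    print("overall:", "SUCCESS" if ok else "FAILURE")
    for pattern, passed in details:
        print("pass" if passed else "fail", pattern)
```

Note that a check such as ".tar.* created!" only says that a tarball was written. As the report above shows, that check can pass even when the tarball has 0 entries.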

@julianmorillo
Contributor Author

This is the error in the logs:

[ 22%] Building C object libdap2/CMakeFiles/dap2.dir/getvara.c.o
cd /tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/easybuild_obj/libdap2 && /cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/OpenMPI/5.0.8-GCC-14.3.0/bin/mpicc --sysroot=/cvmfs/software.eessi.io/versions/2025.06/compat/linux/riscv64 -DHAVE_CONFIG_H -I/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/easybuild_obj/include -I/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/netcdf-c-4.9.3/include -I/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/netcdf-c-4.9.3/oc2 -I/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/netcdf-c-4.9.3/libsrc -I/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/easybuild_obj -O2 -ftree-vectorize -march=rv64gc -mabi=lp64d -fno-math-errno -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DNDEBUG -fPIC -DCURL_STATICLIB=1 -MD -MT libdap2/CMakeFiles/dap2.dir/getvara.c.o -MF CMakeFiles/dap2.dir/getvara.c.o.d -o CMakeFiles/dap2.dir/getvara.c.o -c /tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/netcdf-c-4.9.3/libdap2/getvara.c
In file included from /tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/netcdf-c-4.9.3/libncxml/ncxml_xml2.c:6:
/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/libxml2/2.14.3-GCCcore-14.3.0/include/libxml2/libxml/parser.h:92:27: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘XML_DEPRECATED_MEMBER’
   92 |     const char *directory XML_DEPRECATED_MEMBER;
      |                           ^~~~~~~~~~~~~~~~~~~~~
/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/libxml2/2.14.3-GCCcore-14.3.0/include/libxml2/libxml/parser.h:250:25: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘XML_DEPRECATED_MEMBER’
  250 |     int replaceEntities XML_DEPRECATED_MEMBER;
      |                         ^~~~~~~~~~~~~~~~~~~~~
/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/libxml2/2.14.3-GCCcore-14.3.0/include/libxml2/libxml/parser.h:1483:42: error: unknown type name ‘xmlCharEncConvImpl’; did you mean ‘xmlCharEncoding’?
 1483 |                                          xmlCharEncConvImpl impl,
      |                                          ^~~~~~~~~~~~~~~~~~
      |                                          xmlCharEncoding
make[2]: *** [libncxml/CMakeFiles/ncxml.dir/build.make:82: libncxml/CMakeFiles/ncxml.dir/ncxml_xml2.c.o] Error 1
make[2]: Leaving directory '/tmp/eessibot/easybuild/build/netCDF/4.9.3/gompi-2025b/easybuild_obj'
make[1]: *** [CMakeFiles/Makefile2:2898: libncxml/CMakeFiles/ncxml.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

@julianmorillo
Contributor Author

julianmorillo commented Mar 2, 2026

libxml2 was added as an explicit dependency to netCDF here: easybuilders/easybuild-easyconfigs@3f74f3b

However, the netCDF easyblock does not take it into account here: https://github.com/easybuilders/easybuild-easyblocks/blob/c0b3a8983af664f7226eaa9e503840f6b8c3bbf2/easybuild/easyblocks/n/netcdf.py#L71

I think this is causing trouble because the build of netCDF is mixing two libxml2 installations (versions):

  • The one provided by the EESSI compat layer (2.13.8) in this case
  • And the one provided by EasyBuild (2.14.3)
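
One way to check for this kind of mix-up is to look for more than one copy of libxml/xmlversion.h on the include search path. The following is a minimal Python sketch, not part of EasyBuild or EESSI. The function name find_libxml2_headers and the directories passed to it are hypothetical. It relies only on the LIBXML_DOTTED_VERSION macro that libxml2 defines in xmlversion.h.

```python
import re
import tempfile
from pathlib import Path

# libxml2 records its version in xmlversion.h as a quoted string macro.
VERSION_RE = re.compile(r'#define\s+LIBXML_DOTTED_VERSION\s+"([^"]+)"')


def find_libxml2_headers(include_dirs):
    """Map each include dir that provides libxml/xmlversion.h to its version."""
    found = {}
    for inc in include_dirs:
        header = Path(inc) / "libxml" / "xmlversion.h"
        if header.is_file():
            match = VERSION_RE.search(header.read_text(errors="replace"))
            found[str(inc)] = match.group(1) if match else "unknown"
    return found


if __name__ == "__main__":
    # Demo with two fake header trees; real include dirs would go here instead.
    with tempfile.TemporaryDirectory() as tmp:
        for name, version in (("compat", "2.13.8"), ("easybuild", "2.14.3")):
            libxml_dir = Path(tmp) / name / "libxml"
            libxml_dir.mkdir(parents=True)
            (libxml_dir / "xmlversion.h").write_text(
                f'#define LIBXML_DOTTED_VERSION "{version}"\n'
            )
        hits = find_libxml2_headers([Path(tmp) / "compat", Path(tmp) / "easybuild"])
        for directory, version in hits.items():
            print(version, directory)
        if len(set(hits.values())) > 1:
            print("warning: more than one libxml2 version is visible")
```

If more than one version is reported for the directories the compiler actually searches, headers from different libxml2 installations may be combined in a single compilation, which would be consistent with the errors in parser.h shown above.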

@julianmorillo
Contributor Author

@bedroge , what do you think?

@bedroge
Contributor

bedroge commented Mar 3, 2026

> @bedroge , what do you think?

I think your explanation makes sense. I only wondered why we hadn't seen this with software.eessi.io, but apparently we haven't installed netCDF yet for 2025a/b. Let me try that and see if I can reproduce the issue there, as it makes the debugging a bit easier.

Comment thread easystacks/riscv/2025.06-001/eessi-2025.06-eb-5.2.1-2025b.yml Outdated
@bedroge
Contributor

bedroge commented Mar 3, 2026

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Mar 3, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.03/pr_72/300382

date job status comment
Mar 03 13:44:32 UTC 2026 submitted job id 300382 awaits release by job manager
Mar 03 13:45:37 UTC 2026 released job awaits launch by Slurm scheduler
Mar 03 13:46:43 UTC 2026 running job 300382 is running
Mar 03 14:49:51 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-300382.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1772549298.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
no module files in tarball
software under 2025.06-001/software/linux/riscv64/generic/software
no software packages in tarball
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Mar 03 14:49:51 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job300382.test does not exist in job directory, or parsing it failed.

@bedroge
Contributor

bedroge commented Mar 3, 2026

The build passes, but the tests are failing due to this again:

[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 1056
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 1231
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 1353
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 2405
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 2460
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 2476
[premier-2:137003] PMIX ERROR: PMIX_ERROR in file server/pmix_server.c at line 4698
--------------------------------------------------------------------------
The gds/shmem2 component attempted to attach to a shared-memory segment at a
particular base address, but was given a different one. Your job will now likely
abort.

  Requested Address: 0x3ffd90000000
  Acquired Address:  0x3afdc1d000

If this problem persists, please consider disabling the gds/shmem2 component by
setting in your environment the following: PMIX_MCA_gds=hash
--------------------------------------------------------------------------

[premier-2:137006] PMIX ERROR: PMIX_ERR_OUT_OF_RESOURCE in file client/pmix_client.c at line 278
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and MPI will try to terminate your MPI job as well)
[premier-2:137006] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
PMIx_Init failed for the following reason:

  PMIX_ERROR

Open MPI requires access to a local PMIx server to execute. Please ensure
that either you are operating in a PMIx-enabled environment, or use "mpirun"
to execute the job.
--------------------------------------------------------------------------

That seems to happen for all build jobs on premier that do something with MPI...

@julianmorillo
Contributor Author

The admins suggested that I add:

export PMIX_MCA_gds=hash
export SLURM_MPI_TYPE="pmix"

So I am adding these exports to the site_config.sh script.

@julianmorillo
Contributor Author

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Mar 3, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.03/pr_72/300401

date job status comment
Mar 03 15:52:09 UTC 2026 submitted job id 300401 awaits release by job manager
Mar 03 15:52:53 UTC 2026 released job awaits launch by Slurm scheduler
Mar 03 15:53:59 UTC 2026 running job 300401 is running
Mar 03 16:57:12 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-300401.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1772556914.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
no module files in tarball
software under 2025.06-001/software/linux/riscv64/generic/software
no software packages in tarball
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Mar 03 16:57:12 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job300401.test does not exist in job directory, or parsing it failed.

@julianmorillo
Contributor Author

julianmorillo commented Mar 3, 2026

I tested this with a simple MPI hello-world test case, and export PMIX_MCA_gds=hash solved the issue:

jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ export MPICC=/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/OpenMPI/5.0.8-GCC-14.3.0/bin/mpicc
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ make
/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/OpenMPI/5.0.8-GCC-14.3.0/bin/mpicc -o mpi_hello_world mpi_hello_world.c
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ ls
makefile  mpi_hello_world  mpi_hello_world.c
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ export MPIRUN=/cvmfs/dev.eessi.io/riscv/versions/2025.06-001/software/linux/riscv64/generic/software/OpenMPI/5.0.8-GCC-14.3.0/bin/mpirun
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ $MPIRUN -n 4 ./mpi_hello_world
[premier-2:219870] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 1056
[premier-2:219870] PMIX ERROR: PMIX_ERROR in file gds_shmem2.c at line 1231
...
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ export PMIX_MCA_gds=hash
jmorillo@premier-2:~/mpitutorial/tutorials/mpi-hello-world/code$ $MPIRUN -n 4 ./mpi_hello_world
Hello world from processor premier-2, rank 3 out of 4 processors
Hello world from processor premier-2, rank 0 out of 4 processors
Hello world from processor premier-2, rank 2 out of 4 processors
Hello world from processor premier-2, rank 1 out of 4 processors

@bedroge
Contributor

bedroge commented Mar 3, 2026

Ah, great!

@bedroge
Contributor

bedroge commented Mar 3, 2026

It looks like the sourcing of that site_config script may not work. I think it's related to running in the dev repo. I'll look into it.

@bedroge
Contributor

bedroge commented Mar 4, 2026

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Mar 4, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.03/pr_72/300439

date job status comment
Mar 04 08:07:11 UTC 2026 submitted job id 300439 awaits release by job manager
Mar 04 08:08:12 UTC 2026 released job awaits launch by Slurm scheduler
Mar 04 08:09:17 UTC 2026 running job 300439 is running
Mar 04 16:12:55 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-300439.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1772615461.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
no module files in tarball
software under 2025.06-001/software/linux/riscv64/generic/software
no software packages in tarball
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Mar 04 16:12:55 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job300439.test does not exist in job directory, or parsing it failed.

@bedroge
Contributor

bedroge commented Mar 12, 2026

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Mar 12, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.03/pr_72/301224

date job status comment
Mar 12 09:54:32 UTC 2026 submitted job id 301224 awaits release by job manager
Mar 12 09:55:34 UTC 2026 released job awaits launch by Slurm scheduler
Mar 12 09:57:06 UTC 2026 running job 301224 is running
Mar 12 11:01:57 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-301224.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1773313203.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
no module files in tarball
software under 2025.06-001/software/linux/riscv64/generic/software
no software packages in tarball
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Mar 12 11:01:57 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job301224.test does not exist in job directory, or parsing it failed.

@julianmorillo
Contributor Author

@bedroge , 3 tests are failing with the same PMIX error as before. We need the sourcing of site_config.sh to work (as it includes the export PMIX_MCA_gds=hash that should prevent the errors). You said it may be related to running from the dev repo, but I was not able to find out what the exact problem is...

@bedroge
Contributor

bedroge commented Mar 12, 2026

Oh, right, I forgot about that.

The site config script would not work for this, unfortunately, as the build script sort of wipes the environment again. An alternative would be using Lmod hooks for this, but I couldn't immediately get that to work, as it's looking in a non-existing location in the dev repo. I can try to come up with a fix for that by making a symlink.
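
To illustrate why exports from a sourced script can fail to reach the build: a variable set in a parent process is only visible to a child process if the child inherits that environment. The Python sketch below is an illustration only. How the EESSI build script actually resets its environment is not shown in this thread, and the set of variables kept in the "wiped" case (KEEP) is an assumption.

```python
import os
import subprocess
import sys

# Child program: report whether PMIX_MCA_gds is visible in its environment.
CHILD = "import os; print(os.environ.get('PMIX_MCA_gds', 'unset'))"

# Variables kept when simulating a reset environment (an assumption for this demo).
KEEP = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def child_sees(env):
    """Run a child Python process with the given environment; return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def compare():
    """Return what the child sees with an inherited and with a reset environment."""
    # As if site_config.sh had been sourced in the parent.
    parent_env = dict(os.environ, PMIX_MCA_gds="hash")
    inherited = child_sees(parent_env)
    # Stand-in for a script that rebuilds its environment from a short allow-list.
    wiped_env = {k: os.environ[k] for k in KEEP if k in os.environ}
    wiped = child_sees(wiped_env)
    return inherited, wiped


if __name__ == "__main__":
    inherited, wiped = compare()
    print("inherited environment:", inherited)
    print("reset environment:", wiped)
```

Setting the variable at a later stage, after the environment has been reset, avoids this problem, which is the motivation for the Lmod hook approach mentioned above.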

@bedroge
Contributor

bedroge commented Mar 12, 2026

EESSI/filesystem-layer#267 should fix this and allow us to use site-specific Lmod hooks for dev.eessi.io.

@bedroge
Contributor

bedroge commented Mar 12, 2026

bot: build repo:dev.eessi.io-riscv-2025.06-001 instance:eessi-bot-riscv for:arch=riscv64/generic

@riscv-eessi-io-bot

riscv-eessi-io-bot bot commented Mar 12, 2026

New job on instance eessi-bot-riscv for repository dev.eessi.io-riscv-2025.06-001
Building on: generic
Building for: riscv64/generic
Job dir: /home/eessibot/shared/jobs/2026.03/pr_72/301287

date job status comment
Mar 12 15:23:52 UTC 2026 submitted job id 301287 awaits release by job manager
Mar 12 15:24:42 UTC 2026 released job awaits launch by Slurm scheduler
Mar 12 16:59:27 UTC 2026 running job 301287 is running
Mar 12 19:14:48 UTC 2026 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-301287.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-riscv64-generic-riscv-1773342768.tar.zst
size: 2 MiB (2475475 bytes)
entries: 57
modules under 2025.06-001/software/linux/riscv64/generic/modules/all
netCDF/4.9.3-gompi-2025b.lua
software under 2025.06-001/software/linux/riscv64/generic/software
netCDF/4.9.3-gompi-2025b
reprod directories under 2025.06-001/software/linux/riscv64/generic/reprod
no reprod directories in tarball
other under 2025.06-001/software/linux/riscv64/generic
no other files in tarball
Mar 12 19:14:48 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job301287.test does not exist in job directory, or parsing it failed.
Mar 13 08:04:03 UTC 2026 uploaded transfer of eessi-2025.06-software-linux-riscv64-generic-riscv-1773342768.tar.zst to S3 bucket succeeded

@bedroge bedroge merged commit cbc5252 into EESSI:main Mar 13, 2026
@julianmorillo julianmorillo deleted the netCDF branch March 13, 2026 09:50