
Parallel HDF5: Try fixing strict collective requirements of HDF5 >= 2.0 #1862

Draft
franzpoeschel wants to merge 98 commits into openPMD:dev from franzpoeschel:fix-parallel-hdf5

Conversation

@franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented Mar 11, 2026

It seems that HDF5 has become quite a bit pickier about metadata definitions in parallel setups with versions 2.0 and 2.1, leading to hangs.
Earlier, it was enough to define metadata consistently across ranks; now we apparently have to keep the exact same order of operations on every rank.

This is bad for the Span API which runs internal flushes for structure setup.

  • No longer flush inside the Span API; instead, flush already at resetDataset()
  • Check whether we can merely enqueue operations and run them later; that should be fine for the ordering
  • Ensure that only a select type of operation runs in structural-setup flushes
  • Remove the flushParticlesPath and flushMeshesPath functions; they unnecessarily leaked attribute flushes into the structure setup
  • Fix tests...
  • Manually go through the tests and check whether any of them silently swallows errors; the chance is high
  • Guard setDirty calls
  • Some commits deactivate a handful of tests; deal with them
  • Decide how to go forward with the changes inside resetDataset(). Best idea: add the new logic to a new API call, e.g. commitStructuralSetup() or so.
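The enqueue-and-replay idea from the checklist can be sketched as a deferred operation queue: structural setup operations are only recorded when requested and are executed later, in insertion order, so that every MPI rank replays the identical sequence of collective HDF5 calls. This is a minimal, hypothetical Python sketch (the class and method names, including commit_structural_setup, are illustrative assumptions, not openPMD-api's actual internals):

```python
from collections import deque
from typing import Any, Callable, List


class StructuralSetupQueue:
    """Hypothetical sketch of deferred structural setup.

    Operations (group creation, dataset definition, ...) are enqueued
    instead of being executed immediately. Replaying the queue in
    insertion order on every rank keeps the order of collective
    metadata operations identical across ranks, which HDF5 >= 2.0
    appears to require.
    """

    def __init__(self) -> None:
        self._ops: "deque[Callable[[], Any]]" = deque()

    def enqueue(self, op: Callable[[], Any]) -> None:
        """Record an operation without running it yet."""
        self._ops.append(op)

    def commit_structural_setup(self) -> List[Any]:
        """Run all queued operations in insertion order and drain
        the queue; mirrors the proposed commitStructuralSetup() call."""
        results: List[Any] = []
        while self._ops:
            results.append(self._ops.popleft()())
        return results


if __name__ == "__main__":
    queue = StructuralSetupQueue()
    # Each rank would enqueue the same operations in the same order.
    queue.enqueue(lambda: "create group /data/0/meshes")
    queue.enqueue(lambda: "define dataset E/x")
    print(queue.commit_structural_setup())
```

In a real implementation, the queued callables would wrap the actual (collective) HDF5 metadata calls, and only a restricted set of operation types would be admitted into structural-setup flushes.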

