Fix for parallel reproducibility issue with 3d_smagorinsky #1445
Open
abishekg7 wants to merge 2 commits into MPAS-Dev:release-v8.4.0 from
Conversation
mgduda
requested changes
Apr 15, 2026
…ity fields This commit introduces a new halo exchange group in mpas_atm_halos in order to perform halo exchanges for the two eddy viscosity fields relevant to LES model runs. This group is added to both the mpas_dmpar and mpas_halo halo exchange methods in mpas_atm_halos. The halo exchanges of eddy_visc_horz and eddy_visc_vert appear to fix the parallel reproducibility issues found in some LES runs when using the 3d_smagorinsky model option. The halo exchange itself is added in the subsequent commit.
…_tend This commit introduces a new halo exchange for the two eddy viscosity fields relevant to LES runs, eddy_visc_horz and eddy_visc_vert, immediately after the call to les_models in atm_compute_dyn_tend. This change also requires the addition of new arguments to the atm_compute_dyn_tend and atm_compute_dyn_tend_work subroutines. This change is required to fix parallel reproducibility issues observed in LES model runs when config_les_model = 3d_smagorinsky, affecting both CPU and GPU runs. By trial and error, it was observed that performing halo exchanges of the two fields seemed to fix this error. The exact source of this issue is not yet understood and would need to be revisited.
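As a rough illustration of how a missing halo exchange can make results depend on the number of MPI tasks, here is a minimal toy sketch in Python. It is not MPAS code and is not a diagnosis of this bug: it assumes a 1-D periodic domain, a made-up diagnostic (`diagnose`) standing in for the eddy viscosity, and stale halo values of 0.0. All function names are hypothetical.

```python
N = 8  # number of cells in the toy global domain (assumption)

def diagnose(s):
    """Made-up stand-in for a diagnosed field such as an eddy viscosity."""
    return 0.5 * s * s

def serial_tend(state):
    """Reference result: diagnose everywhere, then apply a neighbor stencil."""
    visc = [diagnose(s) for s in state]
    return [visc[(i + 1) % N] - visc[(i - 1) % N] for i in range(N)]

def decomposed_tend(state, nranks, exchange_halo):
    """Same computation split over nranks blocks, each with one halo cell per side."""
    size = N // nranks
    owned = [range(r * size, (r + 1) * size) for r in range(nranks)]

    # Each rank diagnoses the field on its owned cells only.
    local = []
    for r in range(nranks):
        visc = {i: diagnose(state[i]) for i in owned[r]}
        for h in ((owned[r][0] - 1) % N, (owned[r][-1] + 1) % N):
            visc.setdefault(h, 0.0)  # halo cell holds stale data (assumed 0.0)
        local.append(visc)

    # Halo exchange: copy each halo value from the rank that owns that cell.
    if exchange_halo:
        for r in range(nranks):
            for h in local[r]:
                if h not in owned[r]:
                    local[r][h] = local[h // size][h]

    # The stencil on owned cells reads neighbor values, including halo cells.
    tend = [0.0] * N
    for r in range(nranks):
        for i in owned[r]:
            tend[i] = local[r][(i + 1) % N] - local[r][(i - 1) % N]
    return tend

state = [float(i + 1) for i in range(N)]
with_exchange = [decomposed_tend(state, n, True) for n in (2, 4)]
without_exchange = [decomposed_tend(state, n, False) for n in (2, 4)]
```

In this toy setup, the results with the exchange match the serial reference for both decompositions, while the results without it differ between 2 and 4 blocks because the stencil reads stale halo values at different cells.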
405ae23 to
a6bc145
This PR fixes parallel reproducibility issues (for example, between runs with 32 and 64 MPI tasks) when `config_les_model = 3d_smagorinsky`. The issue affects both CPU and GPU runs. It does not seem to affect parallel reproducibility when `config_les_model = prognostic_1.5_order`.

This PR first introduces a new halo exchange group in `mpas_atm_halos` in order to perform halo exchanges for the two eddy viscosity fields relevant to LES model runs. This group is added to both the `mpas_dmpar` and `mpas_halo` halo exchange methods in `mpas_atm_halos`. A halo exchange is then performed using this group immediately after the call to `les_models` in `atm_compute_dyn_tend`.

Background

By trial and error, it was observed that performing halo exchanges of the `eddy_visc_horz` and `eddy_visc_vert` fields immediately after the call to `les_models` seemed to fix this error. The exact mechanism by which the error propagates is not yet fully understood and would need to be revisited.

However, the issue originating with `eddy_visc_horz` might involve this line. Both serialbox and simpler print debugging point to issues originating from this loop. The `max(cell1,cell2)` values in the loop seem to exceed `cellEnd`.

It is also not yet understood why these discrepancies are not observed with `config_les_model = prognostic_1.5_order`.
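
The kind of print debugging mentioned above, checking whether an edge loop reads cells beyond `cellEnd`, can be sketched as follows. This is a hypothetical Python helper, not code from MPAS: the connectivity data is invented, indices are 1-based to mirror Fortran, and `cells_on_edge` and `cell_end` are only named after the variables discussed in the description.

```python
def edges_reading_beyond(cells_on_edge, edge_start, edge_end, cell_end):
    """Return (edge, cell1, cell2) for each edge in [edge_start, edge_end]
    whose larger cell index exceeds cell_end. All indices are 1-based."""
    flagged = []
    for i_edge in range(edge_start, edge_end + 1):
        cell1, cell2 = cells_on_edge[i_edge - 1]
        if max(cell1, cell2) > cell_end:
            flagged.append((i_edge, cell1, cell2))
    return flagged

# Invented connectivity: 5 edges; cells 1..4 lie within cell_end, cells 5 and 6 do not.
cells_on_edge = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 1)]
flagged = edges_reading_beyond(cells_on_edge, 1, 5, cell_end=4)
for i_edge, cell1, cell2 in flagged:
    print(f"edge {i_edge}: cells ({cell1}, {cell2}) exceed cell_end")
```

Any flagged edge marks a place where the loop would read a field value at a cell outside the range the caller computed, which is a candidate location for stale data if that field has not been halo-exchanged.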