Add wave equation and Klein-Gordon equation benchmark tasks #97

Open
gpartin wants to merge 3 commits into pdebench:main from gpartin:add-wave-klein-gordon

gpartin commented Mar 11, 2026

Summary

This PR adds two new PDE benchmark tasks to PDEBench: the wave equation and the Klein-Gordon equation in 1D and 2D with periodic boundary conditions.

Equations

Wave equation: $\partial^2 u / \partial t^2 = c^2 \nabla^2 u$

Klein-Gordon: $\partial^2 u / \partial t^2 = c^2 \nabla^2 u - \chi^2 u$
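
As a consistency check (a standard textbook result, not something stated in the PR), substituting a 1D plane wave $u = e^{i(kx - \omega t)}$ into the Klein-Gordon equation gives its dispersion relation:

$$
-\omega^2 u = -c^2 k^2 u - \chi^2 u \quad\Longrightarrow\quad \omega^2 = c^2 k^2 + \chi^2
$$

Setting $\chi = 0$ recovers the wave-equation relation $\omega = c|k|$. For a fixed real frequency $\omega < \chi$, $k^2 = (\omega^2 - \chi^2)/c^2$ is negative, so $k$ is imaginary and the spatial dependence decays instead of oscillating. In 2D, $k^2$ is replaced by $|\mathbf{k}|^2$.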

Why these benchmarks?

  1. Hyperbolic PDEs - wave equations are underrepresented in PDEBench (most tasks are parabolic/elliptic)
  2. Variable wave speed - the `c` parameter creates a natural difficulty ladder
  3. Klein-Gordon mass parameter - `chi` introduces a propagating-to-evanescent transition that exposes catastrophic extrapolation failure in neural operators
  4. Analytical validation - exact Fourier solutions are available for solver verification
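
To illustrate point 4, here is a minimal sketch of an exact Fourier solution for the 1D periodic problem. It is not the PR's `analytical_solution_1d`; the function name `kg_exact_1d`, its signature, and the explicit initial-velocity argument `v0` are assumptions made for this example. Each Fourier mode is evolved with its own frequency $\omega_k = \sqrt{c^2 k^2 + \chi^2}$.

```python
import numpy as np


def kg_exact_1d(u0, v0, c, chi, length, t):
    """Exact periodic solution of u_tt = c^2 u_xx - chi^2 u at time t.

    u0, v0: initial displacement and velocity sampled on a uniform grid.
    (Hypothetical helper for illustration; not the PR's implementation.)
    """
    u0 = np.asarray(u0, dtype=np.float64)
    v0 = np.asarray(v0, dtype=np.float64)
    n = u0.size
    # Angular wavenumbers of the discrete Fourier modes on [0, length).
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    omega = np.sqrt((c * k) ** 2 + chi ** 2)

    # sin(omega t) / omega tends to t as omega -> 0 (k = 0 mode when chi = 0).
    safe_omega = np.where(omega > 0, omega, 1.0)
    sin_over_omega = np.where(omega > 0, np.sin(omega * t) / safe_omega, t)

    u_hat = np.fft.fft(u0) * np.cos(omega * t) + np.fft.fft(v0) * sin_over_omega
    return np.fft.ifft(u_hat).real
```

With `chi = 0` the same function covers the plain wave equation, so one reference can check both tasks.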

Baseline results (FNO, 100 epochs, 1D)

| Wave speed c | nRMSE |
| --- | --- |
| 0.1 | 0.101 |
| 0.4 | 0.112 |
| 1.0 | 0.099 |
| 2.0 | 0.095 |

Klein-Gordon cross-chi generalization (FNO)

Training on one chi value and testing on another reveals:

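| Train chi \ Test chi | 0.5 | 1.0 | 2.0 | 5.0 |
| --- | --- | --- | --- | --- |
| 0.5 | 0.093 | 0.096 | 0.174 | 0.891 |
| 1.0 | 0.097 | 0.094 | 0.154 | 0.877 |
| 2.0 | 0.170 | 0.150 | 0.095 | 0.789 |
| 5.0 | 0.783 | 0.771 | 0.707 | 0.098 |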

FNO extrapolates well for small parameter shifts but fails catastrophically across the propagating-to-evanescent transition: a model trained at chi=2 reaches nRMSE 0.095 when tested at chi=2 but 0.789 when tested at chi=5.
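
The PR text does not spell out how nRMSE is computed. A minimal sketch, assuming the common convention of per-sample RMSE divided by the RMS of the target and then averaged over samples (the function name `nrmse` is illustrative, not taken from the PR):

```python
import numpy as np


def nrmse(pred, target):
    """Normalized RMSE, assuming axis 0 indexes samples.

    For each sample: sqrt(mean((pred - target)^2)) / sqrt(mean(target^2)),
    then the mean over samples. This normalization is an assumption.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    reduce_axes = tuple(range(1, target.ndim))  # all non-sample axes
    err = np.sqrt(np.mean((pred - target) ** 2, axis=reduce_axes))
    ref = np.sqrt(np.mean(target ** 2, axis=reduce_axes))
    return float(np.mean(err / ref))
```

Under this definition a value near 0.1 means the prediction error is about 10% of the signal's RMS amplitude, while values near 0.8 to 0.9 mean the error is almost as large as the signal itself.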

Files added

  • `pdebench/data_gen/src/sim_wave.py` - Pure NumPy simulator (leapfrog, 1D/2D)
  • `pdebench/data_gen/gen_wave.py` - Hydra-based data generation with multiprocessing
  • `pdebench/data_gen/configs/wave.yaml` - Default generation config
  • `pdebench/models/config/args/config_wave.yaml` - FNO/UNet training config
  • `pdebench/models/config/args/config_klein_gordon.yaml` - KG training config
  • `WAVE_BENCHMARK.md` - Full documentation with baseline results

Files modified

  • `README.md` - Added `gen_wave.py` entry in Data Generation section

See WAVE_BENCHMARK.md for full details.

Copilot AI review requested due to automatic review settings March 11, 2026 16:02

Copilot AI left a comment

Pull request overview

This PR adds new PDEBench benchmark tasks and supporting artifacts for the 1D/2D wave equation and Klein–Gordon equation, including a NumPy-based simulator, a Hydra-based dataset generator, training configs, and documentation.

Changes:

  • Added WaveSimulator (1D/2D) and an FFT-based 1D analytical solution helper.
  • Added gen_wave.py + wave.yaml to generate datasets in a PDEBench-style HDF5 layout.
  • Added model argument configs and benchmark documentation; updated README to reference the new generator.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `pdebench/models/config/args/config_wave.yaml` | Adds a wave-equation training config (FNO/UNet params). |
| `pdebench/models/config/args/config_klein_gordon.yaml` | Adds a Klein–Gordon training config (parameterized by χ). |
| `pdebench/data_gen/src/sim_wave.py` | Implements leapfrog/Verlet simulator and 1D analytical solution. |
| `pdebench/data_gen/gen_wave.py` | Implements Hydra-driven dataset generation + tensor-format consolidation. |
| `pdebench/data_gen/configs/wave.yaml` | Adds default Hydra config for wave/KG generation. |
| `WAVE_BENCHMARK.md` | Documents equations, generation usage, and baseline results. |
| `README.md` | Adds `gen_wave.py` to the data generation section. |

Comment on lines +172 to +180

    lap = laplacian(u_curr)
    u_next = 2 * u_curr - u_prev + c2dt2 * lap - chi2dt2 * u_curr
    u_prev = u_curr
    u_curr = u_next

    if save_idx < self.Nt and step % save_interval == 0:
        result[save_idx] = u_curr.astype(np.float32)
        save_idx += 1

Copilot AI Mar 11, 2026

The leapfrog loop saves u_curr after updating it, but u_curr is initialized to the t=dt state (via the Taylor half-step) and then advanced immediately in the first iteration. This makes the saved frame for step=1 correspond to ~t=2*dt (and in general shifts snapshot times by one step), further desynchronizing the saved solution from the t_save coordinate. Saving u_curr before advancing (or adjusting the step/time bookkeeping so u_curr corresponds to step*dt) would fix the off-by-one.

Suggested change

    - lap = laplacian(u_curr)
    - u_next = 2 * u_curr - u_prev + c2dt2 * lap - chi2dt2 * u_curr
    - u_prev = u_curr
    - u_curr = u_next
    - if save_idx < self.Nt and step % save_interval == 0:
    -     result[save_idx] = u_curr.astype(np.float32)
    -     save_idx += 1
    + if save_idx < self.Nt and step % save_interval == 0:
    +     result[save_idx] = u_curr.astype(np.float32)
    +     save_idx += 1
    + lap = laplacian(u_curr)
    + u_next = 2 * u_curr - u_prev + c2dt2 * lap - chi2dt2 * u_curr
    + u_prev = u_curr
    + u_curr = u_next
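
The save-before-advance ordering can be checked in isolation. The following is a minimal standalone sketch, not the PR's `WaveSimulator`: the function `leapfrog_1d`, its arguments, the zero initial velocity, and the second-order periodic Laplacian are all assumptions made for illustration. The invariant is that `u_curr` holds the state at `step * dt` at the top of each iteration, so saving there keeps frames aligned with their time stamps.

```python
import numpy as np


def leapfrog_1d(u0, c, chi, length, dt, n_steps, save_steps):
    """Leapfrog for u_tt = c^2 u_xx - chi^2 u (periodic, zero initial velocity).

    Returns the frames at the requested step indices, saved BEFORE advancing.
    (Hypothetical helper for illustration; not the PR's implementation.)
    """
    u_prev = np.asarray(u0, dtype=np.float64)
    dx = length / u_prev.size

    def accel(u):
        # Second-order central difference with periodic wrap-around.
        lap = (np.roll(u, 1) - 2.0 * u + np.roll(u, -1)) / dx**2
        return c**2 * lap - chi**2 * u

    save_set = {int(s) for s in save_steps}
    # Taylor start: u(dt) = u0 + 0.5 * dt^2 * u_tt(0) when u_t(0) = 0.
    u_curr = u_prev + 0.5 * dt**2 * accel(u_prev)

    frames = [u_prev.copy()] if 0 in save_set else []
    for step in range(1, n_steps + 1):
        if step in save_set:
            frames.append(u_curr.copy())  # u_curr is the state at step * dt
        u_next = 2.0 * u_curr - u_prev + dt**2 * accel(u_curr)
        u_prev, u_curr = u_curr, u_next
    return np.stack(frames)
```

For a single sine mode with `chi = 0`, the saved frames can be compared against $\sin(2\pi x)\cos(2\pi c t)$ at `t = step * dt`; saving after the update instead would shift every frame by one `dt`.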

Copilot uses AI. Check for mistakes.
Comment on lines +75 to +86

    f.create_dataset(
        f"{seed_str}/grid/x",
        data=sim.x.astype(np.float32),
        dtype="float32",
        compression="lzf",
    )
    f.create_dataset(
        f"{seed_str}/grid/t",
        data=sim.t_save.astype(np.float32),
        dtype="float32",
        compression="lzf",
    )

Copilot AI Mar 11, 2026

For 2D runs (sim.ndim=2), this only writes grid/x and later only exports x-coordinate. PDEBench 2D datasets typically include both x-coordinate and y-coordinate, and the model loaders (e.g. PINN/FNO utilities) expect y-coordinate to exist for 2D problems. Please write grid/y (likely the same 1D coordinate as x for a square domain) when ndim==2 so the generated HDF5 is self-describing for 2D.

Comment on lines +106 to +129

    # Get shape from first sample
    first_key = str(0).zfill(4)
    sample_shape = f_in[f"{first_key}/data"].shape

    x_coord = np.array(f_in[f"{first_key}/grid/x"])
    t_coord = np.array(f_in[f"{first_key}/grid/t"])

    # Allocate combined tensor
    full_shape = (n_samples, *sample_shape)

    with h5py.File(str(output_path), "w") as f_out:
        tensor = f_out.create_dataset(
            "tensor",
            shape=full_shape,
            dtype="float32",
            compression="lzf",
        )
        for i in range(n_samples):
            key = str(i).zfill(4)
            if key in f_in:
                tensor[i] = f_in[f"{key}/data"]

        f_out.create_dataset("x-coordinate", data=x_coord)
        f_out.create_dataset("t-coordinate", data=t_coord)

Copilot AI Mar 11, 2026

combine_to_tensor_format() only copies x-coordinate and t-coordinate into the output file. For 2D wave/KG data the output should also include y-coordinate (and optionally z-coordinate for higher dims) to match the conventions used elsewhere in the repo and to be consumable by existing loaders. You can infer whether it is 2D from sample_shape (len==3 for 2D per-sample) and copy grid/y from the raw file when applicable.
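
One way to follow this suggestion is sketched below. The helper `collect_coordinates` is hypothetical, and it assumes the raw per-seed group already contains `grid/y` for 2D runs. It only needs `group[name]` lookups, so it should work with an open `h5py` group or, as in a quick test, with a plain dict.

```python
import numpy as np


def collect_coordinates(first_group, sample_shape):
    """Return coordinate arrays keyed by their output dataset names.

    first_group: mapping that supports first_group["grid/x"] etc.
    sample_shape: per-sample shape, (Nt, Nx) in 1D or (Nt, Nx, Ny) in 2D.
    (Hypothetical helper for illustration; not part of the PR.)
    """
    coords = {
        "x-coordinate": np.asarray(first_group["grid/x"]),
        "t-coordinate": np.asarray(first_group["grid/t"]),
    }
    if len(sample_shape) == 3:  # 2D sample: also carry the y axis
        coords["y-coordinate"] = np.asarray(first_group["grid/y"])
    return coords
```

The caller could then loop over the returned dict and call `f_out.create_dataset(name, data=array)` for each entry, which keeps the 1D and 2D paths in one place.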

Comment on lines +123 to +127

    for i in range(n_samples):
        key = str(i).zfill(4)
        if key in f_in:
            tensor[i] = f_in[f"{key}/data"]

Copilot AI Mar 11, 2026

If a seed group is missing in the raw HDF5 (e.g. a worker crashed), the code silently leaves the corresponding slice of tensor as all zeros because it skips missing keys. This can produce corrupted datasets without any signal. Consider validating that all expected keys exist (or collecting the present keys and writing a smaller tensor) and raising/logging an error when samples are missing.
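
A minimal sketch of the validation described here, assuming seed groups are named with four-digit zero-padded indices as in the snippet above. The helper name `require_all_seeds` and its `width` parameter are hypothetical.

```python
def require_all_seeds(available_keys, n_samples, width=4):
    """Return the expected seed keys, or raise KeyError if any are absent.

    available_keys: iterable of group names present in the raw file,
    for example the keys of an open HDF5 file.
    (Hypothetical helper for illustration; not part of the PR.)
    """
    expected = [str(i).zfill(width) for i in range(n_samples)]
    present = set(available_keys)
    missing = [key for key in expected if key not in present]
    if missing:
        raise KeyError(
            f"{len(missing)} of {n_samples} seed groups are missing, "
            f"e.g. {missing[:5]}"
        )
    return expected
```

Calling this before allocating the combined tensor turns a silent zero-filled slice into an immediate, descriptive failure.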

    work_dir: ${hydra:runtime.cwd}
    data_dir: data
    upload: false

Copilot AI Mar 11, 2026

upload: false is defined in this config, but gen_wave.py currently never checks config.upload nor performs an upload step (unlike other generators). Either implement the upload path for consistency or remove/rename the config field to avoid suggesting functionality that doesn't exist.

Suggested change

    - upload: false

Comment on lines +158 to +167

    # Save schedule
    if self.ndim == 1:
        result = np.zeros((self.Nt, self.Nx), dtype=np.float32)
    else:
        result = np.zeros((self.Nt, self.Nx, self.Nx), dtype=np.float32)

    result[0] = u0.astype(np.float32)
    save_idx = 1
    save_interval = max(1, self.n_steps // (self.Nt - 1))

Copilot AI Mar 11, 2026

t_save is defined as linspace(0, T, Nt), but the saving logic uses save_interval = n_steps // (Nt - 1) and only saves when step % save_interval == 0. This generally produces snapshots at times that do not match t_save (and may skip the final time T if n_steps is not an exact multiple of Nt-1). Consider computing an explicit monotone list of save_steps that matches t_save (including the final step) and saving exactly at those steps, or derive t_save from the actual saved step indices.
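
A minimal sketch of the first option, mapping each entry of `t_save` to its nearest integrator step. The helper name `save_steps_from_t_save` is hypothetical, and `dt = T / n_steps` is an assumption about how the simulator defines its time step.

```python
import numpy as np


def save_steps_from_t_save(t_save, dt):
    """Map each requested save time to the nearest integrator step index.

    Raises ValueError if two save times would land on the same step.
    (Hypothetical helper for illustration; not part of the PR.)
    """
    t_save = np.asarray(t_save, dtype=np.float64)
    steps = np.rint(t_save / dt).astype(np.int64)
    if np.any(np.diff(steps) <= 0):
        raise ValueError("dt is too coarse: two save times map to the same step")
    return steps
```

With `T = 1`, `Nt = 11` and `n_steps = 105`, the original `save_interval` of `105 // 10 = 10` would stop saving at step 100 (t ≈ 0.952), whereas the mapping above ends exactly at step 105 and keeps every saved frame within half a step of its `t_save` entry.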

- Save before advance in leapfrog loop to fix off-by-one snapshot timing
- Precompute save steps from t_save for exact time alignment
- Write grid/y for 2D simulations in per-seed HDF5
- Copy y-coordinate into combined tensor format for 2D
- Raise KeyError for missing seeds instead of silent zero-fill
- Remove unused 'upload' config field from wave.yaml

10 tests covering:
- 1D/2D output shape and dtype (float32)
- Finite output (no NaN/Inf)
- Invalid ndim raises ValueError
- Klein-Gordon chi>0 runs in 1D and 2D
- Leapfrog vs analytical solution nRMSE < 1% (wave and KG)
- analytical_solution_1d returns u0 at t=0

All tests pass in 0.33s.