Incremental CPPE migration to Python backend #58

@maxscheurer

Goal

Migrate CPPE incrementally from the current C++ core to a Python-native backend while preserving API behavior and numerical results, with C++ retained as a reference/fallback backend during the transition.

Why this approach

  • Reduce build and deployment friction from mandatory C++ compilation.
  • Keep scientific reliability by validating every Python component against the existing C++ implementation.
  • Allow gradual adoption and rollback by switching backend at runtime.

Scope and principles

  • Backend strategy: hybrid first (cpp + python), then switch default only after parity/performance gates are met.
  • Porting order: non-FMM first; FMM last.
  • Validation discipline: every migrated unit must have cpp-vs-python parity tests.
  • Benchmarking axis: number of polarizable sites (n_polsites) as primary scaling variable.
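
A minimal sketch of what runtime backend selection could look like, assuming a simple name-to-factory registry. The environment variable `CPPE_BACKEND`, the functions `register_backend` and `get_backend`, and the placeholder backend objects are all hypothetical and are not part of CPPE's current API.

```python
import os
from typing import Callable, Dict, Optional

# Hypothetical registry mapping a backend name to a factory for that backend.
_BACKENDS: Dict[str, Callable[[], object]] = {}

# Hybrid-first: the default stays "cpp" until parity/performance gates are met.
DEFAULT_BACKEND = "cpp"


def register_backend(name: str, factory: Callable[[], object]) -> None:
    """Make a backend available under the given name."""
    _BACKENDS[name] = factory


def get_backend(name: Optional[str] = None) -> object:
    """Resolve a backend: explicit argument, then env var, then default."""
    chosen = name or os.environ.get("CPPE_BACKEND", DEFAULT_BACKEND)
    try:
        factory = _BACKENDS[chosen]
    except KeyError:
        raise ValueError(
            f"unknown backend {chosen!r}; available: {sorted(_BACKENDS)}"
        ) from None
    return factory()


# Placeholders standing in for the real backend objects.
register_backend("cpp", lambda: "cpp-backend-placeholder")
register_backend("python", lambda: "python-backend-placeholder")
```

With this shape, rolling back is a configuration change (`get_backend("cpp")` or unsetting the variable) rather than a code change, which is the property the hybrid strategy relies on.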

Success criteria

  • API compatibility for Python users across key classes (CppeState, BMatrix, MultipoleFields, etc.).
  • Numerical parity vs C++ within agreed tolerances for all migrated components.
  • Competitive runtime for Python backend on representative systems/sizes.
  • C++ build becomes optional first, then removable after one stabilization cycle.

Milestones

  1. Baseline tests and benchmark harness
  2. Backend selection plumbing (cpp vs python)
  3. Non-FMM Python port (fields/solver/state flow)
  4. Numba parallelization and tuning
  5. Potfile Python reader parity
  6. FMM migration design (generator strategy)
  7. FMM Python port + validation
  8. Default backend switch
  9. C++ backend deprecation and removal

Checklist

  • Define backend selection mechanism and configuration surface
  • Add parity test matrix (component-level and end-to-end)
  • Add pytest-parametrized benchmark suite by n_polsites
  • Benchmark both backends in same tests (cpp, python)
  • Capture warmup vs steady-state timings in benchmark output
  • Add numerical-delta reporting alongside timing results
  • Port NuclearFields to Python backend
  • Port MultipoleExpansion to Python backend
  • Port BMatrix apply/diagonal ops to Python backend
  • Port induced moments solver path to Python backend
  • Add Numba acceleration (njit, parallel=True, prange) to hot kernels
  • Document Numba runtime tuning knobs (NUMBA_NUM_THREADS, threading layer)
  • Implement Python potfile parser with strict parity to current CPPE semantics
  • Instantiate CPPE-like Python domain objects from parsed potfiles
  • Add reader parity tests against existing potfiles in tests/potfiles
  • Decide FMM generator path (adapt fmmgen or replacement codegen)
  • Specify FMM acceptance criteria (error envelope + speed targets)
  • Port tree/interactions/FMM kernels to Python backend
  • Validate FMM against tests/test_fmm.py and reference error data
  • Switch default backend to Python after gates pass
  • Keep C++ backend as temporary fallback for one release cycle
  • Remove mandatory C++ build from project deployment
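
To illustrate the Numba item in the checklist, here is a sketch of an `njit(parallel=True)` kernel using `prange`. It computes the Coulomb field of point charges at a set of sites in atomic units; it is an illustrative stand-in, not CPPE's actual `NuclearFields` kernel, and the function name `point_charge_fields` is hypothetical. The import fallback is only there so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback: no-op decorator supporting both @njit and @njit(...),
    # and a serial range in place of prange.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func

    prange = range


@njit(parallel=True)
def point_charge_fields(sites, positions, charges):
    """Field at each site from point charges: sum_k q_k * (r - R_k) / |r - R_k|^3."""
    n_sites = sites.shape[0]
    fields = np.zeros((n_sites, 3))
    # Outer loop is parallel; each iteration writes only its own row.
    for i in prange(n_sites):
        ex = 0.0
        ey = 0.0
        ez = 0.0
        for k in range(positions.shape[0]):
            dx = sites[i, 0] - positions[k, 0]
            dy = sites[i, 1] - positions[k, 1]
            dz = sites[i, 2] - positions[k, 2]
            r2 = dx * dx + dy * dy + dz * dz
            inv_r3 = 1.0 / (r2 * np.sqrt(r2))
            ex += charges[k] * dx * inv_r3
            ey += charges[k] * dy * inv_r3
            ez += charges[k] * dz * inv_r3
        fields[i, 0] = ex
        fields[i, 1] = ey
        fields[i, 2] = ez
    return fields
```

Accumulating into local scalars and writing one row per iteration keeps the parallel loop free of shared writes. The thread count can be set through the `NUMBA_NUM_THREADS` environment variable and the threading layer through `NUMBA_THREADING_LAYER`; the first call includes JIT compilation, which is why warmup and steady-state timings are worth reporting separately.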

Benchmarking plan (pytest-first)

  • Use pytest parametrization for:
    • backend: cpp, python
    • workload: fields, solver, end-to-end
    • system size buckets keyed by n_polsites
  • Start with lightweight pytest timing; later evaluate pytest-benchmark and/or pytest-harvest for richer persistence/reporting.
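
The parametrization above might be sketched as follows. `run_workload` is a hypothetical placeholder for a real call into the selected backend, and the size buckets `10, 100, 1000` are illustrative values, not agreed benchmark sizes. `record_property` is pytest's built-in fixture for attaching values to the test report.

```python
import time

import pytest


def run_workload(backend: str, workload: str, n_polsites: int) -> float:
    """Hypothetical placeholder for a fields/solver/end-to-end backend call."""
    return float(sum(i * i for i in range(n_polsites)))


def time_warmup_and_steady(func, repeats=3):
    """Return (warmup_seconds, best_steady_state_seconds) for a callable."""
    t0 = time.perf_counter()
    func()  # first call: may include JIT compilation or cache population
    warmup = time.perf_counter() - t0
    steady = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        func()
        steady.append(time.perf_counter() - t0)
    return warmup, min(steady)


@pytest.mark.parametrize("n_polsites", [10, 100, 1000])
@pytest.mark.parametrize("workload", ["fields", "solver", "end-to-end"])
@pytest.mark.parametrize("backend", ["cpp", "python"])
def test_benchmark(backend, workload, n_polsites, record_property):
    warmup, steady = time_warmup_and_steady(
        lambda: run_workload(backend, workload, n_polsites)
    )
    # Attach timings to the report so both backends can be compared per size.
    record_property("warmup_s", warmup)
    record_property("steady_s", steady)
    assert steady >= 0.0
```

Stacking the three `parametrize` decorators yields the full backend × workload × size matrix, so both backends are always timed under identical inputs.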

Potfile reader notes

  • First target: strict parity with CPPE's current reader behavior (@COORDINATES, ORDER, ORDER 1 1, EXCLISTS, AU/AA conversion, zero-fill behavior).
  • Follow-up milestone: consider optional support for extended sections inspired by polarizationsolver (EXCLISTS_PERMANENT, EXCLISTS_INDUCED, @THOLE, distributed polarizabilities).
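
As a starting point for the reader, here is a sketch that parses only the `@COORDINATES` section and applies the AU/AA conversion. It assumes the layout is a site count, a unit line, then one `label x y z` line per site, and that lines starting with `!` are comments; these assumptions should be checked against CPPE's current reader before relying on them. The function name `parse_coordinates` is hypothetical.

```python
# Approximate conversion factor; the constant used by CPPE may differ
# in its trailing digits, which matters for strict parity.
ANGSTROM_TO_BOHR = 1.8897261246


def parse_coordinates(text):
    """Parse @COORDINATES; return a list of (label, (x, y, z)) in bohr."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Assumption: blank lines and "!" comment lines carry no data.
    lines = [ln for ln in lines if ln and not ln.startswith("!")]

    start = lines.index("@COORDINATES")  # raises ValueError if missing
    n_sites = int(lines[start + 1])
    unit = lines[start + 2].upper()
    if unit == "AA":
        scale = ANGSTROM_TO_BOHR
    elif unit == "AU":
        scale = 1.0
    else:
        raise ValueError(f"unknown coordinate unit {unit!r}")

    sites = []
    for ln in lines[start + 3 : start + 3 + n_sites]:
        fields = ln.split()
        label = fields[0]
        xyz = tuple(float(v) * scale for v in fields[1:4])
        sites.append((label, xyz))
    if len(sites) != n_sites:
        raise ValueError("fewer coordinate lines than declared sites")
    return sites
```

For example, `parse_coordinates("@COORDINATES\n1\nAU\nH 1.0 2.0 3.0\n")` returns a single site with its coordinates unchanged, while an `AA` block is scaled to bohr.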

Risks and mitigations

  • Risk: Irregular/tree-heavy kernels may underperform in pure Python.
    • Mitigation: keep the C++ fallback, focus Numba work on hot loops, and optimize data layout before the FMM switch.
  • Risk: Numerical drift between backends.
    • Mitigation: enforce parity checks in CI and per-milestone acceptance gates.
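
A parity gate of this kind could be as small as the helper below, which compares a Python-backend result against the C++ reference and reports the numerical delta alongside the pass/fail decision. The function name `parity_report` and the default tolerances are hypothetical; the actual tolerances still need to be agreed per component.

```python
import numpy as np


def parity_report(reference, candidate, rtol=1e-10, atol=1e-12):
    """Compare candidate (python) against reference (cpp) output arrays."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    if reference.shape != candidate.shape:
        raise ValueError(
            f"shape mismatch: {reference.shape} vs {candidate.shape}"
        )
    abs_delta = np.abs(candidate - reference)
    max_abs = float(abs_delta.max()) if abs_delta.size else 0.0
    return {
        "max_abs_delta": max_abs,
        # np.allclose checks |candidate - reference| <= atol + rtol * |reference|
        "within_tolerance": bool(
            np.allclose(candidate, reference, rtol=rtol, atol=atol)
        ),
    }
```

In CI, a test would assert `parity_report(cpp_result, python_result)["within_tolerance"]` and log `max_abs_delta` so that slow drift is visible before it crosses the threshold.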

Labels

enhancement: New feature or request
