Skip to content

Part 1: isolate unified platform base#264

Closed
Siel wants to merge 26 commits into
mainfrom
split/unified-platform-base
Closed

Part 1: isolate unified platform base#264
Siel wants to merge 26 commits into
mainfrom
split/unified-platform-base

Conversation

@Siel
Copy link
Copy Markdown
Member

@Siel Siel commented Apr 2, 2026

This is part 1 of the split of #260.

Scope:

  • keep the unified-platform structure and shared API surface
  • keep the baseline nonparametric support already present in PMcore
  • remove the parametric stack and the newer nonparametric families from this review

Why:

  • separate the structural rewrite from the algorithm additions
  • make the follow-up parametric and expanded nonparametric reviews substantially smaller

Validation:

  • cargo check --tests --examples

Follow-up PRs will layer the parametric and expanded nonparametric algorithm families back on top of this base.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 2, 2026

🐰 Bencher Report

Branchsplit/unified-platform-base
Testbedmhovd-pgx
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
bimodal_ke_npag📈 view plot
🚷 view threshold
4,898.30 ms
(+0.06%)Baseline: 4,895.25 ms
5,086.30 ms
(96.30%)
bimodal_ke_npod📈 view plot
🚷 view threshold
1,447.50 ms
(+0.16%)Baseline: 1,445.12 ms
1,727.29 ms
(83.80%)
bimodal_ke_postprob📈 view plot
🚷 view threshold
296.31 ms
(-25.25%)Baseline: 396.40 ms
576.26 ms
(51.42%)
🐰 View full continuous benchmarking report in Bencher

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is the first step in splitting the unified-platform work: it preserves the unified modeling/compilation layer and baseline nonparametric algorithms, while removing the parametric stack and newer algorithm families to keep follow-up reviews smaller.

Changes:

  • Introduces a new public API surface (api, model, compile, results, output) and updates tests/examples to use EstimationProblem + ModelDefinition.
  • Moves/rewires nonparametric estimation components under estimation::nonparametric and removes legacy routines/ + structs/ modules.
  • Adds shared result/summary/diagnostics/artifact tracking and shared output writers for the new architecture.

Reviewed changes

Copilot reviewed 103 out of 104 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/structs_tests.rs Updates imports to the new nonparametric module path.
tests/settings_tests.rs Removes legacy Settings-based tests (old API removed).
tests/results_summary_tests.rs Adds smoke test for new summary/diagnostics/predictions surface.
tests/output_writer_tests.rs Adds test coverage for writing shared output artifacts via FitResult::write_outputs.
tests/onecomp.rs Migrates algorithm dispatch tests to EstimationProblem builder API.
tests/nonparametric_engine_tests.rs Adds coverage for NonparametricEngine::fit returning a workspace.
tests/ipm_tests.rs Updates Burke IPM tests to new module path.
tests/compile_layer_tests.rs Adds coverage for compile layer (design context + observation index + covariates).
tests/api_smoke_tests.rs Adds basic smoke coverage for new API entrypoints.
tests/acceptance_baseline_tests.rs Adds an acceptance baseline for NPAG (bimodal_ke).
src/structs/mod.rs Removes legacy structs module (moved to estimation::nonparametric).
src/routines/mod.rs Removes legacy routines module (superseded by new architecture).
src/routines/initialization/sobol.rs Removes legacy Sobol initializer (moved under new nonparametric prior).
src/routines/initialization/latin.rs Removes legacy Latin initializer (moved under new nonparametric prior).
src/routines/expansion/mod.rs Removes legacy expansion module.
src/routines/expansion/adaptative_grid.rs Removes legacy adaptive grid implementation (reintroduced under new nonparametric).
src/routines/estimation/qr.rs Removes legacy QR routine (reintroduced under new nonparametric).
src/routines/estimation/mod.rs Removes legacy estimation routines module.
src/routines/condensation/mod.rs Removes legacy condensation helper.
src/results/summary.rs Adds shared summary DTOs (fit + population + individual).
src/results/predictions.rs Adds shared predictions bundle metadata for results.
src/results/mod.rs Introduces results module exports and internal helpers.
src/results/fit_result.rs Adds FitResult enum and high-level accessors (summary, diagnostics, artifacts, etc.).
src/results/diagnostics.rs Adds diagnostics bundle and nonparametric diagnostics builder.
src/results/artifacts.rs Adds artifact indexing for output files (expected/actual/missing).
src/output/writer.rs Adds unified writer entrypoint for result outputs + shared predictions CSV.
src/output/shared.rs Adds shared output writers (settings/summary/diagnostics/csv rows) and file name list.
src/output/nonparametric.rs Adds method-specific output writers and expected file enumeration.
src/output/mod.rs Introduces output module layout and re-exports.
src/output/logging.rs Refactors logging setup to use new OutputPlan + LoggingOptions.
src/output/file.rs Adds new OutputFile helper with directory creation and context.
src/model/variability.rs Adds variability/random-effects spec types for the new model domain.
src/model/parameter_space.rs Adds parameter-space modeling types and finite-range validation.
src/model/observation_spec.rs Adds observation channel spec and error-model attachment surface.
src/model/mod.rs Introduces ModelDefinition + builder and exports model-domain types.
src/model/metadata.rs Adds model metadata container.
src/model/covariates.rs Adds covariate spec types (in-equation vs structured).
src/lib.rs Replaces legacy public API with new modules + prelude exports.
src/estimation/nonparametric/workspace.rs Adds workspace container and output-writing helpers for nonparametric fit results.
src/estimation/nonparametric/weights.rs Minor cleanup around Weights API docs/structure.
src/estimation/nonparametric/summaries.rs Adds fit/population/individual summary builders.
src/estimation/nonparametric/statistics.rs Adds statistics helpers used by summaries/predictions.
src/estimation/nonparametric/qr.rs Reintroduces QR decomposition helper under new module layout.
src/estimation/nonparametric/psi.rs Adds ndarray conversion helper and keeps CSV helpers/serialization.
src/estimation/nonparametric/prior/sobol.rs Adds Sobol prior generator using ParameterSpace::finite_ranges.
src/estimation/nonparametric/prior/latin.rs Adds Latin hypercube prior generator using finite ranges.
src/estimation/nonparametric/prior.rs Adds Prior enum + prior parsing and sampling entrypoints.
src/estimation/nonparametric/predictions.rs Updates predictions machinery to new module paths.
src/estimation/nonparametric/posterior.rs Routes posterior helper through Posterior::calculate.
src/estimation/nonparametric/mod.rs Defines nonparametric module surface and re-exports.
src/estimation/nonparametric/expansion.rs Reintroduces adaptive grid expansion under new module layout.
src/estimation/nonparametric/engine.rs Adds NonparametricEngine bridging compiled problems to algorithm execution.
src/estimation/nonparametric/cycles.rs Updates cycle logging to write iterations.csv and accept new inputs.
src/estimation/mod.rs Adds estimation module root (currently nonparametric only).
src/compile/validation.rs Adds compile-time validation for problems.
src/compile/observation_index.rs Adds observation index DTO.
src/compile/mod.rs Adds compile pipeline and EstimationProblem::compile/initialize_logs.
src/compile/design_context.rs Adds design context + structured covariate design DTOs.
src/compile/compiled_problem.rs Adds compiled problem container used by engines.
src/compile/caches.rs Adds execution cache flags container.
src/bestdose/predictions.rs Updates BestDose prediction interval handling and imports for new architecture.
src/bestdose/posterior.rs Updates BestDose posterior/refinement to new config/runtime wiring.
src/bestdose/optimization.rs Updates imports to new Weights path.
src/bestdose/cost.rs Updates docs and makes calculate_cost public.
src/api/model_definition.rs Re-exports model definition types into API module.
src/api/mod.rs Introduces API module exports (fit, problem builder/types).
src/api/fit.rs Adds fit(problem) entrypoint with optional logging init + compile + run.
src/api/estimation_problem.rs Adds new unified problem/method/runtime/output/logging option types and builder.
src/algorithms/nonparametric/postprob.rs Rewires POSTPROB to produce NonparametricWorkspace and use native config input.
src/algorithms/nonparametric/npod.rs Rewires NPOD to new config/input plumbing and workspace output.
src/algorithms/nonparametric/npag.rs Rewires NPAG to new config/input plumbing and workspace output.
src/algorithms/nonparametric/mod.rs Adds a dedicated nonparametric algorithms module organization.
src/algorithms/mod.rs Adds algorithm input/config plumbing and replaces old dispatch with run_nonparametric_algorithm.
examples/vanco_sde/main.rs Migrates example to new API/model/problem setup.
examples/two_eq_lag/main.rs Migrates example to new API/model/problem setup.
examples/theophylline/main.rs Migrates example to new API/model/problem setup.
examples/new_iov/main.rs Migrates example to new API/model/problem setup.
examples/neely/main.rs Migrates example to new API/model/problem setup.
examples/meta/main.rs Migrates example to new API/model/problem setup.
examples/iov/main.rs Migrates example to new API/model/problem setup.
examples/drusano/main.rs Migrates example to new API/model/problem setup.
examples/drusano/config.toml Removes legacy config file (old Settings flow removed).
examples/bimodal_ke/main.rs Migrates example to new API/model/problem setup and logging options.
examples/bimodal_ke/config.toml Removes legacy config file (old Settings flow removed).
examples/bestdose.rs Migrates BestDose example to new config + read_prior.
examples/bestdose_cov.rs Adds a new BestDose covariate example using the new API.
examples/bestdose_bounds.rs Migrates BestDose bounds example to new config + read_prior.
examples/bestdose_auc.rs Migrates BestDose AUC example to new config + read_prior.
Cargo.toml Adds dependencies and changes test profile configuration.
benches/bimodal_ke.rs Migrates benchmark to run through EstimationProblem instead of legacy dispatch/settings.
.gitignore Updates ignored output/artifact files for the new output layout.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +165 to +175
for result in reader.records() {
let record = result.unwrap();
let values: Vec<f64> = reordered_indices
.iter()
.map(|&i| record[i].parse::<f64>().unwrap())
.collect();
theta_values.push(values);

if let Some(prob_idx) = prob_index {
let prob_value: f64 = record[prob_idx].parse::<f64>().unwrap();
prob_values.push(prob_value);
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_prior_for_parameters uses multiple unwrap() calls when iterating CSV records and parsing floats. This can panic on malformed CSVs or non-numeric values even though the function returns Result, which makes prior loading crash the caller instead of returning a helpful error. Replace the unwrap()s with ? (and add context including row/column/field name) so invalid priors are reported as regular errors.

Copilot uses AI. Check for mistakes.
Comment on lines +182 to +188
writer.write_field("id")?;
writer.write_field("point")?;
self.theta.param_names().iter().for_each(|name| {
writer.write_field(name).unwrap();
});
writer.write_field("prob")?;
writer.write_record(None::<&[u8]>)?;
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_posterior mixes ? with several unwrap()s inside iterator closures (e.g., writer.write_field(...).unwrap() and writer.write_record(...).unwrap()). Any I/O/CSV serialization error will panic instead of being returned via anyhow::Result. Refactor these for_each closures into for loops so you can propagate errors with ? throughout.

Copilot uses AI. Check for mistakes.
Comment on lines +190 to +209
let subjects = self.data.subjects();
self.posterior
.matrix()
.row_iter()
.enumerate()
.for_each(|(i, row)| {
let subject = subjects.get(i).unwrap();
let id = subject.id();

row.iter().enumerate().for_each(|(spp, prob)| {
writer.write_field(id.clone()).unwrap();
writer.write_field(spp.to_string()).unwrap();

self.theta.matrix().row(spp).iter().for_each(|val| {
writer.write_field(val.to_string()).unwrap();
});

writer.write_field(prob.to_string()).unwrap();
writer.write_record(None::<&[u8]>).unwrap();
});
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More unwrap()s in the nested loops of write_posterior can panic on write errors and also hide potential mismatches (e.g., subjects.get(i).unwrap()). Prefer explicit loops returning Result and use ? for CSV writes; for indexing, return a descriptive error if the subject/posterior row counts ever diverge.

Copilot uses AI. Check for mistakes.
Comment on lines +4 to +57
pub fn median(data: &[f64]) -> f64 {
let mut data: Vec<f64> = data.to_vec();
data.sort_by(|a, b| a.partial_cmp(b).unwrap());

let size = data.len();
match size {
even if even % 2 == 0 => {
let fst = data.get(even / 2 - 1).unwrap();
let snd = data.get(even / 2).unwrap();
(fst + snd) / 2.0
}
odd => *data.get(odd / 2_usize).unwrap(),
}
}

pub fn weighted_median(data: &[f64], weights: &[f64]) -> f64 {
assert_eq!(
data.len(),
weights.len(),
"The length of data and weights must be the same"
);
assert!(
weights.iter().all(|&x| x >= 0.0),
"Weights must be non-negative, weights: {:?}",
weights
);

let mut weighted_data: Vec<(f64, f64)> = data
.iter()
.zip(weights.iter())
.map(|(&d, &w)| (d, w))
.collect();

weighted_data.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());

let total_weight: f64 = weights.iter().sum();
let mut cumulative_sum = 0.0;

for (i, &(_, weight)) in weighted_data.iter().enumerate() {
cumulative_sum += weight;

if cumulative_sum == total_weight / 2.0 {
if i + 1 < weighted_data.len() {
return (weighted_data[i].0 + weighted_data[i + 1].0) / 2.0;
} else {
return weighted_data[i].0;
}
} else if cumulative_sum > total_weight / 2.0 {
return weighted_data[i].0;
}
}

unreachable!("The function should have returned a value before reaching this point.");
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

median/weighted_median sort using partial_cmp(...).unwrap() and use assert!/unreachable! for validation. These functions are publicly re-exported (part of the library API) and will panic if inputs contain NaNs, negative weights, mismatched lengths, or total weight is 0. Consider returning Result<f64> (or filtering/ordering NaNs explicitly) and validate total_weight > 0.0 to avoid panics in normal error cases.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +84
pub fn population_mean_median(
theta: &Array2<f64>,
w: &Array1<f64>,
) -> Result<(Array1<f64>, Array1<f64>)> {
let w = if w.is_empty() {
tracing::warn!("w.len() == 0, setting all weights to 1/n");
Array1::from_elem(theta.nrows(), 1.0 / theta.nrows() as f64)
} else {
w.clone()
};

if theta.nrows() != w.len() {
bail!(
"Number of parameters and number of weights do not match. Theta: {}, w: {}",
theta.nrows(),
w.len()
);
}

let mut mean = Array1::zeros(theta.ncols());
let mut median = Array1::zeros(theta.ncols());

for (i, (mn, mdn)) in mean.iter_mut().zip(&mut median).enumerate() {
let col = theta.column(i).to_owned() * w.to_owned();
*mn = col.sum();

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

population_mean_median computes the weighted mean as sum(theta * w) without normalizing w. Weights::from_vec (and priors loaded from CSV) do not guarantee weights sum to 1.0, so means (and downstream SD/CV calculations) can be scaled incorrectly. Either normalize w inside these statistics helpers (e.g., w / w.sum()) or enforce normalization when constructing Weights from external inputs.

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +40
let has_covariates = result.data().subjects().iter().any(|subject| {
subject
.occasions()
.iter()
.any(|occasion| !occasion.covariates().covariates().is_empty())
});
if has_covariates {
files.push("covariates.csv".to_string());
}

files.sort();
files.dedup();
files
}

pub fn write_nonparametric_outputs<E: pharmsol::Equation>(
result: &mut NonparametricWorkspace<E>,
) -> Result<()> {
let parameter_names = result.get_theta().parameters().names();
result
.cycle_log()
.write(result.output_folder(), &parameter_names)?;
result.write_theta()?;
result.write_covariates()?;
result.write_posterior()?;
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write_nonparametric_outputs always calls result.write_covariates()?, but output_file_names only includes covariates.csv when has_covariates is true. This makes the artifact/expected-files model inconsistent (a file may be written but never tracked as expected). Either gate write_covariates behind the same has_covariates condition or always include covariates.csv in output_file_names.

Copilot uses AI. Check for mistakes.
Comment thread Cargo.toml Outdated
Comment on lines +35 to +38
cobyla = "0.8.0"
rand_distr = "0.5.1"
rand_chacha = "0.9.0"
statrs = "0.18"
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New dependencies (cobyla, rand_distr, rand_chacha, statrs) appear unused in the current tree (no references in src/, tests/, examples/, or benches/). Unused deps increase compile time and supply-chain surface area. If they’re intended for a follow-up PR, consider deferring them until they’re needed or gating behind a feature.

Suggested change
cobyla = "0.8.0"
rand_distr = "0.5.1"
rand_chacha = "0.9.0"
statrs = "0.18"

Copilot uses AI. Check for mistakes.
Comment thread Cargo.toml Outdated
Comment on lines +48 to +50
[profile.test]
inherits = "release"

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[profile.test] inheriting from release changes test behavior (e.g., disables debug assertions/overflow checks and can make failures harder to diagnose). If the goal is faster tests/bench-like runs, consider keeping the default debug test profile and introducing a separate CI profile/feature for performance runs, or document why release-mode tests are required here.

Suggested change
[profile.test]
inherits = "release"

Copilot uses AI. Check for mistakes.
Comment thread src/bestdose/cost.rs
Comment on lines 129 to 133
/// - Model simulation fails
/// - Prediction length doesn't match observation count
/// - AUC calculation fails (for AUC targets)
pub(crate) fn calculate_cost(problem: &BestDoseProblem, candidate_doses: &[f64]) -> Result<f64> {
pub fn calculate_cost(problem: &BestDoseProblem, candidate_doses: &[f64]) -> Result<f64> {
// Validate candidate_doses length matches expected optimizable dose count
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calculate_cost was changed from pub(crate) to pub while bestdose::cost is a public module. This effectively promotes an internal helper to part of the crate’s public API surface. If this isn’t meant to be supported long-term, keep it crate-private (or mark it clearly as internal) to avoid committing to API stability for a low-level routine.

Copilot uses AI. Check for mistakes.
@Siel
Copy link
Copy Markdown
Member Author

Siel commented Apr 28, 2026

@copilot resolve the merge conflicts in this pull request

Copy link
Copy Markdown

Copilot AI commented Apr 28, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in a467cc3. I merged origin/main into this branch, kept the unified-platform API/example structure from this PR, and validated with cargo check --tests --examples. Code review passed; CodeQL timed out during validation.

Copilot finished work on behalf of Siel April 28, 2026 14:51
@Siel Siel closed this Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants