Commit a71cc4d

miranov25 committed
docs(restartContext): record diagnostics integration and real-data validation
- Added suffix-aware summarize_diagnostics + benchmark report integration
- Confirmed robust re-fit loop in real datasets
- Prepared next-phase plan for real-use-case profiling and fast-path study
1 parent aa024b0 commit a71cc4d

2 files changed: +78 -7 lines changed

UTILS/dfextensions/groupby_regression.md

Lines changed: 23 additions & 0 deletions
@@ -190,6 +190,29 @@ Even at **30% response outliers**, runtime remains essentially unchanged (no rob
 To emulate worst-case slowdowns seen on real data, a **leverage-outlier** mode (X-contamination) will be added in a follow-up.
 
 
+### Diagnostic Summary Utilities
+
+The regression framework can optionally emit per-group diagnostics when `diag=True`
+is passed to `make_parallel_fit()`.
+
+Diagnostics include:
+
+| Field | Meaning |
+|:------|:--------|
+| `diag_time_ms` | Wall-time spent per group (ms) |
+| `diag_n_refits` | Number of extra robust re-fits required |
+| `diag_frac_rejected` | Fraction of rejected points after sigma-cut |
+| `diag_cond_xtx` | Condition number proxy for design matrix |
+| `diag_hat_max` | Maximum leverage in predictors |
+| `diag_n_rows` | Number of rows in the group |
+
+Summaries can be generated directly:
+
+```python
+summary = GroupByRegressor.summarize_diagnostics(dfGB, diag_prefix="diag_", suffix="_fit")
+print(GroupByRegressor.format_diagnostics_summary(summary))
+```
+
 ### Interpretation
 
 * The **OLS path** scales linearly with group count.
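
The per-field aggregation this commit describes (mean, median, standard deviation and upper quantiles for each diagnostic metric) can be sketched roughly as follows. This is a standard-library stand-in so that it runs without pandas, not the `GroupByRegressor.summarize_diagnostics` implementation. The helper names `percentile` and `summarize_rows`, the `<prefix><metric><suffix>` column layout, the choice of p50/p90/p99, and the sample rows are all assumptions made for illustration.

```python
import statistics

# Metric names as listed in the commit; the column layout below is an assumption.
METRICS = ("time_ms", "n_refits", "frac_rejected", "cond_xtx", "hat_max", "n_rows")


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("percentile() needs at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def summarize_rows(rows, diag_prefix="diag_", suffix=""):
    """Summarize columns named <diag_prefix><metric><suffix> (hypothetical layout)."""
    summary = {}
    for metric in METRICS:
        col = f"{diag_prefix}{metric}{suffix}"
        vals = sorted(r[col] for r in rows if r.get(col) is not None)
        if not vals:
            continue  # column not present for this fit: skip instead of failing
        summary[metric] = {
            "mean": statistics.fmean(vals),
            "median": statistics.median(vals),
            "std": statistics.stdev(vals) if len(vals) > 1 else 0.0,
            "p50": percentile(vals, 50),
            "p90": percentile(vals, 90),
            "p99": percentile(vals, 99),
        }
    return summary


# Four made-up groups, one row of diagnostics per group.
rows = [
    {"diag_time_ms_fit": 6.0, "diag_n_refits_fit": 3},
    {"diag_time_ms_fit": 7.0, "diag_n_refits_fit": 3},
    {"diag_time_ms_fit": 8.0, "diag_n_refits_fit": 3},
    {"diag_time_ms_fit": 12.0, "diag_n_refits_fit": 1},
]
summary = summarize_rows(rows, diag_prefix="diag_", suffix="_fit")
for metric, stats in summary.items():
    print(metric, {k: round(v, 3) for k, v in stats.items()})
```

Reporting the tail quantiles next to the mean is what makes a small number of slow or heavily rejected groups visible, since those barely move the mean or median.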

UTILS/dfextensions/restartContext_groupby_regression.md

Lines changed: 55 additions & 7 deletions
@@ -119,15 +119,63 @@ python3 bench_groupby_regression.py \
 ---
 
 **Last updated:** Oct 22, 2025 (this revision)
+# Restart Context: GroupBy Regression Benchmarking & Diagnostics Integration
+
+**Date:** October 23 2025
+**Project:** dfextensions (ALICE O2 Physics)
+**Focus:** `groupby_regression.py` — diagnostic instrumentation and benchmark integration
+**Next Phase:** Real-data performance characterization
 
 ---
 
-### Commit message
+## Summary of Latest Changes
+
+* **Diagnostics added to core class**
+- `GroupByRegressor.summarize_diagnostics()` and `format_diagnostics_summary()` now compute mean/median/std + quantiles (p50–p99) for all key diagnostic metrics (`time_ms`, `n_refits`, `frac_rejected`, `cond_xtx`, `hat_max`, `n_rows`).
+- Handles both prefixed (`diag_…`) and suffixed (`…_fit`, `…_dIDC`) columns.
+
+* **Benchmark integration**
+- `bench_groupby_regression.py` now:
+- Calls class-level summary after each scenario.
+- Writes per-scenario `diag_summary.csv` and appends human-readable summaries to `benchmark_report.txt`.
+- Saves `diag_top10_time__<scenario>.csv` and `diag_top10_refits__<scenario>.csv` for quick inspection.
+- Default benchmark: `--rows-per-group 5 --groups 1000 --diag`.
+
+* **Validation**
+- Real-data summary confirmed correct suffix handling (`_dIDC`).
+- Pytest and all synthetic benchmarks pass.
+
+---
+
+## Observations from Real Data
+
+* Median per-group fit time ≈ 7 ms (p99 ≈ 12 ms).
+* ~99 % of groups perform 3 robust re-fits → robust loop fully active.
+* Only ~2 % mean rejection fraction, but 99th percentile ≈ 0.4 → a few heavy-outlier bins drive cost.
+* Conditioning (cond_xtx ≈ 1) and leverage (hat_max ≈ 0.18) are stable → slowdown dominated by the sigmaCut iteration.
+
+---
+
+## Next Steps (Real-Use-Case Phase)
+
+1. **Collect diagnostic distributions on full calibration samples**
+- Export `diag_full__*` and `diag_top10_*` CSVs.
+- Aggregate with `summarize_diagnostics()` to study tails and correlations.
+
+2. **Benchmark subsets vs. full parallel runs**
+- Quantify the gain observed when splitting into smaller chunks (cache + spawn effects).
+
+3. **Add leverage-outlier generator** to reproduce re-fit behaviour in synthetic benchmarks.
+
+4. **Consider optimization paths**
+- Cap `max_refits` / early-stop criterion.
+- Introduce `make_parallel_fitFast` minimal version for groups O(10).
+
+5. **Documentation**
+- Update `groupby_regression.md` “Performance & Benchmarking” section with diagnostic summary example and reference to top-violator CSVs.
+
+---
+
+**Last updated:** Oct 23 2025
 
-```
-docs(restartContext): update with 5k/5 default, 30% outliers, and leverage-outlier plan
 
-- Record new cross-platform results (Mac vs Linux) and observation that response-only outliers do not slow runtime
-- Add action plan: leverage-outlier generator + refit counters + multi-target cost check
-- Keep PR target; align benchmarks and docs with 5k/5 default
-```
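
The "Cap `max_refits` / early-stop criterion" item in the Next Steps list can be illustrated with a small sigma-cut loop. The sketch below is not the code in `groupby_regression.py`. It assumes a single-predictor straight-line fit, a MAD-based scatter estimate, and the hypothetical names `ols_line`, `sigma_cut_fit` and `min_sigma`, purely to show how a refit counter, a rejected fraction and two stopping rules fit together.

```python
import statistics


def ols_line(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def sigma_cut_fit(xs, ys, sigma_cut=3.0, max_refits=3, min_sigma=1e-9):
    """Iteratively reject residual outliers and refit, at most max_refits times.

    Stops early when the robust scatter is negligible, when no point is
    rejected, or when fewer than three points would survive.
    """
    keep = list(range(len(xs)))
    intercept, slope = ols_line(xs, ys)
    n_refits = 0
    while n_refits < max_refits:
        res = [ys[i] - (intercept + slope * xs[i]) for i in keep]
        med = statistics.median(res)
        # 1.4826 * MAD estimates sigma for Gaussian noise (assumed estimator).
        sigma = 1.4826 * statistics.median([abs(r - med) for r in res])
        if sigma < min_sigma:
            break  # residuals already at numerical-noise level
        survivors = [i for i, r in zip(keep, res) if abs(r - med) <= sigma_cut * sigma]
        if len(survivors) == len(keep) or len(survivors) < 3:
            break  # nothing rejected, or too few points left to refit
        keep = survivors
        intercept, slope = ols_line([xs[i] for i in keep], [ys[i] for i in keep])
        n_refits += 1
    return {
        "intercept": intercept,
        "slope": slope,
        "n_refits": n_refits,
        "frac_rejected": 1.0 - len(keep) / len(xs),
    }


# Exact line y = 2 + 0.5 x with a single response outlier at x = 5.
xs = [float(i) for i in range(10)]
ys = [2.0 + 0.5 * x for x in xs]
ys[5] += 10.0

robust = sigma_cut_fit(xs, ys)                # early stop after one refit
capped = sigma_cut_fit(xs, ys, max_refits=0)  # cap at zero: plain OLS only
print(robust)
print(capped)
```

With the cap at zero the outlier stays in the fit and biases the slope, while the default settings remove it in one refit and then stop. On real data the trade-off between a lower cap and fit quality would have to be measured with the diagnostics above.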
