Skip to content

docs: correct stale R-comparison-tests TODO row (R not in CI)#535

Open
igerber wants to merge 1 commit into
mainfrom
perf/rscript-consolidation
Open

docs: correct stale R-comparison-tests TODO row (R not in CI)#535
igerber wants to merge 1 commit into
mainfrom
perf/rscript-consolidation

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Jun 7, 2026

Summary

  • Correct a stale TODO row (Testing/Docs, Fix TWFE within-transformation bug and add methodology review #139): "R comparison tests spawn separate Rscript per test (slow CI)". The premise is false — no CI workflow installs R (no setup-r / r-lib/actions / fixest / r-base anywhere in .github/workflows/), so every R-parity test skips in CI behind a per-file availability gate (fixest_available, _check_r_contdid(), require_r/r_available). Consolidating Rscript spawns therefore yields zero CI speedup.
  • The originally-cited file (tests/test_methodology_twfe.py) already session-caches its R fits via @pytest.fixture(scope="session") (r_twfe_results / r_twfe_results_with_covariate).
  • Rewrite the row to document the corrected scope: the only residual is a local-dev-only micro-optimization (continuous_did.py / callaway.py re-spawn library(...) per call with no session cache). Retained as a low-value local-dev note rather than silently dropped, so it isn't re-raised as a CI-speed item.
  • Drop the already-shipped "doc-deps integrity CI" (PR chore: add CI integrity check for docs/doc-deps.yaml #519) from the dangling Tier-D parenthetical — its dedicated row was removed when chore: add CI integrity check for docs/doc-deps.yaml #519 merged but the mention was left behind.

Methodology references

  • N/A — documentation-only change (TODO.md). No estimator, identification, weighting, or variance/SE behavior is touched.

Validation

  • No test changes. Claims were verified against the repo before writing: grep confirms no R in .github/workflows/; the twfe session-scoped fixtures, the continuous_did / callaway call sites, and the removed chore: add CI integrity check for docs/doc-deps.yaml #519 doc-deps row/test (tests/test_doc_deps_integrity.py exists) were each confirmed.
  • Local AI review (codex, gpt-5.5, full standard, --force-fresh): ✅ clean — no P0/P1/P2; one P3-informational note confirming the reframing is correct.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 7, 2026

✅ Looks good

Executive Summary

  • Documentation-only PR touching TODO.md:L177 and TODO.md:L224; no estimator, weighting, identification, variance/SE, or default behavior changes.
  • Methodology registry/source-paper cross-check is not triggered by the diff surface.
  • The “no CI impact” reframing is supported by workflow searches: no R setup/install workflow was found.
  • TWFE R parity results are already session-cached as stated.
  • One P3 informational documentation-completeness note: the rewritten local-dev caching row omits another Callaway R helper that also respawns Rscript.

Methodology

No findings.

Severity: P3
Impact: No methodology behavior is changed. The diff only edits deferred-work prose in TODO.md:L177 and backlog wording in TODO.md:L224, so there is no estimator/paper mismatch to evaluate against docs/methodology/REGISTRY.md.
Concrete fix: None required.

Code Quality

No findings.

Severity: P3
Impact: No production code changed.
Concrete fix: None required.

Performance

Finding: Callaway local-dev R caching note is slightly incomplete.

Severity: P3
Impact: TODO.md:L177 correctly reframes the item as local-dev-only, but it names _run_r_estimation in tests/test_methodology_callaway.py:L337-L409 and its three callers while omitting _get_r_mpdta_and_results, which also runs Rscript at tests/test_methodology_callaway.py:L1595-L1688 and is called by multiple MPDTA R-parity tests. This does not affect CI or correctness, but a future cleanup could miss part of the same local-dev performance debt.
Concrete fix: Extend the TODO row to also mention _get_r_mpdta_and_results / MPDTA strict benchmark tests as another local-dev session-cache candidate.

Maintainability

No blocking findings.

Severity: P3
Impact: The rewritten row is long, but it is acceptable for a debt ledger and materially more precise than the stale CI-speed claim.
Concrete fix: Optional: split the row into shorter clauses or mention the helper names more compactly.

Tech Debt

No blocking findings.

Severity: P3
Impact: The PR keeps the item tracked under “Tech Debt from Code Reviews,” and removes the stale Tier-D doc-deps integrity CI mention at TODO.md:L224. This is compatible with the project’s deferred-work policy.
Concrete fix: None required.

Security

No findings.

Severity: P3
Impact: No executable workflow or code paths changed; no secrets/PII introduced in the diff.
Concrete fix: None required.

Documentation/Tests

No blocking findings.

Severity: P3
Impact: No tests are needed for this documentation-only correction. Repository inspection supports the main claims: TWFE has session-scoped R fixtures at tests/test_methodology_twfe.py:L419-L434, R availability gates exist in tests/test_methodology_twfe.py:L37-L68, tests/test_methodology_continuous_did.py:L243-L260, and tests/conftest.py:L30-L90, and tests/test_doc_deps_integrity.py exists.
Concrete fix: None required beyond the optional Callaway helper wording above.

…c-deps mention

The "R comparison tests spawn separate Rscript per test (slow CI)" row's premise
is false: no CI workflow installs R (no setup-r / r-lib/actions / fixest / r-base
in .github/workflows/), so every R-parity test skips in CI behind a per-file
availability gate. The originally-cited test_methodology_twfe.py already
session-caches its R fits via scope="session" fixtures. Rewrite the row to
document the correction (the only residual is a local-dev-only micro-optimization
in continuous_did / callaway, which re-spawn library() per call with no session
cache) and drop the already-shipped "doc-deps integrity CI" (#519) from the
dangling Tier D parenthetical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@igerber igerber force-pushed the perf/rscript-consolidation branch from bbb7cf5 to 6290ca1 Compare June 7, 2026 18:06
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 7, 2026

🔁 AI review rerun (requested by @igerber)

Head SHA: 6290ca1c6de4568d63c0b590f11dd42cd66735b2


✅ Looks Good

Executive Summary

  • Re-review scope: the prior P3 note is addressed; TODO.md:L177 now includes _get_r_mpdta_and_results.
  • This PR only changes TODO.md; no estimator, weighting, variance/SE, inference, assumptions, or defaults changed.
  • Methodology cross-check against docs/methodology/REGISTRY.md is not triggered by the diff surface.
  • The corrected “no CI impact” framing is supported by workflow grep and R availability gates.
  • The stale doc-deps integrity CI Tier-D mention is removed at TODO.md:L224.

Methodology

Finding: None.

Severity: P3
Impact: No methodology behavior changed. The diff edits only deferred-work wording in TODO.md:L177 and TODO.md:L224.
Concrete fix: None required.

Code Quality

Finding: None.

Severity: P3
Impact: No production or test code changed.
Concrete fix: None required.

Performance

Finding: Previous P3 resolved.

Severity: P3
Impact: The row now correctly scopes the residual to local-dev-only R parity caching and names both Callaway helpers: _run_r_estimation at tests/test_methodology_callaway.py:L337 and _get_r_mpdta_and_results at tests/test_methodology_callaway.py:L1595.
Concrete fix: None required.

Maintainability

Finding: None blocking.

Severity: P3
Impact: The TODO row is long, but it is precise and removes a misleading CI-speed premise.
Concrete fix: Optional only: split the row for readability if the table becomes hard to maintain.

Tech Debt

Finding: None.

Severity: P3
Impact: The PR keeps the residual work tracked under “Tech Debt from Code Reviews” and corrects the Tier-D rollup wording at TODO.md:L224.
Concrete fix: None required.

Security

Finding: None.

Severity: P3
Impact: Documentation-only change; no executable workflow/code paths or secrets introduced.
Concrete fix: None required.

Documentation/Tests

Finding: No blocking documentation or test issues.

Severity: P3
Impact: Repository inspection supports the updated wording: no R install/setup pattern appears in .github/workflows/; TWFE already uses session-scoped fixtures at tests/test_methodology_twfe.py:L419-L434; R gates exist at tests/test_methodology_twfe.py:L37-L68, tests/test_methodology_continuous_did.py:L243-L260, and tests/conftest.py:L30-L92. The current GitHub Ubuntu 24.04 runner image inventory also lists common runtimes but not R, supporting the “no CI impact” claim. (github.com)
Concrete fix: None required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant