docs: add agentic CI plan for automated PR reviews and daily maintenance #473
andreatgretel wants to merge 7 commits into main
Conversation
eric-tramel
left a comment
Excellent initiative, let's do it 🚀
| Constraints:
| - Only runs on non-draft PRs
| - Skips if the PR only touches docs/markdown (configurable per recipe)
We should probably have agent workflow reviews in this case, too (e.g. keeping docs in sync and making sure the edits are faithful to the codebase).
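To make the trade-off concrete, here is a minimal sketch of the gating described in the quoted constraints, with the docs-only skip exposed as a per-recipe switch. Everything named here (`DOC_SUFFIXES`, `is_docs_only`, `should_review`, the `skip_docs_only` flag) is hypothetical and not taken from the plan, and the plan does not say which paths count as docs/markdown, so that rule is an assumption too.

```python
from pathlib import PurePosixPath

# Assumption: which files count as "docs/markdown" is not specified in the plan.
DOC_SUFFIXES = {".md", ".mdx", ".rst"}


def is_docs_only(changed_files: list[str]) -> bool:
    """Return True when every changed path has a docs suffix or lives under docs/."""
    if not changed_files:
        return False
    for path in changed_files:
        p = PurePosixPath(path)
        under_docs = bool(p.parts) and p.parts[0] == "docs"
        if p.suffix.lower() in DOC_SUFFIXES or under_docs:
            continue
        return False
    return True


def should_review(is_draft: bool, changed_files: list[str], skip_docs_only: bool = True) -> bool:
    """Decide whether the review recipe runs for a PR (hypothetical helper)."""
    if is_draft:
        # Constraint from the plan: only non-draft PRs are reviewed.
        return False
    if skip_docs_only and is_docs_only(changed_files):
        # Constraint from the plan: docs-only PRs are skipped, configurable per recipe.
        return False
    return True
```

With a switch like this, a docs-focused review recipe could pass `skip_docs_only=False` and still run on markdown-only PRs, which is the behavior suggested in the comment above.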
| Running all suites every day is technically possible (stagger them hourly, e.g.,
| 06:00-10:00 UTC) and would surface issues faster. But it creates up to five
| PR/issue streams per day, which risks becoming noise the team tunes out - the
| opposite of the goal. One suite per day keeps the output digestible and gives
| each finding proper attention.
Once we have trust in the system, especially once we cross the Rubicon on automated merging of simple edits (e.g. fixing a doc), increasing the rate of review will be valuable.
| #### Wednesday / structure
|
| Enforces the multi-package layering that makes DataDesigner work.
Could also include the import / CLI boot-up time checks that @nabinchha added previously.
We have a test for it already, so it should be covered...?
| | Wed | structure | import boundaries, circular deps, dead exports |
| | Thu | code-quality | complexity, exception hygiene, type gaps, TODO aging |
| | Fri | test-health | coverage deltas, hollow tests, fixtures, smoke tests |
| | Sat/Sun | off | - |
We should reserve Sat/Sun for longer performance benchmarking and AI-QA tests. This isn't fully built out yet, but it will be the natural thing to add to the automation once it is (e.g. measuring mocked execution times, memory overhead, hotspot detection).
Additionally, an AI-QA test would be: let the agent go through and try to construct SDG workflows, execute them, and record friction and problems.
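For reference, the rotation in the quoted table could be sketched as a weekday lookup. This is only an illustration under stated assumptions: just the Wed-Fri rows visible in this thread are filled in, the Mon/Tue rows are not shown here so they are left out rather than guessed, and `SUITE_BY_WEEKDAY` and `suite_for` are hypothetical names.

```python
import datetime
from typing import Optional

# datetime.date.weekday() numbers Monday as 0, so Wednesday is 2.
# Only the rows quoted in this thread are listed; Mon/Tue are intentionally absent.
SUITE_BY_WEEKDAY = {
    2: "structure",     # Wed
    3: "code-quality",  # Thu
    4: "test-health",   # Fri
}


def suite_for(day: datetime.date) -> Optional[str]:
    """Return the suite scheduled for `day`, or None when nothing is mapped."""
    return SUITE_BY_WEEKDAY.get(day.weekday())
```

Reserving the weekend for benchmarking or AI-QA runs would then amount to adding entries for keys 5 and 6 once those suites exist.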
Nice work on this one, @andreatgretel. This is a thorough and well-structured plan. Here are my thoughts.

Summary

This PR adds a comprehensive plan for introducing agentic CI to DataDesigner: GitHub Actions workflows that run Claude Code or Codex on a self-hosted runner to perform automated PR reviews and rotating daily maintenance audits. The plan covers architecture (recipe format, directory layout), security (prompt injection, minimal permissions), phased rollout, and runner memory for cross-run dedup. The implementation matches the stated intent in the PR description and closes #472.

Findings

Warnings - Worth addressing

Suggestions - Take it or leave it

What Looks Good

Verdict

Needs changes: The memory storage approach and the docs-skip behavior are worth resolving before merge. None of these require major restructuring; they're refinements to an already solid plan.
| | Wed | structure | import boundaries, circular deps, dead exports |
| | Thu | code-quality | complexity, exception hygiene, type gaps, TODO aging |
| | Fri | test-health | coverage deltas, hollow tests, fixtures, smoke tests |
| | Sat/Sun | off | - |
Weekends seem like a good time for agents to be busy. Something to think about as we evolve this.
| Keeps the dependency graph healthy and secure.
|
| - **Version pinning audit**: compare pinned versions in all three `pyproject.toml`
| files against latest available. Prefer strict pins (`==`) over loose (`>=`) for
Strict pins are tricky, since we also want to balance them against UX/DX, though the recent litellm issue should give us pause.
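As a rough illustration of the first half of the audit quoted above, the sketch below classifies a dependency string as strictly pinned, loosely constrained, or unpinned. It is a simplified assumption-laden example, not the plan's tooling: `classify_pin` is a hypothetical helper, it uses a small regex rather than a full PEP 508 parser, and it does not look up latest available versions.

```python
import re

# Simplified pattern: package name, optional [extras], then the version constraint.
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*(.*)$")


def classify_pin(requirement: str) -> tuple[str, str]:
    """Return (package_name, kind) where kind is 'strict', 'loose', or 'unpinned'."""
    # Drop any environment marker after ';' before inspecting the constraint.
    spec = requirement.split(";", 1)[0]
    match = _REQUIREMENT.match(spec)
    if match is None:
        raise ValueError(f"unrecognized requirement: {requirement!r}")
    name, constraint = match.group(1), match.group(2).strip()
    if not constraint:
        return name, "unpinned"
    # A single exact '==' clause without wildcards counts as a strict pin.
    if constraint.startswith("==") and "*" not in constraint and "," not in constraint:
        return name, "strict"
    return name, "loose"
```

A report built on this could list the loose and unpinned entries per `pyproject.toml` and leave the strict-versus-loose decision for each package to a human, which keeps the UX/DX trade-off in the team's hands.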
| ---
|
| ### Suite Details
The proposed suites cover code and docs, which is great. Perhaps we should also create a suite for repo maintenance like analyzing open issues and PRs and creating a report for us to review each week. Eventually, we'll add issue solving to the list too.
📋 Summary
Plan for adding an agentic CI layer to the repo: GitHub Actions workflows that run Claude Code or Codex on a self-hosted runner to review PRs and run daily tech debt maintenance. Closes #472.
🔄 Changes
✨ Added
plans/472/agentic-ci-plan.md - full plan covering:
.agents/recipes/)
review-code skill
🔍 Attention Areas
plans/472/agentic-ci-plan.md - This is a plan-only PR (no implementation). Review for feasibility, security concerns, and whether the suite coverage and phasing make sense for the team.
🤖 Generated with AI