diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 1c2721e..29d69d3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -74,7 +74,26 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit ## Testing -There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are: +This repo ships a tiered test suite under `tests/`. Run it with: + +```bash +./tests/run-tests.sh # all fast tests (tier-2 invariants + tier-1 skill tests) +./tests/run-tests.sh --verbose # show per-assertion output +./tests/run-tests.sh --test test-install-workflow.sh # single test file +``` + +See [tests/README.md](tests/README.md) for the full tier breakdown (tier-2: grep invariants <1s; tier-1: Claude-invoked skill assertions ~4-5min; tier-3: manual E2E). + +**Before opening a PR**, at minimum run the tier-2 invariants and the tier-1 test for the skill you changed: + +| Changed area | Test to run | +|---|---| +| `skills/discover-workflows/` | `./tests/run-tests.sh --test test-discover-workflows.sh` | +| `skills/install-workflow/` or `skills/install-workflow/auth.md` | `./tests/run-tests.sh --test test-install-workflow.sh` | +| `skills/install-agent-team/` | `./tests/run-tests.sh --test test-install-agent-team.sh` | +| Any skill | `./tests/run-tests.sh --test test-invariants.sh` (always, <1s, no Claude cost) | + +Additional manual steps: 1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors. 2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow. diff --git a/README.md b/README.md index e18e51a..2bd5179 100644 --- a/README.md +++ b/README.md @@ -123,6 +123,18 @@ catalog/ `.lock.yml` files are marked `linguist-generated` and `merge=ours` in `.gitattributes` to prevent spurious merge conflicts. +## Publishing + +> Maintainers only — requires write access to the plugin registries. + +After merging a release PR: + +1. **Bump the version** in `.claude-plugin/plugin.json` and `.claude-plugin/marketplace.json` (both must match). +2. **Create a GitHub Release** tagged `v` with a changelog summary. +3. **Submit to claude-plugins.dev** and **ClaudePluginHub** — update the listing URL if the marketplace JSON path changed. Both registries poll the `marketplace.json` URL for metadata. + +The marketplace listing points directly at the raw `marketplace.json` on `main`, so consumers always get the latest plugin.json on install — no registry re-submission needed for non-breaking changes that don't alter the marketplace schema. + ## Credits Built on two open-source projects from the [GitHub Next](https://githubnext.com) team: diff --git a/catalog/agent-team/README.md b/catalog/agent-team/README.md index 5490837..40c138b 100644 --- a/catalog/agent-team/README.md +++ b/catalog/agent-team/README.md @@ -63,12 +63,14 @@ Sections: `spec`, `plan`, `review`. Each carries the `iteration` at the time it ## Files -| File | Trigger | Dispatches next | -|---|---|---| -| `spec-agent.md` | `issues.labeled` with `agent-team` (fresh issue) | `planner-agent` (issue_number, iteration=1) | -| `planner-agent.md` | `workflow_dispatch` (issue_number, iteration) | `implementer-agent` (issue_number, iteration) | -| `implementer-agent.md` | `workflow_dispatch` (issue_number, iteration, pr_number?) | `reviewer-agent` (issue_number, pr_number, iteration) | -| `reviewer-agent.md` | `workflow_dispatch` (pr_number, issue_number, iteration) | `implementer-agent` on kickback (iteration+1), else nothing | +| File | Trigger | Required inputs | Optional inputs | Dispatches next | +|---|---|---|---|---| +| `spec-agent.md` | `issues.labeled` with `agent-team` (fresh issue) | — | — | `planner-agent` (issue_number, iteration=1) | +| `planner-agent.md` | `workflow_dispatch` | issue_number | iteration (default "1") | `implementer-agent` (issue_number, iteration) | +| `implementer-agent.md` | `workflow_dispatch` | issue_number | iteration (default "1"), pr_number (default "", set by reviewer on kickback) | `reviewer-agent` (issue_number, pr_number, iteration) | +| `reviewer-agent.md` | `workflow_dispatch` | pr_number, issue_number | iteration (default "1") | `implementer-agent` on kickback (iteration+1), else nothing | + +**Input validation**: every workflow fails loudly if required inputs are missing or unresolved (i.e. still contain literal `${{ github.event.inputs.* }}` template syntax, which happens when GitHub doesn't propagate `workflow_dispatch` inputs correctly). The implementer treats a blank or unresolved `pr_number` as "first implementation attempt" — this is expected on the first dispatch from the planner. ## Install @@ -104,14 +106,17 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl 1. Open an issue describing what you want built. 2. Add the single label `agent-team`. 3. Watch the thread. Each role posts its contribution as a comment; the implementer opens a draft PR that closes the issue when merged. -4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. Manual dispatches must pass the required `workflow_dispatch` inputs, and the downstream workflow markdown must read them via `${{ github.event.inputs.* }}`. +4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. When manually dispatching, pass the required inputs for that role: + - planner: `--field issue_number=` + - implementer: `--field issue_number=` (omit `pr_number` for a fresh branch, or `--field pr_number=

` to push to an existing PR) + - reviewer: `--field pr_number=

--field issue_number=` 5. **Retrying a blocked task**: clear `state:blocked`, then re-add `agent-team`. Spec-agent treats it as a fresh dispatch (because the state:* labels are gone and the spec markers are already satisfied — actually: to redo from scratch, also delete the prior spec comment). ## Limits and gotchas - **Concurrency**: each workflow uses `concurrency: group: agent-team-issue-${issue_number}` so only one role runs at a time per issue. - **Max iterations**: default 3 (reviewer kickback → implementer). The counter lives on the `iteration` input passed through the dispatch chain, bumped exclusively by the reviewer on kickback. -- **Input propagation**: planner / implementer / reviewer must fail loudly if required `workflow_dispatch` inputs are missing. Do not rely on label search or recent-activity inference as a fallback. +- **Input propagation**: planner / implementer / reviewer must fail loudly if required `workflow_dispatch` inputs are missing or still appear as unresolved template literals. `pr_number` is the one intentionally optional input — the implementer treats blank or unresolved as "first impl attempt" (creates a new PR branch). Do not rely on label search or recent-activity inference as a fallback for any input. - **Non-UI only**: no screenshot capture. Reviewer validates via tests/CI status + reading the diff. - **Cost**: a single task can easily spend 4× the tokens of a monolithic workflow. Set `timeout-minutes` conservatively and monitor the first few runs. - **No auto-merge**: the reviewer approves but never merges. Humans merge.