Open
21 commits
e323423
Add autosolve actions for automated issue resolution
fantapop Mar 27, 2026
78d1dcc
Address PR review feedback (batch 1)
fantapop Apr 3, 2026
b02a463
Replace credential helper with GIT_ASKPASS for fork authentication
fantapop Apr 3, 2026
11b3ba2
Fix marker extraction to use last occurrence in output
fantapop Apr 3, 2026
3fe1297
Fail closed on symlink resolution errors in security check
fantapop Apr 3, 2026
a5502d2
Reset staged changes on all security review failures
fantapop Apr 3, 2026
31b7e13
Add unit tests for pure helpers and require PR body file
fantapop Apr 3, 2026
599c7a2
Validate boolean inputs with case-insensitive parsing
fantapop Apr 3, 2026
c91aab2
Simplify Claude Runner interface and remove dead code
fantapop Apr 3, 2026
45e2c78
Propagate errors instead of swallowing them
fantapop Apr 3, 2026
226b573
Mitigate prompt injection in AI security review
fantapop Apr 6, 2026
6406103
Remove additional_instructions input
fantapop Apr 6, 2026
96bc22e
Harden prompt injection defenses with context_vars and env filtering
fantapop Apr 7, 2026
01b51e1
Always block .github/ in blocked paths
fantapop Apr 7, 2026
0fa1d12
Fix action.yml issues found during integration testing
fantapop Apr 7, 2026
09ccb21
Prevent sensitive data in Claude output logs
fantapop Apr 8, 2026
7d542de
autosolve: Pretty-print JSON in collapsible log output
fantapop Apr 8, 2026
34ec73c
autosolve: Allow assess to read context_vars via printenv
fantapop Apr 8, 2026
4bc5058
autosolve: Default pr_base_branch to main and remove SymbolicRef
fantapop Apr 9, 2026
de8989b
autosolve: Disable Go module caching in setup-go
fantapop Apr 9, 2026
555a3c5
autosolve: Exit non-zero when implementation fails
fantapop Apr 9, 2026
8 changes: 7 additions & 1 deletion .github/workflows/test.yml
@@ -8,4 +8,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v6
-      - run: ./test.sh
+      - uses: actions/setup-go@v6
+        with:
+          go-version-file: autosolve/go.mod
+      - name: Run shell tests
+        run: ./test.sh
+      - name: Run Go tests
+        run: cd autosolve && go test ./... -count=1
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -36,6 +36,11 @@ Breaking changes are prefixed with "Breaking Change: ".
 - `autotag-from-changelog` now exposes `tag_created` and `tag` outputs so
   callers can react to whether a new tag was pushed.
 - `expect_step_output` test helper for asserting GitHub Actions step outputs.
+- `autosolve/assess` action: evaluate tasks for automated resolution suitability
+  using Claude in read-only mode.
+- `autosolve/implement` action: autonomously implement solutions, validate
+  security, push to fork, and create PRs using Claude. Includes AI security
+  review, token usage tracking, and per-file batched diff analysis.
Comment on lines +39 to +43
These should be moved to the top/unreleased section

 - `get-workflow-ref` action: resolve the ref a caller used to invoke a reusable
   workflow by parsing the caller's workflow file — no API calls or extra
   permissions needed.
11 changes: 11 additions & 0 deletions autosolve/Makefile
@@ -0,0 +1,11 @@
.PHONY: build test clean

# Local dev binary
build:
	go build -o autosolve ./cmd/autosolve

test:
	go test ./... -count=1

clean:
	rm -f autosolve
108 changes: 108 additions & 0 deletions autosolve/assess/action.yml
@@ -0,0 +1,108 @@
name: Autosolve Assess
description: Run Claude in read-only mode to assess whether a task is suitable for automated resolution.

inputs:
  claude_cli_version:
    description: "Claude CLI version to install (e.g. '2.1.79' or 'latest')."
    required: false
    default: "2.1.79"
  system_prompt:
    description: >
      Trusted instructions for Claude describing the task to assess.
      Do not embed untrusted user input (e.g., issue titles/bodies) here.
      Pass user-supplied data via environment variables and list them in context_vars.
    required: false
    default: ""
  skill:
    description: Path to a skill/prompt file relative to the repo root.
    required: false
    default: ""
  context_vars:
    description: >
      Comma-separated list of environment variable names to pass through to Claude.
      Use this to provide untrusted user input (e.g., issue titles/bodies) safely.
      Claude is automatically told which variables are available and instructed to
      read them — you do not need to reference them in system_prompt.
      Claude will only have access to these variables plus a baseline set of
      system and authentication variables (PATH, HOME, etc.).
    required: false
    default: ""
  assessment_criteria:
    description: Custom criteria for the assessment. If not provided, uses default criteria.
    required: false
    default: ""
  model:
    description: Claude model ID.
    required: false
    default: "claude-opus-4-6"
  blocked_paths:
    description: >
      Comma-separated path prefixes that cannot be modified.
      .github/ is always blocked and cannot be removed.
    required: false
    default: ".github/workflows/"
Should this just be .github/? (Same question about implement/action.yml)
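
For context on the question above, here is a minimal Go sketch of how an "always blocked" prefix could be combined with the `blocked_paths` input. The helper names `parseBlockedPaths` and `isBlocked` are hypothetical and do not come from this PR; the actual check in `autosolve` is not shown in this diff and may differ (for example in how it resolves symlinks).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// alwaysBlocked is enforced regardless of what the caller configures
// (assumption: mirrors the ".github/ is always blocked" wording in action.yml).
const alwaysBlocked = ".github/"

// parseBlockedPaths (hypothetical helper) splits the comma-separated input and
// guarantees that ".github/" is always the first entry.
func parseBlockedPaths(input string) []string {
	prefixes := []string{alwaysBlocked}
	for _, p := range strings.Split(input, ",") {
		p = strings.TrimSpace(p)
		if p == "" || p == alwaysBlocked {
			continue
		}
		prefixes = append(prefixes, p)
	}
	return prefixes
}

// isBlocked (hypothetical helper) reports whether a repo-relative file path
// falls under any blocked prefix. The path is cleaned first so that "./x" and
// "a/../x" are compared in normalized form.
func isBlocked(file string, prefixes []string) bool {
	clean := path.Clean(file)
	for _, p := range prefixes {
		if strings.HasPrefix(clean, p) || clean == strings.TrimSuffix(p, "/") {
			return true
		}
	}
	return false
}

func main() {
	prefixes := parseBlockedPaths(".github/workflows/, secrets/")
	for _, f := range []string{".github/dependabot.yml", "./secrets/key.pem", "src/main.go"} {
		fmt.Printf("%-24s blocked=%v\n", f, isBlocked(f, prefixes))
	}
}
```

Under this sketch, a default of `.github/workflows/` is redundant because the broader `.github/` prefix already covers it.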

  verbose_logging:
    description: >
      Log full Claude output in collapsible groups in the step log.
      Logs may contain source code snippets, environment variable
Is it safe to log env variables? Would they ever contain sensitive info? (Same question about implement/action.yml)

      values, or other repository content quoted in Claude's responses.
      Security review output is never logged regardless of this setting.
    required: false
    default: "false"
  working_directory:
    description: Directory to run in (relative to workspace root). Defaults to workspace root.
    required: false
    default: "."

outputs:
  assessment:
    description: PROCEED or SKIP
    value: ${{ steps.assess.outputs.assessment }}
  summary:
    description: Human-readable assessment reasoning.
    value: ${{ steps.assess.outputs.summary }}
  result:
    description: Full Claude result text.
    value: ${{ steps.assess.outputs.result }}

runs:
  using: "composite"
  steps:
    - name: Set up Claude CLI
      shell: bash
      run: |
        if command -v roachdev >/dev/null; then
          printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude
          chmod +x /usr/local/bin/claude
          echo "Claude CLI: using roachdev wrapper"
nit: would be nice to log the version used similar to the basic claude equivalent below. (Same for implement/action.yml)

        else
          curl --fail --silent --show-error --location https://claude.ai/install.sh | bash -s -- "$CLAUDE_CLI_VERSION"
          echo "Claude CLI installed: $(claude --version)"
        fi
      env:
        CLAUDE_CLI_VERSION: ${{ inputs.claude_cli_version }}

    - name: Set up Go
      uses: actions/setup-go@v6
      with:
        go-version-file: ${{ github.action_path }}/../go.mod
        cache: false

    - name: Build autosolve
      shell: bash
      run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve
      working-directory: ${{ github.action_path }}/..
Comment on lines +85 to +94
Copilot AI Apr 9, 2026

PR description states a precompiled Go binary means no Go toolchain is needed at runtime, but this composite action always sets up Go and builds from source. Either add the same precompiled-binary fast-path used in autosolve/implement (skip Go setup/build when $RUNNER_TEMP/autosolve already exists) or adjust the documentation/description to match actual behavior.

    - name: Run assessment
      id: assess
      shell: bash
      working-directory: ${{ inputs.working_directory }}
      run: $RUNNER_TEMP/autosolve assess
      env:
        INPUT_SYSTEM_PROMPT: ${{ inputs.system_prompt }}
        INPUT_SKILL: ${{ inputs.skill }}
        INPUT_CONTEXT_VARS: ${{ inputs.context_vars }}
        INPUT_ASSESSMENT_CRITERIA: ${{ inputs.assessment_criteria }}
        INPUT_MODEL: ${{ inputs.model }}
        INPUT_BLOCKED_PATHS: ${{ inputs.blocked_paths }}
        INPUT_VERBOSE_LOGGING: ${{ inputs.verbose_logging }}
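
The `context_vars` description in this action says Claude only sees the listed variables plus a baseline set. A minimal Go sketch of that kind of allow-list filtering follows. The function name `filterEnv` and the two-entry baseline are assumptions for illustration; the real baseline ("PATH, HOME, etc.") and the real implementation are not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// baselineVars is a hypothetical baseline. The diff only says
// "PATH, HOME, etc.", so the full list is unknown.
var baselineVars = []string{"PATH", "HOME"}

// filterEnv (hypothetical helper) returns the entries of environ, given in
// "KEY=value" form, whose key is in the baseline or in the comma-separated
// contextVars list. Everything else is dropped.
func filterEnv(environ []string, contextVars string) []string {
	allowed := make(map[string]bool)
	for _, k := range baselineVars {
		allowed[k] = true
	}
	for _, k := range strings.Split(contextVars, ",") {
		if k = strings.TrimSpace(k); k != "" {
			allowed[k] = true
		}
	}
	var out []string
	for _, kv := range environ {
		key, _, ok := strings.Cut(kv, "=")
		if ok && allowed[key] {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	environ := []string{
		"PATH=/usr/bin",
		"HOME=/home/runner",
		"ISSUE_TITLE=Fix flaky test",
		"DEPLOY_KEY=not-for-claude",
	}
	// ISSUE_BODY is listed but unset, so it simply does not appear.
	for _, kv := range filterEnv(environ, "ISSUE_TITLE, ISSUE_BODY") {
		fmt.Println(kv)
	}
}
```

In a real runner, the filtered slice could be assigned to the `Env` field of an `exec.Cmd` so the child process inherits nothing else. Whether `autosolve` does exactly this is not visible here.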
98 changes: 98 additions & 0 deletions autosolve/cmd/autosolve/main.go
@@ -0,0 +1,98 @@
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"

	"github.com/cockroachdb/actions/autosolve/internal/action"
	"github.com/cockroachdb/actions/autosolve/internal/assess"
	"github.com/cockroachdb/actions/autosolve/internal/claude"
	"github.com/cockroachdb/actions/autosolve/internal/config"
	"github.com/cockroachdb/actions/autosolve/internal/git"
	"github.com/cockroachdb/actions/autosolve/internal/github"
	"github.com/cockroachdb/actions/autosolve/internal/implement"
)

const usage = `Usage: autosolve <command>

Commands:
  assess      Run assessment phase
  implement   Run implementation phase
`

func main() {
	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
	defer cancel()

	if len(os.Args) < 2 {
		fatalf(usage)
	}

	var err error
	switch os.Args[1] {
	case "assess":
		err = runAssess(ctx)
	case "implement":
		err = runImplement(ctx)
	default:
		fatalf("unknown command: %s\n\n%s", os.Args[1], usage)
	}

	if err != nil {
		action.LogError(err.Error())
		os.Exit(1)
	}
}

func fatalf(format string, args ...any) {
	fmt.Fprintf(os.Stderr, format+"\n", args...)
	os.Exit(1)
}

func runAssess(ctx context.Context) error {
	cfg, err := config.LoadAssessConfig()
	if err != nil {
		return err
	}
	if err := config.ValidateAuth(); err != nil {
		return err
	}
	tmpDir, err := ensureTmpDir()
	if err != nil {
		return err
	}
	return assess.Run(ctx, cfg, &claude.CLIRunner{}, tmpDir)
}

func runImplement(ctx context.Context) error {
	cfg, err := config.LoadImplementConfig()
	if err != nil {
		return err
	}
	if err := config.ValidateAuth(); err != nil {
		return err
	}
	tmpDir, err := ensureTmpDir()
	if err != nil {
		return err
	}

	gitClient := &git.CLIClient{}
	ghClient := &github.GithubClient{Token: cfg.PRCreateToken}
	return implement.Run(ctx, cfg, &claude.CLIRunner{}, ghClient, gitClient, tmpDir)
}

func ensureTmpDir() (string, error) {
	dir := os.Getenv("AUTOSOLVE_TMPDIR")
	if dir != "" {
		return dir, nil
	}
	dir, err := os.MkdirTemp("", "autosolve_*")
	if err != nil {
		return "", fmt.Errorf("creating temp dir: %w", err)
	}
	os.Setenv("AUTOSOLVE_TMPDIR", dir)
	return dir, nil
}
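
`main.go` delegates input handling to `config.LoadAssessConfig` and `config.LoadImplementConfig`, whose bodies are not part of this excerpt. Since the action passes `verbose_logging` as the string `INPUT_VERBOSE_LOGGING`, and commit 599c7a2 mentions case-insensitive boolean validation, the following is a hedged sketch of what such parsing might look like. `parseBool` is a hypothetical name, and treating an empty value as `false` is an assumption rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBool (hypothetical helper) accepts "true" or "false" in any letter case,
// ignoring surrounding whitespace. An empty value is treated as false, which is
// an assumption. Anything else is rejected instead of silently defaulting.
func parseBool(name, raw string) (bool, error) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "true":
		return true, nil
	case "false", "":
		return false, nil
	default:
		return false, fmt.Errorf("%s: expected true or false, got %q", name, raw)
	}
}

func main() {
	for _, raw := range []string{"true", "FALSE", " True ", "yes"} {
		v, err := parseBool("verbose_logging", raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %v\n", raw, v)
	}
}
```

Rejecting unknown values such as `yes` keeps a typo in a workflow file from silently changing whether logs are emitted.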
3 changes: 3 additions & 0 deletions autosolve/go.mod
@@ -0,0 +1,3 @@
module github.com/cockroachdb/actions/autosolve

go 1.23.8
Copilot AI Apr 9, 2026

The go directive typically uses major.minor (e.g., go 1.23) to indicate the language version. Using a patch version here (1.23.8) may be rejected by some Go toolchains and isn’t the usual way to pin a specific toolchain; if you need to pin, consider using a toolchain go1.23.8 directive instead.

Suggested change
-go 1.23.8
+go 1.23

Empty file added autosolve/go.sum
Empty file.