feat!: drop detect-secrets; ship tuned betterleaks default config #118

Merged: theagenticguy merged 1 commit into main from chore/scanners-dedup-and-tune on May 16, 2026

@theagenticguy
Owner

Summary

  • Removed detect-secrets entirely. Wrapper, converter, catalog spec, index switch case, P1 list, tests, docs, README rows, pre-release-gate workflow step, and the in-tree .secrets.baseline file.
  • Fixed the betterleaks wrapper. Two real bugs: (1) --report-path=/dev/stdout fails with ENXIO inside Node's execFile because the child's stdout is a pipe, not a character device; switched to --report-path=-. (2) git --pre-commit=false walks the entire git log; switched to dir mode, which scans working-tree state only.
  • Shipped a tuned default config at packages/scanners/config/betterleaks.default.toml. Extends the upstream 276 rules; allowlists vendored deps (node_modules, .venv, vendor, Pods), build outputs (dist, build, target, .next, coverage), lockfiles, SBOMs, binary blobs, test files. Auto-injected by the wrapper unless the user has their own betterleaks.toml / .gitleaks.toml at the project root.
  • Updated the pre-release CI gate to run betterleaks dir with the same vendored config locally and in CI.
  • ADR 0017 documents the rationale, coverage audit, and migration.
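
A minimal sketch of the argument-building logic described in the wrapper bullets above. This is not the wrapper shipped in this PR: `buildBetterleaksArgs` and `USER_CONFIG_NAMES` are hypothetical names, passing the project root as a positional argument to `dir` is an assumption, and report-format flags are left out. Only `dir`, `--report-path=-`, and `--config` come from the description.

```typescript
// Hypothetical helper for illustration; the real wrapper may be structured differently.

// Config file names a user may place at the project root (taken from the summary above).
const USER_CONFIG_NAMES = ["betterleaks.toml", ".gitleaks.toml"];

/**
 * Builds the betterleaks argument list.
 * @param projectRoot        directory to scan (assumed to be passed positionally)
 * @param vendoredConfigPath path to the shipped betterleaks.default.toml
 * @param rootFileNames      file names found directly in projectRoot
 */
function buildBetterleaksArgs(
  projectRoot: string,
  vendoredConfigPath: string,
  rootFileNames: string[],
): string[] {
  // `dir` scans working-tree state; `--report-path=-` writes the report to stdout,
  // which avoids opening /dev/stdout when fd 1 is a pipe.
  const args = ["dir", projectRoot, "--report-path=-"];

  // Inject the vendored config only when the user has not supplied their own.
  const hasUserConfig = rootFileNames.some((name) => USER_CONFIG_NAMES.includes(name));
  if (!hasUserConfig) {
    args.push(`--config=${vendoredConfigPath}`);
  }
  return args;
}
```

In a real wrapper the resulting array would be handed to `execFile("betterleaks", args, ...)` and the report read from the child's stdout; `rootFileNames` could come from a directory listing of the project root.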

Why

codehub analyze had two parallel secret scanners. detect-secrets was the long pole (5+ min Python walker, sometimes timed out at 300s) and betterleaks was effectively dead — the wrapper produced empty SARIF for everyone. Together they emitted 18,893 findings, almost all noise from generic-entropy matchers hitting pnpm-lock.yaml integrity hashes and .cdx.json SBOMs.

Coverage audit (Context7 + DeepWiki + upstream betterleaks.toml) confirmed betterleaks is a strict superset for the OCH threat model:

  • 276 default rules vs detect-secrets' ~24
  • CEL-filtered generic-api-key subsumes high-entropy + keyword detection
  • Aho-Corasick prefilter + RE2 = order-of-magnitude faster
  • Only loss: IPPublicDetector (low-value, high-FP)

Measured on OCH self-scan

| Metric | Before | After |
| --- | --- | --- |
| Wall clock (codehub analyze .) | 12:39 | 5:35 |
| Total findings | 18,893 | 45 |
| Betterleaks findings | 0 (broken, ENXIO) | 0 (clean) |
| Scanner inventory | 20 | 19 |

The remaining 45 are all signal: 26 grype CVEs, 12 vulture dead-code, 3 ruff lint, 3 radon complexity, 1 biome.
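
For reviewers who want to check the arithmetic behind these figures, here is a small sketch that derives the relative changes from the measured values. `toSeconds` is a hypothetical helper name; the inputs are the numbers reported above.

```typescript
// Parse an "m:ss" wall-clock string into seconds.
function toSeconds(clock: string): number {
  const [minutes, seconds] = clock.split(":").map(Number);
  return minutes * 60 + seconds;
}

const beforeSeconds = toSeconds("12:39"); // 759
const afterSeconds = toSeconds("5:35"); // 335

// Relative wall-clock change, rounded to a whole percent.
const wallClockChangePct = Math.round(((afterSeconds - beforeSeconds) / beforeSeconds) * 100);

// How many times fewer findings the tuned setup emits.
const findingsReduction = Math.round(18893 / 45);

// The per-scanner breakdown should add up to the reported total.
const remainingFindings = 26 + 12 + 3 + 3 + 1;

console.log(wallClockChangePct, findingsReduction, remainingFindings); // -56 420 45
```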

Test plan

  • mise run check exit 0 (lint + typecheck + banned-strings + build + test)
  • Workspace tests: 1931+ pass / 0 fail
  • Scanner package: 81 pass / 0 fail (was 80 + 4 detect-secrets-specific)
  • End-to-end: codehub analyze . runs clean in 5:35; betterleaks emits zero findings on the OCH repo
  • Reviewer: install betterleaks (brew install betterleaks or download from github.com/betterleaks/betterleaks) and run codehub analyze against your project — confirm the vendored config behaves sensibly. Drop a betterleaks.toml at the project root to override.

Migration

Users with .secrets.baseline files should delete them (no longer consumed). Project-level overrides go in betterleaks.toml at the project root; when that file is present, the wrapper skips the --config injection and lets betterleaks' native config precedence pick it up.

🤖 Generated with Claude Code

`codehub analyze` ran two parallel secret scanners. detect-secrets was
the long pole (5+ minute Python walker, sometimes timed out at 300s) and
betterleaks was effectively dead — the wrapper passed
`--report-path=/dev/stdout`, which fails inside Node's `execFile` with
ENXIO because the child's fd 1 is a pipe. Together they emitted 18,893
findings on the OCH self-scan, almost all noise from generic-entropy
matchers flagging integrity hashes in lockfiles and SBOMs.

Coverage audit (Context7 + DeepWiki + upstream `betterleaks.toml`)
confirmed betterleaks ships 276 default rules vs detect-secrets' ~24,
including a CEL-filtered `generic-api-key` catch-all that subsumes the
older tool's high-entropy + keyword detectors. Only `IPPublicDetector`
(low-value, high-FP) is uniquely detect-secrets — not worth keeping the
Python dep for.

What changed:
- Removed detect-secrets entirely: wrapper, converter, catalog spec,
  index switch case, P1 list, tests, README rows, docs ADR refs,
  pre-release-gate workflow step, in-tree `.secrets.baseline`.
- Fixed betterleaks wrapper: `--report-path=-` instead of `/dev/stdout`,
  always uses `dir` mode (working-tree state, not git history),
  auto-detects user `betterleaks.toml`/`gitleaks.toml` and only injects
  the vendored default config when none is present.
- Shipped `packages/scanners/config/betterleaks.default.toml` —
  `[extend] useDefault = true` plus `[[allowlists]]` blocks that filter
  vendored deps, build outputs, lockfiles, SBOMs, binary blobs, and
  test files via RE2 path regexes.
- Pre-release gate now runs `betterleaks dir` with the same config
  the wrapper injects locally, with `--exit-code=1`.
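
To illustrate the kind of path filtering the allowlist performs, here is a sketch in TypeScript. It does not reproduce the shipped `betterleaks.default.toml`; the patterns are assumptions built from the directory and file names mentioned in this PR, and `ALLOWLISTED_PATHS` and `isAllowlisted` are hypothetical names. The patterns stick to syntax that both JavaScript and RE2 accept (no lookarounds or backreferences).

```typescript
// Illustrative patterns only; the real config may use different or additional regexes.
const ALLOWLISTED_PATHS: RegExp[] = [
  /(^|\/)(node_modules|\.venv|vendor|Pods)\//, // vendored dependencies
  /(^|\/)(dist|build|target|\.next|coverage)\//, // build outputs
  /(^|\/)pnpm-lock\.yaml$/, // lockfile with integrity hashes
  /\.cdx\.json$/, // CycloneDX SBOMs
];

// Returns true when a repo-relative path (forward slashes) should be skipped.
function isAllowlisted(path: string): boolean {
  return ALLOWLISTED_PATHS.some((pattern) => pattern.test(path));
}

console.log(isAllowlisted("node_modules/left-pad/index.js")); // true
console.log(isAllowlisted("src/index.ts")); // false
```

Anchoring each directory name between `(^|\/)` and a trailing `/` keeps a path such as `src/builder/x.ts` from being skipped just because it contains the substring `build`.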

Measured on the OCH self-scan:
- Wall clock: 12:39 → 5:35 (-56%)
- Findings: 18,893 → 45 (~420x fewer)
- Betterleaks: 0 (broken, ENXIO) → 0 (clean, tuned config holds)

ADR 0017 records the rationale and migration. Users override the
shipped config by dropping a `betterleaks.toml` at the project root;
the wrapper picks it up via betterleaks' native config-precedence and
skips the `--config` injection.

theagenticguy merged commit d370f9e into main on May 16, 2026
36 of 40 checks passed
theagenticguy deleted the chore/scanners-dedup-and-tune branch on May 16, 2026 23:18
github-actions bot mentioned this pull request on May 16, 2026
theagenticguy added a commit that referenced this pull request May 16, 2026
…crets) (#119)

## Summary

PR #118 renamed the `detect-secrets` job to `betterleaks` in the new step
but missed the reference in the `pre-release-gate` aggregator job's
`needs:` block further down the file. The release-please PR (#116) is
`BLOCKED` because the workflow won't parse: `Pre-release gate
(aggregate)` depends on a job that no longer exists, so GitHub Actions
reports "workflow file issue" and skips every required check.

## Fix

One-line update to `.github/workflows/pre-release-gate.yml`: aggregator
`needs:` references `betterleaks` instead of `detect-secrets`. Also
fixes a stale comment near the top.
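
The failure mode can be caught before pushing with a check along these lines. This is a sketch, not part of the fix: `findDanglingNeeds` is a hypothetical helper, the `aggregate` job id is an assumption, and the input is a plain object standing in for the already-parsed `jobs:` map of a workflow file.

```typescript
// Minimal shape of a workflow's `jobs:` map; only `needs` matters here.
type Jobs = Record<string, { needs?: string | string[] }>;

// Reports every `needs:` entry that points at a job id not defined in the workflow.
function findDanglingNeeds(jobs: Jobs): string[] {
  const problems: string[] = [];
  for (const [jobId, job] of Object.entries(jobs)) {
    // `needs` may be absent, a single job id, or a list of job ids.
    const needs =
      job.needs === undefined ? [] : Array.isArray(job.needs) ? job.needs : [job.needs];
    for (const dependency of needs) {
      if (!Object.prototype.hasOwnProperty.call(jobs, dependency)) {
        problems.push(`${jobId} needs missing job "${dependency}"`);
      }
    }
  }
  return problems;
}

// State after #118: the job was renamed, the aggregator reference was not.
const broken: Jobs = {
  betterleaks: {},
  aggregate: { needs: ["detect-secrets"] },
};
console.log(findDanglingNeeds(broken)); // [ 'aggregate needs missing job "detect-secrets"' ]
```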

## Test plan

- [x] `mise run check` exit 0 (pushed via lefthook pre-push)
- [ ] CI on this PR confirms the aggregator parses and runs to
completion

🤖 Generated with [Claude Code](https://claude.com/claude-code)
theagenticguy added a commit that referenced this pull request May 16, 2026
#120)

## Summary

The new `betterleaks full sweep` step in `pre-release-gate.yml` (added
in #118) uses the `betterleaks` binary, but it isn't on the GitHub
runner's PATH unless `mise` installs it. Job exited 127 (`command not
found`) on PR #116.

## Fix

Add `"aqua:betterleaks/betterleaks" = "1.2.0"` to `[tools]` in
`mise.toml`. `mise-action` in the workflow already provisions every
entry of `[tools]`, so the binary lands on PATH automatically. Also
benefits `codehub analyze` self-scan runs — they previously skipped
betterleaks with a "binary not found" warning; now they'll actually run
it.

## Test plan

- [x] `mise install` resolves the `aqua:` source locally (already had it
via global config)
- [ ] CI on this PR confirms mise-action installs betterleaks and the
pre-release gate sweep step runs

🤖 Generated with [Claude Code](https://claude.com/claude-code)
theagenticguy added a commit that referenced this pull request May 16, 2026
🤖 Automated release via release-please
---


<details><summary>analysis: 0.3.0</summary>

## [0.3.0](analysis-v0.2.0...analysis-v0.3.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
    * @opencodehub/wiki bumped to 0.2.0
</details>

<details><summary>cli: 0.5.0</summary>

## [0.5.0](cli-v0.4.0...cli-v0.5.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))
* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))
([d370f9e](d370f9e))
* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/ingestion bumped to 0.4.1
    * @opencodehub/mcp bumped to 0.4.0
    * @opencodehub/pack bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.0
    * @opencodehub/search bumped to 0.2.0
    * @opencodehub/storage bumped to 0.2.0
    * @opencodehub/wiki bumped to 0.2.0
</details>

<details><summary>cobol-proleap: 0.1.5</summary>

## [0.1.5](cobol-proleap-v0.1.4...cobol-proleap-v0.1.5) (2026-05-16)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/ingestion bumped to 0.4.1
</details>

<details><summary>ingestion: 0.4.1</summary>

## [0.4.1](ingestion-v0.4.0...ingestion-v0.4.1) (2026-05-16)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/scip-ingest bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>mcp: 0.4.0</summary>

## [0.4.0](mcp-v0.3.2...mcp-v0.4.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/pack bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.0
    * @opencodehub/search bumped to 0.2.0
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>pack: 0.2.0</summary>

## [0.2.0](pack-v0.1.4...pack-v0.2.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/ingestion bumped to 0.4.1
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>scanners: 0.2.0</summary>

## [0.2.0](scanners-v0.1.2...scanners-v0.2.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))

### Features

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))
([d370f9e](d370f9e))
</details>

<details><summary>scip-ingest: 0.2.2</summary>

## [0.2.2](scip-ingest-v0.2.1...scip-ingest-v0.2.2) (2026-05-16)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
</details>

<details><summary>search: 0.2.0</summary>

## [0.2.0](search-v0.1.2...search-v0.2.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>storage: 0.2.0</summary>

## [0.2.0](storage-v0.1.2...storage-v0.2.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))
</details>

<details><summary>wiki: 0.2.0</summary>

## [0.2.0](wiki-v0.1.1...wiki-v0.2.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>root: 0.6.0</summary>

## [0.6.0](root-v0.5.0...root-v0.6.0) (2026-05-16)


### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))
* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))

### Features

* drop detect-secrets; ship tuned betterleaks default config
([#118](#118))
([d370f9e](d370f9e))
* lbug-only graph backend; rip DuckDB graph adapter
([#117](#117))
([49e14fd](49e14fd))


### Bug Fixes

* **ci:** grant id-token: write at release-please.yml top level
([#115](#115))
([a87a6eb](a87a6eb))
* **ci:** install betterleaks via mise so the pre-release gate finds it
([#120](#120))
([522a4ec](522a4ec))
* **ci:** pre-release gate aggregator needs betterleaks (was
detect-secrets)
([#119](#119))
([a6f3448](a6f3448))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Laith Al-Saadoon <alsaadoonlaith@gmail.com>