Skip to content

feat: add Glassworm supply-chain byte-level scanner#74

Draft
ignaciosantise wants to merge 3 commits intomasterfrom
glassworm-scanner
Draft

feat: add Glassworm supply-chain byte-level scanner#74
ignaciosantise wants to merge 3 commits intomasterfrom
glassworm-scanner

Conversation

@ignaciosantise
Copy link
Copy Markdown

@ignaciosantise ignaciosantise commented Mar 25, 2026

Summary

  • Adds glassworm-check/ composite action — a deterministic byte-level scanner for the Glassworm supply-chain attack campaign
  • Detects invisible PUA Unicode, zero-width characters, mid-file BOMs, eval decoder patterns, suspicious install hooks, and byte-count anomalies
  • Multi-ecosystem install hook detection: npm (preinstall/postinstall), Rust (build.rs), CocoaPods (script_phase/prepare_command), Gradle (buildscript injection), Python (setup.py inline code), Go (go:generate directives)
  • Runs on changed files only via git diff, two-tier severity (critical vs warning), posts/updates PR comment on critical findings

Changes from initial draft

Addresses all 5 review findings:

  • Removed external aikido.dev URL from report output
  • Added base-ref input instead of hardcoded main fallback
  • Fixed dead if logic in zero-width character check
  • Added PR comment deduplication (edit existing comment instead of spamming)
  • Fixed filename word-splitting with mapfile -t

Extended install-hook detection beyond npm to 6 ecosystems (Rust, CocoaPods, Gradle, Python, Go).

Test plan

  • Test with a file containing invisible PUA Unicode characters → should fail with CRITICAL
  • Test with a clean PR → should pass silently
  • Test with package.json adding a postinstall hook → should report WARNING
  • Test with build.rs addition → should report WARNING
  • Test with *.podspec containing script_phase → should report WARNING
  • Test with fail-on-warning: true → warnings should cause failure
  • Verify PR comment is updated (not duplicated) on re-run

🤖 Generated with Claude Code

Deterministic scanner that runs before code review to catch invisible
Unicode obfuscation, malicious install hooks, eval-based payload
decoders, and byte-count anomalies associated with the Glassworm
supply-chain attack campaign.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Mar 25, 2026

Claude finished @ignaciosantise's task in 3s —— View job


Review

  • Gather context
  • Read changed file
  • Spawn breaking changes subagent → no issues (purely additive)
  • Spawn data classification subagent → no issues
  • Compile findings
Found 5 issue(s)

Issue 1: External domain URL in report

ID: action-external-domain-a3c1
File: glassworm-check/action.yml:158
Severity: MEDIUM
Category: security

Context:

  • Pattern: Hard-coded https://www.aikido.dev/blog/... URL embedded in the report output written to $GITHUB_STEP_SUMMARY and PR comments
  • Risk: Policy violation — external domains in action code are restricted to reown.com, walletconnect.com, walletconnect.org
  • Impact: If the aikido.dev domain is ever hijacked or the URL redirects, runners and users follow it; also a policy audit finding
  • Trigger: Any run that finds Glassworm indicators

Recommendation: Replace with an internal wiki/docs link, or remove the attribution URL entirely and keep just the inline label text:

echo "This PR contains patterns associated with the Glassworm supply-chain attack campaign."

Issue 2: Hardcoded main fallback for base branch

ID: action-base-ref-fallback-9b2d
File: glassworm-check/action.yml:44
Severity: MEDIUM
Category: code_quality

Context:

  • Pattern: BASE_REF="${GITHUB_BASE_REF:-main}" — if GITHUB_BASE_REF is unset (push events, manual workflow_dispatch, direct checkout), the scanner diffs against origin/main
  • Risk: Repos whose default branch is master or any other name will either diff against the wrong branch or silently fall back to HEAD~1...HEAD (line 45 fallback). The fallback only covers HEAD~1, so a multi-commit push would under-report changed files.
  • Impact: Files changed in a PR may not be scanned; the action appears to pass when it should report findings
  • Trigger: Any workflow not triggered by pull_request / pull_request_target events

Recommendation: Add an explicit base-ref input, and use it in the diff:

inputs:
  base-ref:
    description: "Base ref to diff against"
    required: false
    default: ${{ github.base_ref || github.event.repository.default_branch }}

Issue 3: Dead outer if in zero-width character check

ID: action-zwc-dead-if-c4f2
File: glassworm-check/action.yml:85-89
Severity: MEDIUM
Category: code_quality

Context:

  • Pattern: if grep -Pq '...' "$f" 2>/dev/null || true; then — the || true makes this condition always evaluate to true, so the outer if block always runs regardless of whether zero-width characters are present
  • Risk: Every scanned file runs the inner count grep even if no zero-width chars exist; more importantly the logical structure is misleading and any future editor may remove the inner guard assuming the outer one handles the short-circuit
  • Impact: Minor performance overhead; confusing intent

Recommendation: Remove the outer if entirely since the inner check is the real guard:

count=$(grep -Pc '[\x{200B}\x{200C}\x{200D}\x{2060}]' "$f" 2>/dev/null || echo 0)
if [ "$count" -gt 0 ]; then
  REPORT+="⚠️ **WARNING** — Zero-width Unicode characters in \`$f\`\n"
  FOUND_WARNING=1
fi

Issue 4: PR comment posted repeatedly with no deduplication

ID: action-comment-dedup-7e1a
File: glassworm-check/action.yml:184-195
Severity: LOW
Category: code_quality

Context:

  • Pattern: gh pr comment ... --body-file "$REPORT_FILE" creates a new comment every run; on every commit push to a PR that has critical findings, a new "Glassworm Security Alert" comment is appended
  • Risk: Comment spam on long-running PRs; reviewers may miss the signal in the noise
  • Impact: Noisy PR timeline; no functional regression

Recommendation: Use gh pr comment --edit-last or search for and update an existing bot comment before creating a new one. Alternatively, post to the job summary only and skip PR commenting. Fix this →


Issue 5: Unquoted $FILES causes word-splitting on spaces in filenames

ID: action-files-wordsplit-2d3e
File: glassworm-check/action.yml:49
Severity: LOW
Category: code_quality

Context:

  • Pattern: for f in $FILES$FILES is unquoted, so IFS word-splitting applies; any file with spaces in its path will be split into multiple tokens and silently skipped (the [ -f "$f" ] || continue guard will drop the fragments)
  • Risk: Files with spaces in names are not scanned
  • Impact: Scanner silently misses files; low likelihood in practice for JS/TS/JSON files

Recommendation:

mapfile -t CHANGED_FILES < <(git diff --name-only "origin/$BASE_REF"...HEAD 2>/dev/null || git diff --name-only HEAD~1..HEAD)
for f in "${CHANGED_FILES[@]}"; do

- Remove external aikido.dev URL from report output
- Add base-ref input instead of hardcoded main fallback
- Fix dead if logic in zero-width character check
- Add PR comment deduplication (edit existing instead of spamming)
- Fix filename word-splitting with mapfile -t
- Extend install-hook detection to Rust (build.rs), CocoaPods
  (script_phase/prepare_command), Gradle (buildscript/plugins),
  Python (setup.py inline code), and Go (go:generate)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ignaciosantise
Copy link
Copy Markdown
Author

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Mar 26, 2026

Claude finished @ignaciosantise's task in 2m 7s —— View job


Review

  • Gather context
  • Read current state of glassworm-check/action.yml
  • Verify all 5 previously reported issues are resolved
  • Check for any new issues
  • Post incremental review findings

🔄 Incremental Review: Verifying fixes from previous review + checking for new issues

Previously Reported Issues

ID Status
action-external-domain-a3c1 ✅ Resolved — aikido.dev URL removed; inline description only at line 215
action-zwc-dead-if-c4f2 ✅ Resolved — dead outer if removed; count-based guard at lines 96–100
action-comment-dedup-7e1a ✅ Resolved — deduplication via gh api PATCH at lines 252–263
action-files-wordsplit-2d3e ✅ Resolved — mapfile -t FILES used at lines 56 and 132
action-base-ref-fallback-9b2d ⚠️ Partially resolved — see below
Found 2 issue(s)

Issue 1: Hardcoded main fallback still present in bash

ID: action-base-ref-fallback-9b2d
File: glassworm-check/action.yml:54
Severity: MEDIUM
Category: code_quality

Context:

  • Pattern: base-ref input was added (good), but bash resolution still falls back to "main" when neither the input nor GITHUB_BASE_REF is set
  • Risk: Repos whose default branch is master or any non-main name silently diff against origin/main, which may not exist — git diff fails, 2>/dev/null swallows the error, fallback to HEAD~1...HEAD only covers one commit
  • Impact: Multi-commit pushes on non-main-default repos silently under-scan
  • Trigger: Any workflow_dispatch or push event on a repo with a default branch other than main

Recommendation: Move the default to the YAML input so github.event.repository.default_branch is used automatically:

inputs:
  base-ref:
    description: "Base ref to diff against (auto-detected for PRs)"
    required: false
    default: "${{ github.base_ref || github.event.repository.default_branch }}"

Then in bash: BASE_REF="${INPUT_BASE_REF:-main}" becomes unnecessary — the input always has a value.


Issue 2: grep -Pc silently skips zero-width check on macOS runners

ID: action-zwc-pcre-compat-a7f1
File: glassworm-check/action.yml:96
Severity: LOW
Category: code_quality

Context:

  • Pattern: grep -Pc '[\x{200B}...]' requires PCRE (-P). macOS ships BSD grep without PCRE support; the flag is unrecognized and grep exits non-zero
  • Risk: The 2>/dev/null || echo 0 catch means count is always 0 on macOS — zero-width character detection is silently disabled on macos-* runners
  • Impact: Zero-width Unicode chars go undetected on macOS-hosted jobs
  • Trigger: Any workflow using a macos-* runner

Recommendation: Use LC_ALL=C grep -c with hex escapes, or detect and branch:

count=$(LC_ALL=C grep -cP '[\x{200B}\x{200C}\x{200D}\x{2060}]' "$f" 2>/dev/null \
  || python3 -c "import re,sys; print(len(re.findall(r'[\u200b\u200c\u200d\u2060]', open('$f').read())))" 2>/dev/null \
  || echo 0)

- Move base-ref default to YAML input using github.base_ref ||
  github.event.repository.default_branch, eliminating hardcoded
  "main" fallback in bash
- Add python3 fallback for zero-width char detection when grep -Pc
  (PCRE) is unavailable on macOS runners

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant