Add CodeCharta + DependaCharta reusable workflow #105
Open
peuf0u wants to merge 20 commits into
Introduces a CodeCharta + DependaCharta integration as a reusable
workflow + two composite actions, so consumer repos drop in a ~20-line
workflow file instead of vendoring the full bash pipeline.
.github/actions/universal-codecharta-snapshot/
Runs ccsh + tokei + gitlogparser + DependaCharta and merges the
outputs into ${project_name}.cc.json.gz (+ .cg.json). Supports
single snapshot (HEAD) or history (N recent merges). Stack profile
enum (swift/kotlin/kmp/ft-fullstack/multi) maps to default ccsh
-fe= and tokei --type filters; both overridable.
.github/actions/universal-codecharta-publish/
Publishes artifacts to the data repo (default
futuredapp/codebase-architectures). Modes: preview, history,
bulk-history, delete-preview, set-latest. Idempotent on
bulk-history; retry-on-conflict push.
.github/workflows/universal-codecharta.yml
Reusable workflow orchestrating three jobs (generate / cleanup /
backfill) with hardcoded org constants. PR previews,
develop/main history snapshots with optional PR-title meta
sidecars, and workflow_dispatch backfill.
Internal action refs pinned to @ci/codecharta-actions for smoke
testing; bump via tools/bump-action-refs.sh before tagging a release.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four changes bundled, all on the reusable workflow:
1. Heavy commenting: top-of-file walk-through of the four scenarios
(PR preview, PR cleanup, develop history, manual backfill) and per-
job / per-step "what happens here, why" comments throughout.
2. actionlint SC2129: collapse multi-line `>> $GITHUB_OUTPUT` redirects
into single `{ ... } >> "$GITHUB_OUTPUT"` blocks in the
"Determine target identifiers" step.
3. actionlint action-input fix: peter-evans/create-or-update-comment@v4
doesn't accept `comment-author` / `body-includes`. The previous step
silently ignored them and stacked a new comment on every push. Split
into find-comment + create-or-update-comment with `comment-id`, the
canonical idiom — updates the existing bot comment instead of
duplicating.
4. Flip internal @ci/codecharta-actions refs → @main so the
bump-action-refs / validate-action-refs bats tests pass. Smoke
testing is complete; the dev-branch refs are no longer needed and
they break the project's bump/validate symmetry tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
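The grouped-redirect form from item 2 can be sketched as below. This is a minimal illustration, not the step's actual code: the function name `write_target_outputs` and the output keys `project` and `stem` are hypothetical; only `GITHUB_OUTPUT` (the file path GitHub Actions hands to each step) comes from the real environment.

```shell
#!/usr/bin/env bash
# Sketch of the SC2129-friendly pattern: one grouped redirect instead of
# several separate `>> "$GITHUB_OUTPUT"` appends.
# write_target_outputs and both key names are hypothetical.
write_target_outputs() {
  local project=$1 stem=$2
  {
    echo "project=${project}"
    echo "stem=${stem}"
  } >> "$GITHUB_OUTPUT"
}
```

The file is opened once for the whole group, which is what shellcheck's SC2129 asks for.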
validate-action-refs.sh greps for `futuredapp/.github/` line-by-line and flags any line where the ref isn't @<version>. The walk-through comment contained a literal `uses: futuredapp/.github/.../universal-codecharta.yml@<ref>` example which the validator counted as a mismatched ref. Rephrased the example so it no longer matches the validator's pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The .github repo is public; baking specific (often private) repo names
into the reusable workflow leaks them to anyone reading the source.
Promote `data_repo` and `picker_base_url` to required workflow inputs
so the public file contains no specific repo names, and scrub all
private-repo references from comments and action descriptions.
Consumer YAMLs gain two extra lines:
with:
data_repo: <org>/<data-repo>
picker_base_url: https://...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit (1708e02) inadvertently bumped 16 unrelated android/ios/kmp workflows from @2.3.3 to @main as a side-effect of running bump-action-refs.sh locally during validation. Restoring them to the released @2.3.3 tag they were pinned to before this branch.
Net effect: only the four codecharta-related files (workflow + two publish-action files + scrubbed comments) carry intentional changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Temporarily reverting the @main flip so smsticket-ios PR #86 can actually invoke the workflow before this PR merges. Without this, the consumer's `uses: …@ci/codecharta-actions` resolves the workflow file, but the workflow's own `uses: …@main` for the composite actions 404s because those files aren't on main yet.
The test_action-refs.bats tests will go red while this commit is on the branch (their bump script can't rewrite non-main / non-semver refs). They'll go green again once flipped back to @main pre-merge.
NOT FOR MERGE in its current state: needs a follow-up commit to revert to @main before this PR can land.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR introduces a reusable GitHub Actions workflow plus two composite actions to standardize generating CodeCharta + DependaCharta snapshots in consumer repos and publishing those artifacts into a central “data repo” consumed by the picker UI.
Changes:
- Adds a reusable workflow (`universal-codecharta.yml`) with jobs for PR previews, history snapshots on pushes, PR cleanup, and manual backfill.
- Adds a snapshot composite action that runs CodeCharta analysis (and DependaCharta for HEAD snapshots) and emits artifact paths for downstream steps.
- Adds a publish composite action that clones the data repo, copies artifacts into `projects/<name>/...`, rebuilds the manifest, commits, and pushes with retry logic.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| `.github/workflows/universal-codecharta.yml` | Reusable workflow orchestration for preview/history/backfill + PR comments. |
| `.github/actions/universal-codecharta-snapshot/action.yml` | Composite action definition for running snapshot/history analysis. |
| `.github/actions/universal-codecharta-snapshot/codecharta.sh` | Shell implementation of the analysis pipeline (HEAD snapshot + merge-history mode). |
| `.github/actions/universal-codecharta-publish/action.yml` | Composite action definition for publishing/deleting snapshots in the data repo. |
| `.github/actions/universal-codecharta-publish/codecharta-publish.sh` | Shell implementation for data-repo writes, manifest rebuild, and push retries. |
Pre-merge smoke test passed (smsticket-ios run 25672470737 produced the expected data-repo commit and PR comment). Flipping the internal composite-action refs back to @main so the test_action-refs.bats tests pass and the PR is mergeable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
smsticket-android consumer about to open a PR; needs the actions resolvable against the branch since they don't exist on main yet. Action-refs bats tests will go red until flipped back to @main (post smoke-test, pre-merge).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eight fixes from the Copilot review:
1. Fix double URL-encoding in PR comment links. Inner API URLs no longer encode the path, so the outer @uri wrapping is the only encoding step. Picker links should now decode correctly.
2. Anchor the push-event PR-number regex to "^Merge pull request #NNN". Previously any commit subject containing "#NNN" (e.g. "Fix #123") would be misclassified as a PR merge.
3. Replace the retry-on-conflict's git pull --rebase with a hard reset + reapply pattern. Under `set -e`, a rebase conflict on manifest.json aborted the script; the new flow tolerates any concurrent change because apply_change is idempotent.
4. Refresh latest.cg.json / latest.meta.json correctly. set-latest now `rm -f`s the existing latest sidecar when the newest entry lacks one, instead of leaving a stale pointer to an older entry.
5. Switch git log field separator from `|` to ASCII unit separator (\x1f). Pipe is legal in commit subjects and would split records incorrectly when present.
6. Skip the DependaCharta image pull in history/backfill mode. That mode doesn't run DependaCharta; the pull was wasted bandwidth.
7. Remove dead helper functions (ccsh_fe_arg etc.) from codecharta.sh. The real arg-building logic lives inside the Docker bash -lc heredocs; the host-level helpers were never called.
8. Update action.yml input descriptions referencing the renamed `backfill-history` mode (now `bulk-history`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
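The anchoring from fix 2 can be sketched as below. This is an illustration under assumptions, not the workflow's actual step: the function name `pr_number_from_subject` is hypothetical, and the real code may extract the number differently.

```shell
#!/usr/bin/env bash
# Sketch: extract a PR number only from subjects that START with
# "Merge pull request #NNN". pr_number_from_subject is a hypothetical name.
pr_number_from_subject() {
  local subject=$1
  # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
  local re='^Merge pull request #([0-9]+)'
  if [[ $subject =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}
```

A subject such as "Fix #123 crash" no longer matches, because the `^` anchor requires the merge prefix at the start of the line.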
Same collateral as commit 3254b03: the bump-action-refs.sh test cycle I ran locally to validate the previous fixes left the 16 android/ios/kmp workflow refs flipped to @main. Restoring to @2.3.3. I need to stop running bump-action-refs locally inside this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The picker UI surfaces PR titles by reading `<basename>.meta.json` next to each snapshot (codebase-architectures PR #22 contract). Previously we only generated this sidecar for history snapshots (push to develop/main with a Merge-pull-request commit), so:
- History entries: 60/61 sidecars present.
- Preview entries: 0/N present.
This commit extends the meta-fetch step to run whenever the snapshot is keyed to a PR number. That covers every PR preview AND every history merge with a parseable PR number. The publish action's `preview` mode now copies the sidecar alongside `.cc.json.gz` / `.cg.json`, and `delete-preview` cleans it up on PR close.
Net effect: every preview gets a `<basename>.meta.json` with {title, author, mergedAt, url}; the picker UI now shows the PR title for in-flight PRs instead of falling back to the legacy `PR #N` label.
Rebase-merge history entries (stem `merge-<sha>`, no PR number) still skip the sidecar because there's nothing to query.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. PROJECT_NAME path traversal: validate against [A-Za-z0-9._-]+ before using it in path construction. A malicious value like `../foo` could have caused writes outside the projects/ tree (the script uses it unquoted in `cp` and `rm -f`).
2. DATA_REPO unconstrained input: validate as <owner>/<name> with the same safe charset. Shell quoting prevented command injection, but bounding the value avoids targeting unintended repos by accident.
3. add_dependency_metric_descriptions used `exit` on missing node, but the caller wraps it in `|| true` to make annotation best-effort. Use `return 0` instead so the action keeps going.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
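Item 3 is easy to misread, so here is a small sketch of why `|| true` does not rescue an `exit`. The two `annotate_*` functions and `run_best_effort` are hypothetical stand-ins for `add_dependency_metric_descriptions` and its caller, and the `exit 0` status is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: `exit` inside a function ends the whole shell, so the caller's
# `|| true` never gets a chance to run. All names here are hypothetical.
annotate_with_exit() {
  echo "node missing" >&2
  exit 0   # terminates the script, not just this function
}

annotate_with_return() {
  echo "node missing, skipping annotation" >&2
  return 0   # hands control back to the caller
}

run_best_effort() {
  "$1" || true
  echo "continued"
}
```

`run_best_effort annotate_with_return` prints `continued`; `run_best_effort annotate_with_exit` stops the shell before that line is reached.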
Five fixes from the latest pass:
1. Tokei multi-language filter was broken: passing the comma-list as a single `--type Kotlin,Java` searches for a literal language named "Kotlin,Java" (nothing matches). Split on comma and emit one `--type LANG` flag per entry. Explains why our stack-profile tokei filtering hasn't been narrowing the LOC views.
2. & 3. Replaced `git diff --quiet` early-exit checks with `git add -A && git diff --cached --quiet`. Untracked files weren't detected by the previous check, so the first publish for a new project (or after the retry-loop reset+reapply) could skip the commit and silently exit. Now any new file participates in the change-detection.
4. & 5. Both `preview` and `history` modes now actively delete the target meta sidecar when SOURCE_META_PATH is missing, instead of leaving the prior run's stale file in place. Same for the `latest.meta.json` pointer in `history` mode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
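The difference behind fixes 2 and 3 can be sketched as two predicates. Both function names are hypothetical, and the sketch assumes it runs inside a git work tree; the publish script's real control flow is not shown in this PR text.

```shell
#!/usr/bin/env bash
# Sketch: change detection before committing to the data repo.
# has_changes_unstaged mirrors the old check, has_changes_staged the new one.
# Both names are hypothetical.
has_changes_unstaged() {
  # Compares the work tree to the index; untracked files are invisible here.
  ! git diff --quiet
}

has_changes_staged() {
  # Stage everything first (including brand-new files), then compare the
  # index to HEAD.
  git add -A
  ! git diff --cached --quiet
}
```

In a repository whose only change is a newly created file, the first predicate reports no change while the second reports one, which matches the skipped-commit symptom described above.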
Backfill produces history entries without .cg.json (DependaCharta isn't run per historical commit). The previous set-latest behavior deleted latest.cg.json whenever the newest entry lacked one, which meant a backfill run wiped out the project's working DependaCharta graph entirely. Same problem for latest.meta.json against rebase-merge entries with no parseable PR number.
Switch to "update when available, leave alone otherwise". Stale older data beats no data: the picker's DependaCharta and title displays keep working until a fresh develop push refreshes the pointers.
Reverses part of the pass-2 fix (which proactively deleted to avoid mismatch). The mismatch is acceptable here because the cc/cg/meta viewers are separate picker surfaces, not a unified delta query.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
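The "update when available, leave alone otherwise" rule can be sketched as a single helper. `refresh_latest_sidecar` is a hypothetical name and the argument order is an assumption; the real set-latest code is not reproduced here.

```shell
#!/usr/bin/env bash
# Sketch: refresh a latest.* pointer only when the newest entry has the
# matching sidecar. refresh_latest_sidecar is a hypothetical helper.
refresh_latest_sidecar() {
  local source=$1 latest=$2
  if [ -f "$source" ]; then
    cp "$source" "$latest"
  fi
  # Deliberately no else branch: a missing source leaves the existing
  # pointer (possibly from an older entry) in place.
  return 0
}
```

Compared with the earlier `rm -f` behavior, a backfill entry without `.cg.json` now leaves `latest.cg.json` untouched.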
Four fixes from the pass-4/5 cumulative feedback:
1. PROJECT_NAME could escape with '.' or '..'. The earlier regex [A-Za-z0-9._-]+ accepts both: `PROJECT_NAME=..` would make PROJECT_DIR `projects/..` (the data repo root). Add an explicit reject for the pure-dot cases on top of the charset check.
2. analysis_image default was `:latest` (floating). Pin to `codecharta/codecharta-analysis:1.143.0` (current latest stable on Docker Hub, 2026-04-28) for reproducibility. Bumped both action.yml input default and the codecharta.sh env fallback.
3. pr_limit was `type: string`. Switch to `type: number` so the workflow_call boundary catches non-numeric values before they reach the snapshot script's runtime validation.
4. Workflow header described all snapshots as having complexity/LOC/churn/dependency edges, but backfill-history snapshots intentionally omit tokei/gitlogparser/DependaCharta (rerunning them per historical commit would blow CI budget). Clarified the "lightweight" backfill caveat inline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
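The combined check from fix 1 (charset plus pure-dot reject) might look like the sketch below. `validate_project_name` is a hypothetical name, and treating only `.` and `..` as the "pure-dot cases" is an assumption based on the commit text.

```shell
#!/usr/bin/env bash
# Sketch: validate PROJECT_NAME before it is used to build paths.
# validate_project_name is a hypothetical function name.
validate_project_name() {
  local name=$1
  local re='^[A-Za-z0-9._-]+$'
  # Charset check: rejects slashes, spaces and the empty string.
  [[ $name =~ $re ]] || return 1
  # The charset alone still admits "." and "..", which resolve to the
  # projects/ directory itself or its parent.
  case $name in
    .|..) return 1 ;;
  esac
  return 0
}
```

With both checks, `../foo` fails on the charset and `..` fails on the explicit reject.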
The pass-5 Copilot suggestion to switch pr_limit from `type: string`
to `type: number` looked correct in isolation but broke the established
consumer pattern. Consumer YAMLs pass
pr_limit: ${{ inputs.pr_limit || '20' }}
which evaluates to a STRING expression. workflow_call's `type: number`
validation rejects it at call time, so the reusable workflow never
starts (failure with zero jobs run). Caught by smsticket-ios's PR #86
run 25721524846 going red with no logs.
Revert to `type: string`. The snapshot script's history mode already
validates the value is a positive integer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
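With the input back to `type: string`, the runtime check is the only guard. A minimal sketch of a positive-integer test follows; `is_positive_integer` is a hypothetical name and the snapshot script's actual validation may differ.

```shell
#!/usr/bin/env bash
# Sketch: accept only positive integers such as "20".
# is_positive_integer is a hypothetical function name.
is_positive_integer() {
  local re='^[1-9][0-9]*$'
  [[ $1 =~ $re ]]
}
```

A caller could use it as `is_positive_integer "$pr_limit" || { echo "pr_limit must be a positive integer" >&2; exit 1; }`, where `pr_limit` stands for whatever variable carries the input.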
The pass-3 Copilot suggestion to emit one `--type LANG` per value broke
tokei at runtime:
error: The argument '--type <types>' was provided more than once,
but cannot be used multiple times
Tokei's CLI takes one `--type` flag whose value is a comma-separated
list of languages (`--type Swift,Objective-C`), not repeated flags.
The original code (which Copilot claimed didn't filter correctly) was
actually right.
Two reviewer-loop regressions in a row from over-trusting the bot's
suggestions verbatim — both this commit and 2fa7d15 (pr_limit type)
reverse review-driven changes that broke real consumers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
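The restored behavior amounts to passing the comma-separated list through untouched. A sketch, with `tokei_type_args` as a hypothetical helper name (the real arg-building lives inside the Docker heredocs and is not shown here):

```shell
#!/usr/bin/env bash
# Sketch: emit a single `--type` flag whose value is the whole
# comma-separated list. tokei_type_args is a hypothetical helper.
tokei_type_args() {
  local types=$1
  if [ -n "$types" ]; then
    # One argument per line, so callers can read them into an array.
    printf '%s\n' --type "$types"
  fi
}
```

A caller might collect the output with `mapfile -t args < <(tokei_type_args "Swift,Objective-C")` and then run `tokei "${args[@]}" .`; an empty list yields no flag at all, which leaves tokei unfiltered.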
`sep=$'\x1f'` lived inside an outer `bash -lc '…'` single-quoted block.
ANSI-C quoting needs `$'...'` literally; the `'` after `$` was closing
the outer quote, so the inner bash actually saw `$\x1f` (interpreted
as a reference to an unbound `x1f` variable). Under `set -u`, every
backfill run died with:
bash: line 23: x1f: unbound variable
Caught by slovak-telekom-young-kmp PR's backfill workflow (run
25724033946).
Build the separator with `sep=$(printf "\x1f")` instead — printf
interprets `\xNN` regardless of quote style, and the inner DOUBLE
quotes don't conflict with the outer single-quoted heredoc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
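A reduced sketch of the fixed quoting is below. The function name `split_record_demo` and the sample record are hypothetical, and the sketch uses `bash -c` rather than `bash -lc` to avoid depending on login profiles.

```shell
#!/usr/bin/env bash
# Sketch: build the unit separator inside a single-quoted `bash -c` block
# without using any single quote in the inner script.
# split_record_demo and the sample record are hypothetical.
split_record_demo() {
  bash -c '
    set -euo pipefail
    # printf with double quotes produces the 0x1f byte and does not
    # collide with the outer single-quoted block.
    sep=$(printf "\x1f")
    record="abc123${sep}Fix a | b"
    # A pipe inside the subject no longer splits the record.
    IFS="$sep" read -r sha subject <<<"$record"
    printf "%s\n" "$sha" "$subject"
  '
}
```

The same rule applies to comments inside the block: an apostrophe there would also end the outer quote.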
Consumers outside the futuredapp GHCR access boundary (personal accounts, other orgs) can't pull the private fork image. Adding a backwards-compatible input lets them point at upstream maibornwolff/dependacharta-analysis on Docker Hub for non-Swift projects.
Inlined the image-ref construction at both call sites and removed the workflow-level env that just aliased a single value; the input expression itself is now the source of truth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The retry loop does `git reset --hard origin/<branch>` and re-runs
apply_change. If the winning concurrent push only created one of
{previews, history} (e.g. a preview run wins, then a history run loses
and resets), the loser's local directory for the OTHER kind is gone and
the subsequent `cp` fails with "No such file or directory" — killing
the loop on attempt 2 due to set -euo pipefail.
The script's own comment claims apply_change is idempotent; hoisting
mkdir out of the function quietly broke that invariant. Moving it back
inside restores idempotency at no cost (mkdir -p is a no-op when the
directory exists).
Triggered in the wild during a PR-sync + push-to-develop race on a
consumer repo. After the fix, the second-attempt push succeeds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
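The invariant being restored can be sketched as follows. `apply_change` is the name used in the commit message, but its body, argument order, and the directory layout here are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: keep directory creation inside apply_change so the function can
# be re-run after `git reset --hard` removes the target directory.
# Arguments and layout are assumptions, not the script's real interface.
apply_change() {
  local project_dir=$1 kind=$2 artifact=$3
  # mkdir -p is a no-op when the directory already exists.
  mkdir -p "${project_dir}/${kind}"
  cp "$artifact" "${project_dir}/${kind}/"
}
```

Calling it, deleting the directory (as the reset does when the winning push lacked that kind), and calling it again succeeds both times.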
matejsemancik
approved these changes
May 20, 2026
What is CodeCharta + DependaCharta?
CodeCharta renders a codebase as a 3D treemap so reviewers can see file/folder size, complexity, and churn at a glance. DependaCharta extracts the project's dependency graph (which file imports which file) and visualizes it alongside.
The org runs a picker UI backed by a single "data repo" that aggregates these snapshots across every project. To put a project in the picker you have to (a) run the analyzers on every PR + every merge to develop, (b) write the outputs into the data repo's `projects/<name>/...` tree, (c) rebuild a manifest the picker reads.
That used to be a 280-line `.github/workflows/codecharta.yml` + two bash scripts (~1500 lines) vendored into each consumer repo. As we add more projects (iOS, Android, KMP, full-stack), N copies of that code drift out of sync fast.
What this PR adds
A reusable workflow + two composite actions in this repo, so a consumer's workflow file shrinks to ~20 lines:
- `.github/workflows/universal-codecharta.yml`: `generate` (PR preview or push history), `cleanup` (delete preview on PR close), `backfill` (manual: walk N recent merges). Heavy top-of-file comments walk through what each section does.
- `.github/actions/universal-codecharta-snapshot/`: `snapshot` (HEAD) and `history` (N merges).
- `.github/actions/universal-codecharta-publish/`: `preview`, `history`, `bulk-history`, `delete-preview`, `set-latest`. Idempotent on bulk-history; retry-on-conflict push for concurrent writers.
Design decisions worth flagging
Stack profiles. Four named profiles (`swift`, `kotlin`, `kmp`, `ft-fullstack`) map to default ccsh `-fe=` and tokei `--type` filters. A fifth `multi` value omits both filters (analyzes everything the tools recognize). `multi` is the default if no stack is specified. Overrides via `file_extensions` / `tokei_types` / `extra_excludes` are available for repos that don't fit a named profile.
Single DependaCharta image for all stacks. Verified that the Futured fork (`ghcr.io/futuredapp/dependacharta-swift`) is upstream + Swift, so it covers Java, Kotlin, TypeScript, Vue, Swift, C++, Python, etc.; no per-stack image map needed.
Data repo and picker URL are required inputs (no defaults). This repo is public; baking specific repo names into the source would leak a (potentially private) repo's identity into the public workflow file. Consumers pass these per-repo; the reusable workflow itself is stack-and-repo-agnostic.
Bulk-history mode. Backfill used to produce N commits to the data repo (one per snapshot, one clone+push per commit). New `bulk-history` publish mode does one clone, N file copies, one commit, one push, which collapses to 2 commits total per backfill run (one for the entries, one for the latest-pointer refresh).
Testing
Smoke-tested end-to-end against a private iOS consumer repo. The three event paths (PR preview, push to develop, manual backfill) all produced the expected data-repo commits and the expected PR comment.
CI on this PR: actionlint (`Lint GitHub Actions workflows`) and bats tests (`Test Actions`) both green.
Out of scope
@main. After this PR merges, the standard `bump-action-refs.sh <version>` flow will tag and update everything.
🤖 Generated with Claude Code