
Add CodeCharta + DependaCharta reusable workflow#105

Open
peuf0u wants to merge 20 commits into main from ci/codecharta-actions

Conversation


@peuf0u peuf0u commented May 11, 2026

Contributor

What is CodeCharta + DependaCharta?

CodeCharta renders a codebase as a 3D treemap so reviewers can see file/folder size, complexity, and churn at a glance. DependaCharta extracts the project's dependency graph (which file imports which file) and visualizes it alongside.

The org runs a picker UI backed by a single "data repo" that aggregates these snapshots across every project. To put a project in the picker you have to (a) run the analyzers on every PR + every merge to develop, (b) write the outputs into the data repo's projects/<name>/... tree, (c) rebuild a manifest the picker reads.

That used to be a 280-line .github/workflows/codecharta.yml + two bash scripts (~1500 lines) vendored into each consumer repo. As we add more projects (iOS, Android, KMP, full-stack), N vendored copies of that code drift out of sync fast.

What this PR adds

A reusable workflow + two composite actions in this repo, so a consumer's workflow file shrinks to ~20 lines:

jobs:
  snapshot:
    uses: futuredapp/.github/.github/workflows/universal-codecharta.yml@<version>
    with:
      stack: swift   # or kotlin / kmp / ft-fullstack / multi
      data_repo: <org>/<data-repo-name>
      picker_base_url: https://<picker-host>
    secrets:
      CODEBASE_ARCHITECTURES_TOKEN: ${{ secrets.CODEBASE_ARCHITECTURES_TOKEN }}

| File | Purpose |
| --- | --- |
| `.github/workflows/universal-codecharta.yml` | Reusable workflow. Three jobs: generate (PR preview or push history), cleanup (delete preview on PR close), backfill (manual: walk N recent merges). Heavy top-of-file comments walk through what each section does. |
| `.github/actions/universal-codecharta-snapshot/` | Composite action wrapping the analysis: ccsh + tokei + gitlogparser + DependaCharta + ccsh merge. Modes: snapshot (HEAD) and history (N merges). |
| `.github/actions/universal-codecharta-publish/` | Composite action wrapping data-repo writes. Modes: preview, history, bulk-history, delete-preview, set-latest. Idempotent on bulk-history; retry-on-conflict push for concurrent writers. |

Design decisions worth flagging

Stack profiles. Four named profiles (swift, kotlin, kmp, ft-fullstack) map to default ccsh -fe= and tokei --type filters. A fifth value, multi, omits both filters (it analyzes everything the tools recognize) and is the default if no stack is specified. Overrides via file_extensions / tokei_types / extra_excludes are available for repos that don't fit a named profile.
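
As a rough illustration of the profile lookup described above, the mapping could look like the sketch below. The function name `resolve_stack_filters`, the two output variables, and every extension and language list are hypothetical placeholders; the real defaults live in the snapshot action and may differ.

```shell
# Map a stack profile to default filters.
# All extension/language lists are placeholders, NOT the action's real defaults.
resolve_stack_filters() {
  case "$1" in
    swift)        FILE_EXTENSIONS="swift";    TOKEI_TYPES="Swift" ;;
    kotlin)       FILE_EXTENSIONS="kt,java";  TOKEI_TYPES="Kotlin,Java" ;;
    kmp)          FILE_EXTENSIONS="kt,swift"; TOKEI_TYPES="Kotlin,Swift" ;;
    ft-fullstack) FILE_EXTENSIONS="ts,vue";   TOKEI_TYPES="TypeScript,Vue" ;;
    multi|"")     FILE_EXTENSIONS="";         TOKEI_TYPES="" ;;  # no filters at all
    *) echo "unknown stack: $1" >&2; return 1 ;;
  esac
}
```

In this sketch an empty `FILE_EXTENSIONS` / `TOKEI_TYPES` means "pass no filter", which mirrors the multi behaviour; an unset stack falls through to the same branch.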

Single DependaCharta image for all stacks. Verified that the Futured fork (ghcr.io/futuredapp/dependacharta-swift) is upstream plus Swift support, so it covers Java, Kotlin, TypeScript, Vue, Swift, C++, Python, etc.; no per-stack image map is needed.

Data repo and picker URL are required inputs (no defaults). This repo is public; baking specific repo names into the source would leak a (potentially private) repo's identity into the public workflow file. Consumers pass these per-repo; the reusable workflow itself is stack-and-repo-agnostic.

Bulk-history mode. Backfill used to produce N commits to the data repo (one per snapshot, one clone+push per commit). New bulk-history publish mode does one clone, N file copies, one commit, one push — collapses to 2 commits total per backfill run (one for the entries, one for the latest-pointer refresh).

Testing

Smoke-tested end-to-end against a private iOS consumer repo. The three event paths (PR preview, push to develop, manual backfill) all produced the expected data-repo commits and the expected PR comment.

CI on this PR: actionlint (Lint GitHub Actions workflows) and bats tests (Test Actions) both green.

Out of scope

  • The architecture overlay (per-file tier/layer/feature classification baked into the cc.json.gz attributeDescriptors) was dropped from the migration. It was project-specific and adding a portable version is a separate design problem; recoverable from git history if/when needed.
  • Consumer repos currently pin to @main. After this PR merges, the standard bump-action-refs.sh <version> flow will tag and update everything.

🤖 Generated with Claude Code

peuf0u and others added 5 commits May 11, 2026 11:36
Introduces a CodeCharta + DependaCharta integration as a reusable
workflow + two composite actions, so consumer repos drop in a ~20-line
workflow file instead of vendoring the full bash pipeline.

  .github/actions/universal-codecharta-snapshot/
      Runs ccsh + tokei + gitlogparser + DependaCharta and merges the
      outputs into ${project_name}.cc.json.gz (+ .cg.json). Supports
      single snapshot (HEAD) or history (N recent merges). Stack profile
      enum (swift/kotlin/kmp/ft-fullstack/multi) maps to default ccsh
      -fe= and tokei --type filters; both overridable.

  .github/actions/universal-codecharta-publish/
      Publishes artifacts to the data repo (default
      futuredapp/codebase-architectures). Modes: preview, history,
      bulk-history, delete-preview, set-latest. Idempotent on
      bulk-history; retry-on-conflict push.

  .github/workflows/universal-codecharta.yml
      Reusable workflow orchestrating three jobs (generate / cleanup /
      backfill) with hardcoded org constants. PR previews,
      develop/main history snapshots with optional PR-title meta
      sidecars, and workflow_dispatch backfill.

Internal action refs pinned to @ci/codecharta-actions for smoke
testing; bump via tools/bump-action-refs.sh before tagging a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four changes bundled, all on the reusable workflow:

1. Heavy commenting: top-of-file walk-through of the four scenarios
   (PR preview, PR cleanup, develop history, manual backfill) and per-
   job / per-step "what happens here, why" comments throughout.

2. actionlint SC2129: collapse multi-line `>> $GITHUB_OUTPUT` redirects
   into single `{ ... } >> "$GITHUB_OUTPUT"` blocks in the
   "Determine target identifiers" step.

3. actionlint action-input fix: peter-evans/create-or-update-comment@v4
   doesn't accept `comment-author` / `body-includes`. The previous step
   silently ignored them and stacked a new comment on every push. Split
   into find-comment + create-or-update-comment with `comment-id`, the
   canonical idiom — updates the existing bot comment instead of
   duplicating.

4. Flip internal @ci/codecharta-actions refs → @main so the
   bump-action-refs / validate-action-refs bats tests pass. Smoke
   testing is complete; the dev-branch refs are no longer needed and
   they break the project's bump/validate symmetry tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
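
The SC2129 change in item 2 boils down to grouping the writes under one redirect. A minimal sketch, assuming two illustrative output names (`project_name`, `stem`) that are not necessarily the ones the workflow emits:

```shell
# Write several step outputs with a single redirect instead of one
# `>> "$GITHUB_OUTPUT"` per line (the pattern ShellCheck SC2129 suggests).
write_outputs() {
  {
    echo "project_name=$1"
    echo "stem=$2"
  } >> "$GITHUB_OUTPUT"
}
```

Calling `write_outputs demo pr-42` appends both `key=value` lines to the file named by `GITHUB_OUTPUT`, opening it once.
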
validate-action-refs.sh greps for `futuredapp/.github/` line-by-line and
flags any line where the ref isn't @<version>. The walk-through comment
contained a literal `uses: futuredapp/.github/.../universal-codecharta.yml@<ref>`
example which the validator counted as a mismatched ref. Rephrased the
example so it no longer matches the validator's pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
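
To make the failure mode concrete, here is a hypothetical re-creation of a line-by-line ref check. It is not the real validate-action-refs.sh; the function name `check_action_refs` and the fixed-string match are assumptions made for illustration.

```shell
# Flag every line that mentions the shared repo but does not carry the
# expected ref. Comments are scanned too, which is how an example `uses:`
# line inside a comment can trip a check like this.
check_action_refs() {
  expected="$1"; file="$2"
  mismatched=$(grep -n 'futuredapp/\.github/' "$file" | grep -v -F "@${expected}" || true)
  if [ -n "$mismatched" ]; then
    printf 'mismatched refs:\n%s\n' "$mismatched" >&2
    return 1
  fi
}
```

Because the scan is purely textual, rephrasing the comment so it no longer contains the repo path is enough to keep a check of this shape quiet.
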
The .github repo is public; baking specific (often private) repo names
into the reusable workflow leaks them to anyone reading the source.
Promote `data_repo` and `picker_base_url` to required workflow inputs
so the public file contains no specific repo names, and scrub all
private-repo references from comments and action descriptions.

Consumer YAMLs gain two extra lines:

  with:
    data_repo: <org>/<data-repo>
    picker_base_url: https://...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit (1708e02) inadvertently bumped 16 unrelated
android/ios/kmp workflows from @2.3.3 to @main as a side-effect of
running bump-action-refs.sh locally during validation. Restoring them
to the released @2.3.3 tag they were pinned to before this branch.

Net effect: only the four codecharta-related files (workflow + two
publish-action files + scrubbed comments) carry intentional changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Temporarily reverting the @main flip so smsticket-ios PR #86 can
actually invoke the workflow before this PR merges. Without this,
the consumer's `uses: …@ci/codecharta-actions` resolves the workflow
file, but the workflow's own `uses: …@main` for the composite actions
404s because those files aren't on main yet.

The test_action-refs.bats tests will go red while this commit is on
the branch (their bump script can't rewrite non-main / non-semver
refs). They'll go green again once flipped back to @main pre-merge.

NOT FOR MERGE in its current state — needs a follow-up commit to
revert to @main before this PR can land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

This PR introduces a reusable GitHub Actions workflow plus two composite actions to standardize generating CodeCharta + DependaCharta snapshots in consumer repos and publishing those artifacts into a central “data repo” consumed by the picker UI.

Changes:

  • Adds a reusable workflow (universal-codecharta.yml) with jobs for PR previews, history snapshots on pushes, PR cleanup, and manual backfill.
  • Adds a snapshot composite action that runs CodeCharta analysis (and DependaCharta for HEAD snapshots) and emits artifact paths for downstream steps.
  • Adds a publish composite action that clones the data repo, copies artifacts into projects/<name>/..., rebuilds the manifest, commits, and pushes with retry logic.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `.github/workflows/universal-codecharta.yml` | Reusable workflow orchestration for preview/history/backfill + PR comments. |
| `.github/actions/universal-codecharta-snapshot/action.yml` | Composite action definition for running snapshot/history analysis. |
| `.github/actions/universal-codecharta-snapshot/codecharta.sh` | Shell implementation of the analysis pipeline (HEAD snapshot + merge-history mode). |
| `.github/actions/universal-codecharta-publish/action.yml` | Composite action definition for publishing/deleting snapshots in the data repo. |
| `.github/actions/universal-codecharta-publish/codecharta-publish.sh` | Shell implementation for data-repo writes, manifest rebuild, and push retries. |


Comment thread .github/workflows/universal-codecharta.yml Outdated
Comment thread .github/workflows/universal-codecharta.yml Outdated
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh Outdated
Comment thread .github/actions/universal-codecharta-publish/action.yml Outdated
Comment thread .github/actions/universal-codecharta-snapshot/action.yml
Comment thread .github/actions/universal-codecharta-snapshot/codecharta.sh Outdated
Comment thread .github/actions/universal-codecharta-snapshot/codecharta.sh Outdated
peuf0u and others added 5 commits May 11, 2026 15:17
Pre-merge smoke test passed (smsticket-ios run 25672470737 produced
the expected data-repo commit and PR comment). Flipping the internal
composite-action refs back to @main so the test_action-refs.bats
tests pass and PR is mergeable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
smsticket-android consumer about to open a PR; needs the actions
resolvable against the branch since they don't exist on main yet.

Action-refs bats tests will go red until flipped back to @main
(post smoke-test, pre-merge).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eight fixes from the Copilot review:

1. Fix double URL-encoding in PR comment links. Inner API URLs no
   longer encode the path, so the outer @uri wrapping is the only
   encoding step. Picker links should now decode correctly.

2. Anchor the push-event PR-number regex to "^Merge pull request
   #NNN". Previously any commit subject containing "#NNN" (e.g.
   "Fix #123") would be misclassified as a PR merge.

3. Replace the retry-on-conflict's git pull --rebase with a hard
   reset + reapply pattern. Under `set -e`, a rebase conflict on
   manifest.json aborted the script; the new flow tolerates any
   concurrent change because apply_change is idempotent.

4. Refresh latest.cg.json / latest.meta.json correctly. set-latest
   now `rm -f`s the existing latest sidecar when the newest entry
   lacks one, instead of leaving a stale pointer to an older entry.

5. Switch git log field separator from `|` to ASCII unit separator
   (\x1f). Pipe is legal in commit subjects and would split records
   incorrectly when present.

6. Skip the DependaCharta image pull in history/backfill mode. That
   mode doesn't run DependaCharta; the pull was wasted bandwidth.

7. Remove dead helper functions (ccsh_fe_arg etc.) from codecharta.sh.
   The real arg-building logic lives inside the Docker bash -lc
   heredocs; the host-level helpers were never called.

8. Update action.yml input descriptions referencing the renamed
   `backfill-history` mode (now `bulk-history`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
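
For fix 2, the anchored match could be sketched as follows. The helper name `extract_pr_number` is hypothetical, and `sed` stands in for whatever matching the workflow actually uses.

```shell
# Print the PR number only when the subject STARTS with GitHub's
# merge-commit prefix; print nothing for subjects such as "Fix #123".
extract_pr_number() {
  printf '%s\n' "$1" | sed -n 's/^Merge pull request #\([0-9][0-9]*\).*/\1/p'
}
```

With the `^` anchor, a subject that merely mentions an issue number yields an empty result, so the caller can treat "empty" as "not a PR merge".
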
Same collateral as commit 3254b03 — the bump-action-refs.sh test
cycle I ran locally to validate the previous fixes left the 16
android/ios/kmp workflow refs flipped to @main. Restoring to @2.3.3.

I need to stop running bump-action-refs locally inside this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The picker UI surfaces PR titles by reading `<basename>.meta.json` next
to each snapshot (codebase-architectures PR #22 contract). Previously
we only generated this sidecar for history snapshots (push to develop/
main with a Merge-pull-request commit), so:

  - History entries: 60/61 sidecars present.
  - Preview entries: 0/N present.

This commit extends the meta-fetch step to run whenever the snapshot is
keyed to a PR number — that's every PR preview AND every history merge
with a parseable PR number. The publish action's `preview` mode now
copies the sidecar alongside `.cc.json.gz` / `.cg.json`, and
`delete-preview` cleans it up on PR close.

Net effect: every preview gets a `<basename>.meta.json` with
{title, author, mergedAt, url}; the picker UI now shows the PR title
for in-flight PRs instead of falling back to the legacy `PR #N` label.

Rebase-merge history entries (stem `merge-<sha>`, no PR number) still
skip the sidecar — there's nothing to query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Comment thread .github/workflows/universal-codecharta.yml Outdated
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/actions/universal-codecharta-snapshot/codecharta.sh Outdated
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh
1. PROJECT_NAME path traversal — validate against [A-Za-z0-9._-]+ before
   using it in path construction. A malicious value like `../foo` could
   have caused writes outside the projects/ tree (the script uses it
   unquoted in `cp` and `rm -f`).

2. DATA_REPO unconstrained input — validate as <owner>/<name> with the
   same safe charset. Shell quoting prevented command injection, but
   bounding the value avoids targeting unintended repos by accident.

3. add_dependency_metric_descriptions used `exit` on missing node — but
   the caller wraps it in `|| true` to make annotation best-effort. Use
   `return 0` instead so the action keeps going.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
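
The owner/name check from item 2 might look like the sketch below. `validate_data_repo` is a hypothetical name, and the exact rules in codecharta-publish.sh may be stricter or looser.

```shell
# Accept exactly "<owner>/<name>" built from the safe charset
# [A-Za-z0-9._-]; reject everything else.
validate_data_repo() {
  case "$1" in
    */*/*|/*|*/) return 1 ;;             # too many slashes, or an empty side
    *[!A-Za-z0-9._/-]*) return 1 ;;      # any character outside the charset
    */*) return 0 ;;
    *) return 1 ;;                       # no slash at all
  esac
}
```

A caller would typically fail the step early, for example `validate_data_repo "$DATA_REPO" || exit 1`, before any clone or push is attempted.
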

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.

Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh Outdated
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh Outdated
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh
Comment thread .github/actions/universal-codecharta-snapshot/codecharta.sh
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Five fixes from the latest pass:

1. Tokei multi-language filter was broken — passing the comma-list as
   a single `--type Kotlin,Java` searches for a literal language named
   "Kotlin,Java" (nothing matches). Split on comma and emit one
   `--type LANG` flag per entry. Explains why our stack-profile tokei
   filtering hasn't been narrowing the LOC views.

2. & 3. Replaced `git diff --quiet` early-exit checks with
   `git add -A && git diff --cached --quiet`. Untracked files weren't
   detected by the previous check, so the first publish for a new
   project (or after the retry-loop reset+reapply) could skip the
   commit and silently exit. Now any new file participates in the
   change-detection.

4. & 5. Both `preview` and `history` modes now actively delete the
   target meta sidecar when SOURCE_META_PATH is missing, instead of
   leaving the prior run's stale file in place. Same for the
   `latest.meta.json` pointer in `history` mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
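
The difference behind fixes 2 and 3 can be reproduced in a scratch repository. A minimal sketch, assuming git is installed and using a hypothetical helper name `has_changes_to_commit`:

```shell
# Succeeds (exit 0) when the repo at $1 has anything to commit, including
# brand-new untracked files. Staging first is what makes them visible:
# a plain `git diff --quiet` only compares tracked files.
has_changes_to_commit() {
  git -C "$1" add -A
  ! git -C "$1" diff --cached --quiet
}
```

In a fresh repository containing only an untracked manifest.json, `git diff --quiet` reports no changes, while this helper reports that there is something to commit.
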

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.

Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh Outdated
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/actions/universal-codecharta-snapshot/action.yml Outdated
Comment thread .github/workflows/universal-codecharta.yml Outdated
Backfill produces history entries without .cg.json (DependaCharta
isn't run per historical commit). The previous set-latest behavior
deleted latest.cg.json whenever the newest entry lacked one, which
meant a backfill run wiped out the project's working DependaCharta
graph entirely. Same problem for latest.meta.json against rebase-merge
entries with no parseable PR number.

Switch to "update when available, leave alone otherwise". Stale older
data beats no data — the picker's DependaCharta and title displays
keep working until a fresh develop push refreshes the pointers.

Reverses part of the pass-2 fix (which proactively deleted to avoid
mismatch). The mismatch is acceptable here because the cc/cg/meta
viewers are separate picker surfaces, not a unified delta query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
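
"Update when available, leave alone otherwise" could be sketched like this. `refresh_latest_sidecar` and its arguments are hypothetical names, not the publish script's actual interface.

```shell
# Copy a sidecar onto its latest.* pointer only when the newest entry has
# one; when it does not, the existing pointer is left untouched.
refresh_latest_sidecar() {
  src="$1"; dst="$2"
  if [ -f "$src" ]; then
    cp "$src" "$dst"
  fi
}
```

The trade-off is the one the commit message names: a pointer may briefly describe an older entry, but it never disappears because of a backfill run.
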

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.

Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/workflows/universal-codecharta.yml
Comment thread .github/actions/universal-codecharta-snapshot/action.yml Outdated
Comment thread .github/actions/universal-codecharta-publish/codecharta-publish.sh Outdated
peuf0u and others added 6 commits May 11, 2026 19:03
Four fixes from the pass-4/5 cumulative feedback:

1. PROJECT_NAME could escape with '.' or '..'. The earlier regex
   [A-Za-z0-9._-]+ accepts both — `PROJECT_NAME=..` would make
   PROJECT_DIR `projects/..` (the data repo root). Add an explicit
   reject for the pure-dot cases on top of the charset check.

2. analysis_image default was `:latest` (floating). Pin to
   `codecharta/codecharta-analysis:1.143.0` (current latest stable
   on Docker Hub, 2026-04-28) for reproducibility. Bumped both
   action.yml input default and the codecharta.sh env fallback.

3. pr_limit was `type: string`. Switch to `type: number` so the
   workflow_call boundary catches non-numeric values before they
   reach the snapshot script's runtime validation.

4. Workflow header described all snapshots as having complexity/
   LOC/churn/dependency edges, but backfill-history snapshots
   intentionally omit tokei/gitlogparser/DependaCharta (rerunning
   them per historical commit would blow CI budget). Clarified the
   "lightweight" backfill caveat inline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
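
The PROJECT_NAME check from fix 1 (charset plus the explicit pure-dot reject) can be sketched as below. `validate_project_name` is a hypothetical name; the real script may structure the check differently.

```shell
# Reject empty names, "." and "..", and anything outside [A-Za-z0-9._-].
# The charset alone is not enough because "." and ".." pass it.
validate_project_name() {
  case "$1" in
    ''|.|..) return 1 ;;                 # empty or pure-dot names
    *[!A-Za-z0-9._-]*) return 1 ;;       # includes "/" so "../foo" is rejected
    *) return 0 ;;
  esac
}
```

With this in place, `projects/$PROJECT_NAME` cannot resolve to the data repo root or to a path outside the projects/ tree.
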
The pass-5 Copilot suggestion to switch pr_limit from `type: string`
to `type: number` looked correct in isolation but broke the established
consumer pattern. Consumer YAMLs pass

  pr_limit: ${{ inputs.pr_limit || '20' }}

which evaluates to a STRING expression. workflow_call's `type: number`
validation rejects it at call time, so the reusable workflow never
starts (failure with zero jobs run). Caught by smsticket-ios's PR #86
run 25721524846 going red with no logs.

Revert to `type: string`. The snapshot script's history mode already
validates the value is a positive integer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
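
The runtime check that keeps `type: string` safe could be as small as the sketch below. `is_positive_integer` is a hypothetical name; the snapshot script's actual validation is not shown in this conversation.

```shell
# Succeeds only for a base-10 integer greater than zero with no leading
# zeros, signs, or whitespace.
is_positive_integer() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    0*) return 1 ;;            # zero, or leading zeros
    *) return 0 ;;
  esac
}
```

A string input such as `'20'` passes, while `''`, `'0'`, `'-5'` and `'abc'` are rejected before the history walk starts.
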
The pass-3 Copilot suggestion to emit one `--type LANG` per value broke
tokei at runtime:

    error: The argument '--type <types>' was provided more than once,
    but cannot be used multiple times

Tokei's CLI takes one `--type` flag whose value is a comma-separated
list of languages (`--type Swift,Objective-C`), not repeated flags.
The original code (which Copilot claimed didn't filter correctly) was
actually right.

Two reviewer-loop regressions in a row from over-trusting the bot's
suggestions verbatim — both this commit and 2fa7d15 (pr_limit type)
reverse review-driven changes that broke real consumers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
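
The working form, one `--type` flag carrying the whole comma-separated list, can be sketched as follows. `build_tokei_cmd` is a hypothetical helper that prints the command line instead of running it, so the sketch works without tokei installed.

```shell
# Build a tokei invocation from a comma-separated language list.
# The list is passed through as ONE --type value; an empty list means
# no language filter at all.
build_tokei_cmd() {
  types="$1"; dir="$2"
  if [ -n "$types" ]; then
    set -- tokei --type "$types" "$dir"
  else
    set -- tokei "$dir"
  fi
  printf '%s\n' "$*"
}
```

For example, `build_tokei_cmd 'Swift,Objective-C' .` prints `tokei --type Swift,Objective-C .`, which is the single-flag shape the commit message describes.
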
`sep=$'\x1f'` lived inside an outer `bash -lc '…'` single-quoted block.
ANSI-C quoting needs `$'...'` literally; the `'` after `$` was closing
the outer quote, so the inner bash actually saw `$\x1f` (interpreted
as a reference to an unbound `x1f` variable). Under `set -u`, every
backfill run died with:

    bash: line 23: x1f: unbound variable

Caught by slovak-telekom-young-kmp PR's backfill workflow (run
25724033946).

Build the separator with `sep=$(printf "\x1f")` instead — printf
interprets `\xNN` regardless of quote style, and the inner DOUBLE
quotes don't conflict with the outer single-quoted heredoc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
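
A minimal sketch of the quoting-safe separator and of splitting a record on it. The record layout (`<sha><sep><subject>`) and the helper name `split_record` are assumptions for illustration; the octal escape `\037` is used here because it names the same byte as `\x1f` and is also understood by POSIX `printf`.

```shell
# Build the ASCII unit separator (0x1f) without ANSI-C quoting, so the
# line contains no single quote and survives inside an outer
# single-quoted `bash -lc` block.
sep=$(printf "\037")

# Split one "<sha><sep><subject>" record. A pipe inside the subject is
# harmless because the split happens on the unit separator only.
split_record() {
  record_sha=${1%%"$sep"*}
  record_subject=${1#*"$sep"}
}
```

After `split_record "abc123${sep}Fix a|b parsing"`, `record_sha` holds `abc123` and `record_subject` holds the full subject including the pipe.
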
Consumers outside the futuredapp GHCR access boundary (personal accounts,
other orgs) can't pull the private fork image. Adding a backwards-compatible
input lets them point at upstream maibornwolff/dependacharta-analysis on
Docker Hub for non-Swift projects.

Inlined the image-ref construction at both call sites and removed the
workflow-level env that just aliased a single value; the input expression
itself is now the source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The retry loop does `git reset --hard origin/<branch>` and re-runs
apply_change. If the winning concurrent push only created one of
{previews, history} (e.g. a preview run wins, then a history run loses
and resets), the loser's local directory for the OTHER kind is gone and
the subsequent `cp` fails with "No such file or directory" — killing
the loop on attempt 2 due to set -euo pipefail.

The script's own comment claims apply_change is idempotent; hoisting
mkdir out of the function quietly broke that invariant. Moving it back
inside restores idempotency at no cost (mkdir -p is a no-op when the
directory exists).

Triggered in the wild during a PR-sync + push-to-develop race on a
consumer repo. After the fix, the second-attempt push succeeds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
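
The invariant being restored can be sketched as below. This `apply_change` is a simplified stand-in with hypothetical arguments, not the publish script's real function; the `rm -rf` in the usage note simulates what the hard reset does to a directory that only existed locally.

```shell
# Creating the target directory INSIDE apply_change makes the function
# safe to re-run after a reset has removed that directory.
apply_change() {
  src="$1"; dest_dir="$2"
  mkdir -p "$dest_dir"          # no-op when the directory already exists
  cp "$src" "$dest_dir/"
}
```

Running `apply_change`, removing the destination directory with `rm -rf`, and running `apply_change` again leaves the artifact in place, which is the behaviour the retry loop relies on for its second attempt.
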
3 participants