Decompress function runtime tarballs once when loading#127

Merged
negz merged 2 commits into
crossplane:mainfrom
negz:decompress-once-shame-on-you
Jun 20, 2026

@negz negz commented Jun 18, 2026

Member

Description of your changes

#24 added support for gzipped function runtime image tarballs by streaming each one through gzip.NewReader directly into go-containerregistry's tarball.Image.

go-containerregistry calls the tarball.Opener it's given once per layer, plus once each for the manifest and config. Because the gzip opener re-opened and re-decompressed the whole file from the start on every call, loading a single image decompressed the entire tarball once per opener call rather than once overall.

My project uses Nix's dockerTools, which emits one layer per store path, so a typical function image has ~50 layers and was fully gunzipped ~54 times. Computing the image digest then reads every layer again. With functions built concurrently, a project with a dozen multi-arch functions spent over ten minutes pegging every core in this loop.

This PR decompresses each gzipped tarball once into a temporary file and serves every opener call from that plain tar. The temporary files back the returned images lazily, so they must outlive Build; the builder now creates them under a per-build temporary directory and exposes a Close method that removes the directory once the caller has finished consuming the images. NewBuilder returns the concrete *realBuilder so callers can defer Close, and the build, run, and render entry points do so after they have written, sideloaded, or loaded the images.

On a project with twelve functions built for amd64 and arm64, loading all twenty-four images drops from over ten minutes to roughly eighty seconds.

This follows on from #21 and #24, which introduced pre-built and gzipped function runtime tarball support respectively.

Fixes #

I have:

@negz negz force-pushed the decompress-once-shame-on-you branch 6 times, most recently from dcb275a to 0ee8d33 Compare June 19, 2026 19:29
PR crossplane#24 added support for gzipped function runtime image tarballs by
streaming each one through gzip.NewReader directly into
go-containerregistry's tarball.Image, writing no temporary files.

go-containerregistry calls the tarball.Opener it's given once per layer,
plus once each for the manifest and config. Because the gzip opener
re-opened and re-decompressed the whole file from the start on every
call, loading a single image decompressed it once per layer. Nix's
dockerTools emits one layer per store path, so a typical function image
has ~50 layers and was fully gunzipped ~54 times. Computing the image
digest then re-reads every layer again. With functions built
concurrently, a project with a dozen multi-arch functions spent over ten
minutes pegging every core in this loop.

This change decompresses each gzipped tarball once into a temporary file
and serves every opener call from that plain tar, turning ~54 full
decompressions per image into one. The temporary files back the returned
images lazily, so they must outlive Build; the builder now creates them
under a per-build temporary directory and exposes a Close method that
removes the directory once the caller has finished consuming the images.
NewBuilder returns the concrete *realBuilder so callers can defer Close,
and the build, run, and render entry points do so after they have
written, sideloaded, or loaded the images.

On a project with twelve functions built for amd64 and arm64, loading
all twenty-four images drops from over ten minutes to roughly eighty
seconds.

Signed-off-by: Nic Cope <nicc@rk0n.org>
@negz negz force-pushed the decompress-once-shame-on-you branch from 0ee8d33 to 4e6953a Compare June 19, 2026 22:04
Comment thread cmd/crossplane/project/build.go
Comment thread internal/project/build.go
@negz negz marked this pull request as ready for review June 19, 2026 22:08
@negz negz requested review from a team, jcogilvie and tampakrap as code owners June 19, 2026 22:08
@negz negz requested review from bobh66 and removed request for a team June 19, 2026 22:08
coderabbitai Bot commented Jun 19, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3f5e374b-8675-4c59-91b4-8bcc046df63e

📥 Commits

Reviewing files that changed from the base of the PR and between 4e6953a and 872de41.

📒 Files selected for processing (1)
  • internal/project/build.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/project/build.go

📝 Walkthrough

The PR refactors internal/project/Builder from an interface-backed internal struct into a concrete exported struct, adds a BuildWithTempDir option, and implements gzip tarball decompression into a caller-supplied temp directory for lazy image reads. All four command entry points (project build, project run, render op, render xr) are updated to create, defer-cleanup, and pass a temp directory into the builder.

Changes

Temp-Dir Tarball Decompression

| Layer / File(s) | Summary |
| --- | --- |
| Builder struct refactor and BuildWithTempDir option: internal/project/build.go | Replaces the exported Builder interface with a concrete exported Builder struct; retargets all BuilderOption functions to mutate *Builder; adds BuildWithTempDir to store the decompression directory; updates NewBuilder to return *Builder. |
| Gzip tarball decompression into temp directory: internal/project/build.go | Threads b.tempDir through runtimeImages, loadTarballRuntime, and loadRuntimeImage; standardizes opener signature to accept tempDir; refactors fsOpener and gzipOpener so gzipped tarballs are decompressed once into a temp file via writeTempTarball and read lazily; updates method receivers to *Builder. |
| Command entry points, temp dir creation and wiring: cmd/crossplane/project/build.go, cmd/crossplane/project/run.go, cmd/crossplane/render/op/cmd.go, cmd/crossplane/render/xr/cmd.go | Each command now calls os.MkdirTemp, defers os.RemoveAll, and passes the directory into project.NewBuilder via project.BuildWithTempDir. |
| Tests for gzip loading and builder refactor: internal/project/build_test.go | Updates test imports to support gzip and filesystem operations; refactors existing builder tests to use an explicit b := NewBuilder(...) variable; adds TestLoadRuntimeImageGzip to verify plain vs gzip tarball digest equality and decompression into tempDir; refactors runtimeImageForArch to random multi-layer images and updates writeRuntimeTar helpers. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • crossplane/cli#10: The project builder (internal/project/build.go) that this PR refactors and extends with BuildWithTempDir was originally introduced in this PR.

Suggested reviewers

  • tampakrap
  • jcogilvie
  • phisco
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately describes the main change: decompressing gzipped function runtime tarballs once instead of repeatedly.
Description check ✅ Passed The description is well-detailed and directly related to the changeset, explaining the problem, solution, and performance improvements achieved.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Breaking Changes ✅ Passed All modifications to cmd/** files are internal implementation details (temporary directory management) that don't alter public fields/flags. The apis/** directory was not modified. No breaking chan...
Feature Gate Requirement ✅ Passed PR does not introduce experimental features affecting apis/** or user-facing behavior changes requiring feature flags; it's an internal performance optimization for gzipped tarball loading.

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
internal/project/build_test.go (1)

605-670: ⚡ Quick win

Please keep the new gzip runtime test table-driven for consistency.

Could we reshape TestLoadRuntimeImageGzip into an args/want table (even with one initial case) and add a brief reason field, to match the repository’s test conventions and keep future case expansion straightforward? Thanks for the solid coverage here.

As per coding guidelines, **/*_test.go: “Enforce table-driven test structure: ... args/want pattern ... proper test case naming and reason fields.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/project/build_test.go` around lines 605 - 670, Refactor
TestLoadRuntimeImageGzip into a table-driven test structure to match repository
conventions. Create a slice of test case structs with fields for args
(containing the tarball type and architecture), want (the expected outcome), and
reason (explaining what the test validates). Then wrap the existing test logic
in a loop that iterates through the test cases, extracting the appropriate
values from each case's args and want fields. Even though there is currently
only one test case, this table-driven structure will provide consistency with
the codebase's testing patterns and make it straightforward to add additional
test cases in the future.

Source: Coding guidelines

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0ae9627d-b006-4b00-a65e-065b5935c53b

📥 Commits

Reviewing files that changed from the base of the PR and between d9c3b53 and 4e6953a.

📒 Files selected for processing (6)
  • cmd/crossplane/project/build.go
  • cmd/crossplane/project/run.go
  • cmd/crossplane/render/op/cmd.go
  • cmd/crossplane/render/xr/cmd.go
  • internal/project/build.go
  • internal/project/build_test.go

@adamwg adamwg left a comment

Member

One comment inline, but this lgtm overall.

Comment thread internal/project/build.go Outdated
Co-authored-by: Adam Wolfe Gordon <awg+github@xvx.ca>
Signed-off-by: Nic Cope <nicc@rk0n.org>
@negz negz merged commit 25f5681 into crossplane:main Jun 20, 2026
10 checks passed
@negz negz deleted the decompress-once-shame-on-you branch June 20, 2026 07:12