feat: add deterministic tarball extract and repack utility #223

Open
Tonisal-byte wants to merge 7 commits into microsoft:main from Tonisal-byte:asalinas/tarball-utils

Conversation

@Tonisal-byte
Contributor

Adds a tarball utility package for deterministic extraction and repacking of source archives.

This is part 1 of 2 in the tarball overlay feature stack.

Add internal/utils/tarball package providing:
- DetectCompression: detect archive compression from filename
- Extract: decompress and extract tar archives (gzip, bzip2, xz, zstd)
- RepackDeterministic: create byte-reproducible archives with pinned
  timestamps, zeroed owner/group, GNU format, and sorted entries
- ResolveExtractRoot: find single top-level directory in extracted tree

Designed for reproducible builds, matching the tar --sort=name --mtime=@0
--owner=0 --group=0 --format=gnu convention used by source modification
scripts in the Azure Linux project.
Copilot AI review requested due to automatic review settings June 3, 2026 17:22
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new internal/utils/tarball utility to extract and deterministically repack tar archives for reproducible builds.

Changes:

  • Introduces deterministic repacking (stable ordering, fixed timestamps, zeroed ownership metadata) and multi-format compression handling.
  • Adds extraction utilities with basic path traversal rejection and extract-root resolution.
  • Adds Go tests covering compression detection, extract-root resolution, and a gzip extract/repack roundtrip.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| internal/utils/tarball/tarball.go | Implements tar extraction + deterministic repacking with compression helpers. |
| internal/utils/tarball/tarball_test.go | Adds unit tests for compression detection, extract-root logic, and gzip roundtrip determinism. |

Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball_test.go Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball.go Fixed
Comment thread internal/utils/tarball/tarball.go Fixed
Comment thread internal/utils/tarball/tarball.go Fixed
@reubeno
Member

reubeno commented Jun 3, 2026

@Tonisal-byte -- I left some feedback on the instance of this in the other PR. My request would be not to duplicate the changes between the two, or leave the later ones in draft.

I know GitHub stacked PRs will help with some of that, but for now it's hard to know which PR should receive feedback on which parts of the code.

@Tonisal-byte Tonisal-byte requested a review from Copilot June 3, 2026 22:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread go.mod Outdated
Comment thread internal/utils/tarball/tarball.go Outdated
@Tonisal-byte Tonisal-byte requested a review from Copilot June 3, 2026 23:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/archive/archive.go
@Tonisal-byte Tonisal-byte requested a review from Copilot June 3, 2026 23:56
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/archive/archive_test.go
@Tonisal-byte Tonisal-byte requested a review from Copilot June 4, 2026 16:46
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread internal/utils/tarball/tarball.go Outdated
Comment thread internal/utils/archive/archive_internal_test.go
@Tonisal-byte Tonisal-byte force-pushed the asalinas/tarball-utils branch from a5e505a to d5d7c0f Compare June 4, 2026 17:18
Copilot AI review requested due to automatic review settings June 4, 2026 21:19
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/archive/archive.go
Comment thread internal/utils/archive/archive_test.go Outdated
Comment thread internal/utils/archive/archive_test.go
Comment thread internal/utils/archive/archive.go
@Tonisal-byte Tonisal-byte requested a review from Copilot June 4, 2026 21:39
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread internal/utils/archive/archive_test.go
content = string(body)
}

entriesByName[header.Name] = entryInfo{header: header, content: content}
Contributor Author

Same as above, going to ignore

Comment thread internal/utils/archive/archive_test.go
4 participants