index: move chunk index to index/ namespace, drop hash seed #9778
Merged
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9778      +/-   ##
==========================================
- Coverage   84.87%   84.87%   -0.01%
==========================================
  Files          92       92
  Lines       15165    15162       -3
  Branches     2271     2270       -1
==========================================
- Hits        12872    12868       -4
- Misses       1589     1590       +1
  Partials      704      704
ThomasWaldmann
requested changes
Jun 15, 2026
Name objects by pure sha256 so borgstore can verify them. closes borgbackup#9758
fce680e to eb30123
Description
The repo-side chunk index used to live under `cache/chunks.<hash>`, with the name computed as `sha256(content + CHUNKINDEX_HASH_SEED)`. The seed was a manual invalidation knob: bump it and every old index file stops matching its own name, so clients drop it.

Two problems with that. First, the index is not really a cache. Rebuilding it means rescanning the whole repository, which is slow and expensive, so it is closer to permanent repository data than to a throwaway cache. Second, the seed meant the name was not the plain `sha256` of the content, so borgstore could not verify it by hash the way it verifies every other object.

So this gives the chunk index its own `index/` namespace and names each object by the plain `sha256` of its content:

- `cache/chunks.<hash>` becomes `index/<hash>`, and the whole name is the content hash.
- […] `MAGIC` and `VERSION` header and refuses to load a version it does not understand, so that header does the invalidation job the seed used to do.
- `index/` is registered as a real namespace, flat nesting like `cache/`, with write and delete permissions in the `no-delete` and `write-only` modes. A backup writes an incremental index fragment, and compact merges and removes old fragments, so both need write and delete to work.

The `write/read/delete_chunkindex_*_repo_cache` helpers are renamed to `*_repo` to match.

Two things I left out on purpose:

- […] `index/`, rebuilds once from the repo, and leaves the old `cache/chunks.*` files sitting there as harmless orphans. Fine for beta with fresh test repos.

Changes:

- `cache.py`: drop the seed, hash the content directly, store and read under `index/<hash>`, list the `index/` namespace, rename the helpers.
- `repository.py`: add the `index/` namespace config and its permissions, update the call sites.
- `archive.py`, `compact_cmd.py`: rename call sites and fix comments.
- […] `index/` and rename the helper calls.

closes #9758
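
For readers who want the naming and invalidation idea in one place, here is a minimal sketch. It is not the code from this PR: the `MAGIC` and `VERSION` values, the one-byte version layout, and the function names `pack_index`, `object_name`, `verify` and `unpack_index` are all hypothetical and chosen only for illustration. It shows that a name derived from the plain `sha256` of the content can be checked by anyone holding the object, and that a header check can reject an unknown format version without a seed.

```python
import hashlib

# Hypothetical header values, for illustration only; the real constants
# are defined in borg's source and may be laid out differently.
MAGIC = b"EXAMPLEIDX"
VERSION = 1
HEADER_LEN = len(MAGIC) + 1  # magic bytes plus a one-byte version


def pack_index(payload: bytes) -> bytes:
    """Prefix the payload with the magic bytes and a one-byte version."""
    return MAGIC + bytes([VERSION]) + payload


def object_name(content: bytes) -> str:
    """Name an object by the plain sha256 of its content, under index/."""
    return "index/" + hashlib.sha256(content).hexdigest()


def verify(name: str, content: bytes) -> bool:
    """Check that a name matches its content; no seed is needed."""
    return name == object_name(content)


def unpack_index(content: bytes) -> bytes:
    """Return the payload, refusing a wrong magic or an unknown version."""
    if len(content) < HEADER_LEN or content[: len(MAGIC)] != MAGIC:
        raise ValueError("not a chunk index")
    version = content[len(MAGIC)]
    if version != VERSION:
        raise ValueError(f"unsupported chunk index version {version}")
    return content[HEADER_LEN:]
```

With a seeded name, `verify` would need to know the seed; with the plain hash, the store only needs the bytes it already has. Invalidation then moves into `unpack_index`: bumping the version makes older files fail to load instead of failing to match their names.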
Checklist
- […] `master` (or maintenance branch if only applicable there)
- […] `tox` or the relevant test subset)