Initial LTFS order awareness functionality #1001
Open
hugo-hur wants to merge 8 commits into
Introduce the machinery for tape-aware transfers without yet changing the read order. On an LTFS volume each file's metadata -- including the block where its data begins -- lives in the volume index and is cheap to read, while reading content requires physically positioning the tape. Add a new ltfs.c module that reads a file's starting block from the ltfs.startblock virtual xattr (also honoring a user.ltfs.startblock alias so the feature can be exercised on an ordinary filesystem). The sender records that block into a new optional 64-bit file-list extra and transmits it, so the generator -- which on a local copy is chdir'd into the destination and cannot read the source xattrs itself -- receives each file's physical position. Wire up the --ltfs option: it allocates the extra, implies --whole-file (a delta transfer would re-read the source off tape anyway), forces --no-inc-recursive (the full list is needed before a read order can be chosen), and refuses --checksum (which would read the whole tape just to decide what to transfer). The option is propagated to the server side so both ends agree on the file-list extra layout. Reading the start block requires xattr support, so the whole feature is gated on SUPPORT_XATTRS: the file-list extra is only registered when xattrs are available, and a build without them refuses --ltfs (like --crtimes and friends) rather than silently accepting an inert option.
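The option wiring described above can be illustrated with a small sketch. This is not the patch's C code: `Options` and `apply_ltfs_rules` are hypothetical names standing in for rsync's option globals, and the error messages are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Options:
    """Hypothetical stand-in for rsync's option globals."""
    ltfs: bool = False
    whole_file: bool = False
    inc_recursive: bool = True
    checksum: bool = False


def apply_ltfs_rules(opts: Options, xattrs_supported: bool = True) -> Options:
    """Apply the --ltfs implications listed in the commit message."""
    if not opts.ltfs:
        return opts  # nothing to do when --ltfs is off
    if not xattrs_supported:
        # A build without xattr support refuses the option outright.
        raise ValueError("--ltfs requires xattr support (hypothetical message)")
    if opts.checksum:
        # --checksum would read the whole tape just to decide what to send.
        raise ValueError("--ltfs refuses --checksum (hypothetical message)")
    opts.whole_file = True      # a delta transfer would re-read the source anyway
    opts.inc_recursive = False  # the full list is needed before ordering reads
    return opts
```

Raising on the conflicting combinations, rather than quietly dropping one option, follows the commit's stated preference for refusing over silently accepting an inert option.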
With the start block of every file now available on the generator side,
order the data-read phase by ascending block instead of by name when --ltfs
is in effect. The drive then makes a single forward streaming pass rather
than seeking back and forth ("shoe-shining"), which on a real tape can cut a
restore from hours to one pass.
Entries with no start block (directories, symlinks, anything not on tape)
sort first in their original order, which conveniently front-loads creation
of the destination directory tree before the bulk data read begins. A NULL
ordering (ltfs off, no metadata negotiated, or an empty range) falls back to
the natural low..high sweep.
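A minimal sketch of the ordering rule described above, written in Python rather than the patch's C. The function name `read_order` and its arguments are assumptions for illustration; `None` stands in both for "no start block" on an entry and for a NULL ordering overall.

```python
from typing import List, Optional, Sequence


def read_order(n_files: int,
               start_blocks: Optional[Sequence[Optional[int]]]) -> List[int]:
    """Return file-list indices in the order their data should be read."""
    if not start_blocks:
        # NULL ordering (ltfs off, no metadata, empty range): natural sweep.
        return list(range(n_files))
    # Python's sort is stable, so entries without a block keep their
    # original relative order and sort ahead of everything on tape.
    return sorted(
        range(n_files),
        key=lambda i: (start_blocks[i] is not None, start_blocks[i] or 0),
    )


# Blocks run opposite to name order; index 2 is a directory with no block.
names = ["a.bin", "b.bin", "dir", "c.bin"]
blocks = [300, 200, None, 100]
order = read_order(len(names), blocks)  # [2, 3, 1, 0]
```

Sorting indices rather than the entries themselves leaves the name-sorted file list untouched, so only the read phase changes order.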
Add the option summary line and a full description covering what LTFS ordering does, the options it implies (--whole-file, --no-inc-recursive) and refuses (--checksum), that the fast index metadata still drives the normal size+mtime quick check, and that only the read (restore) direction is optimized.
Exercise --ltfs on an ordinary filesystem via the user.ltfs.startblock alias, assigning blocks that run opposite to name order. The test verifies round-trip integrity, that the itemized output (rsync's observable processing order) comes out in physical block order across subdirectories with directories handled first, and that --ltfs --checksum is refused. Skips cleanly when the build lacks xattr support or the scratch filesystem rejects a user.* xattr.
--ltfs's value is that an LTFS source serves size and mtime from the tape index for free, letting rsync's quick check skip unchanged files without reading their content off the tape. That only works if the destination keeps the source mtimes: without -t, every run sees a time mismatch and re-reads the whole tape, defeating the purpose. Process --ltfs in the option loop (via OPT_LTFS) and set preserve_mtimes there, the same way --archive implies -t, so a later --no-times can still override it in option order. When it is overridden, warn that unchanged files can no longer be skipped, rather than silently doing the slow thing. Found while testing against a real LTO-5 LTFS volume: a bare -r --ltfs left current mtimes on the destination, so the next run wanted to re-read every file.
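The option-order behavior described above can be sketched as follows. This is an illustration in Python, not the patch's popt loop: `parse_time_options` is a hypothetical helper, only three options are modeled, and the warning text is invented.

```python
from typing import List, Sequence, Tuple


def parse_time_options(argv: Sequence[str]) -> Tuple[bool, bool, List[str]]:
    """Process options left to right; return (ltfs, preserve_mtimes, warnings)."""
    ltfs = False
    preserve_mtimes = False
    for arg in argv:
        if arg == "--ltfs":
            ltfs = True
            preserve_mtimes = True   # implied, the same way --archive implies -t
        elif arg in ("-t", "--times"):
            preserve_mtimes = True
        elif arg == "--no-times":
            preserve_mtimes = False  # a later --no-times wins
    warnings: List[str] = []
    if ltfs and not preserve_mtimes:
        # Hypothetical wording; the point is to warn instead of silently
        # re-reading the whole tape on every run.
        warnings.append("--ltfs without mtimes: unchanged files cannot be skipped")
    return ltfs, preserve_mtimes, warnings
```

Because the implication is applied inside the loop, `--ltfs --no-times` ends with mtimes off and a warning, while `--no-times --ltfs` ends with mtimes on.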
Note in the manpage that --ltfs enables mtime preservation (like --archive) so the index quick-check can skip unchanged files across runs, and that a later --no-times overrides it with a warning.
Verify that --ltfs preserves source mtimes without an explicit -t (so an immediate re-run finds nothing to transfer) and that an explicit --no-times still runs but emits a warning.
varesa
reviewed
Jun 12, 2026
implies [`--whole-file`](#opt) (a delta transfer would re-read the source
file anyway) and forces [`--no-inc-recursive`](#opt) (the complete file
list is needed before the read order can be chosen). It also refuses
[`--checksum`](#opt), which would read every byte of every file off the
Something I commented offline but I'll mention here as well:
I would not refuse --checksum, since there are use cases where one would like to ensure that the contents of e.g. an offsite mirror match what is on tape. The tape drive might be as much as two orders of magnitude faster than the internet link to an offsite archive, so it makes sense to
a) ensure matching checksums
b) minimize the amount of data transferred over the internet link
An extra linear scan on a fast drive is a much smaller issue than shoe-shining.
Motivation
I'm running several LTFS-formatted LTO tapes for incremental backups. Backing up to tape has been working fine with rsync (using the --whole-file and -t options), but reading from the tape is basically impossible for the reasons stated below.
Summary
Adds a --ltfs option that makes rsync read an LTFS-tape source in physical
block order instead of name order, so a restore streams the tape forward in a
single pass instead of seeking back and forth ("shoe-shining").
On an LTFS volume, each file's metadata — name, size, mtime, and the block where
its data begins — lives in the volume index and is cheap to read. File content,
however, requires physically positioning the tape. rsync's normal name-sorted
traversal bears no relation to physical layout, so an unordered restore can take
many times longer than one forward pass.
What it does
- The sender reads each file's start block from the ltfs.startblock virtual xattr (also honoring a user.ltfs.startblock alias) and transmits it as a new optional 64-bit file-list extra.
- The generator orders the data-read phase by ascending start block. Entries with no start block (directories, symlinks) sort first, which front-loads creation of the destination directory tree before the bulk data read.
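The start-block lookup can be sketched in Python (the patch itself implements it in C in ltfs.c). The helper name `read_start_block`, the injectable `getxattr` parameter, and the assumption that the attribute value is an ASCII decimal number are all illustrative and not taken from the patch.

```python
import os
from typing import Callable, Optional

# Real attribute first, then the alias usable on an ordinary filesystem.
XATTR_NAMES = ("ltfs.startblock", "user.ltfs.startblock")


def read_start_block(
    path: str,
    getxattr: Optional[Callable[[str, str], bytes]] = None,
) -> Optional[int]:
    """Return the file's starting block, or None when it is unavailable."""
    if getxattr is None:
        # os.getxattr only exists on Linux; treat its absence as "no block".
        getxattr = getattr(os, "getxattr", None)
        if getxattr is None:
            return None
    for name in XATTR_NAMES:
        try:
            raw = getxattr(path, name)
        except OSError:
            continue  # attribute missing or unsupported; try the alias
        try:
            # Assumption: the value is a decimal string, possibly padded.
            return int(raw.decode("ascii").strip("\0 \t\r\n"))
        except (UnicodeDecodeError, ValueError):
            continue
    return None
```

Returning `None` instead of raising matches the idea that entries without a start block are still transferred and simply sort ahead of the tape-resident files.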
Behavior & safeguards
--ltfs is opinionated about the things that would silently defeat it:
- Implies --whole-file: a delta transfer would re-read the source off tape anyway.
- Forces --no-inc-recursive: the complete file list is needed before a read order can be chosen.
- Refuses --checksum: it would read every byte of every file off the tape just to decide what to transfer.
- Implies --times (just like --archive): the whole benefit is that the index serves size+mtime for free so unchanged files are skipped without reading content; that only holds across runs if mtimes are preserved. A later --no-times overrides it but emits a warning.
Compatibility / requirements
- Gated on SUPPORT_XATTRS: the file-list extra is only registered when xattrs are available, and a build without them refuses --ltfs (like --crtimes) rather than silently no-op'ing.
- orders the reads); --ltfs is forwarded to the server side.
Testing
- testsuite/ltfs_test.py exercises ordering on an ordinary filesystem via the user.ltfs.startblock alias: round-trip integrity, physical-block read order across subdirectories (directories handled first), the implied -t/--no-times warning, and that --checksum is refused. It skips cleanly when the build lacks xattr support or the platform/filesystem can't set a user.* xattr (e.g. Windows, where os.setxattr is absent).
- --disable-xattr-support.
- --ltfs read three files whose start blocks ran opposite to name order in correct forward order (verified via strace on the drive), all MD5s matched the tape, and a second -t pass skipped every file from index metadata alone: zero content reads.
Not in scope (possible follow-ups)