
add a new UAPI.16 File Manifest spec (WIP)#213

Open
poettering wants to merge 1 commit into uapi-group:main from poettering:uapi16-file-manifest

Conversation

@poettering
Collaborator

No description provided.

@poettering poettering added do-not-merge The pull request must not be merged new-spec labels Apr 22, 2026
@poettering
Collaborator Author

(mostly posted here to start the discussion)

@poettering poettering force-pushed the uapi16-file-manifest branch 4 times, most recently from 54f9001 to f4a161c Compare April 22, 2026 11:33
@bluca
Member

bluca commented Apr 22, 2026

As mentioned elsewhere, it would be great if this could be embedded in existing json manifests, to avoid having to ship multiple ones, and consumers knew how to find it - essentially the existing mkosi manifest. If I understand correctly, the only thing needed for this to work is to establish an optional and well-known "key" under which this object can be found under a parent json object?

Comment thread specs/file-manifest.md Outdated
Comment thread specs/file_manifest.md
@Foxboron
Member

Is the intention of this spec to solve this issue, or is this trying to solve a different problem?

#207

Comment thread specs/file-manifest.md Outdated
@daandemeyer
Member

daandemeyer commented Apr 22, 2026

@Foxboron Totally different problem, it's designed to replace SHA256SUMS for sysupdate to list remote resources

@Foxboron
Member

Foxboron commented Apr 22, 2026

@daandemeyer Hmm, should the title be Sysupdate File Manifest spec to nail the usage a bit more down? Else we might end up with multiple "file manifest" specs?

Comment thread specs/file-manifest.md Outdated
Member

@keszybz keszybz left a comment

I think it'd be nice to use semantic line breaks here. We agreed in general to do this in new documents.

Comment thread specs/file-manifest.md Outdated
@poettering
Collaborator Author

@daandemeyer Hmm, should the title be Sysupdate File Manifest spec to nail the usage a bit more down? Else we might end up with multiple "file manifest" specs?

I'd probably call the other spec an "Inode Spec", since that's more what it is. Or "File System Object" or so.

This one here is purely about files.

You are absolutely right, this might be confusing, but it will either way I am sure.

I am against naming this after the software that likely implements this, it's supposed to be a generic spec, independent of any specific implementation.

@poettering
Collaborator Author

Hmm, we could also consider extending this spec to just cover what is requested in #207 too. I mean, if I grok this right it would just mean adding some more fields to encode UNIX inode properties in full: i.e. inodeType, mode, uid, gid, user, group, major, minor, symlinkTarget and so on. When using this for the download scenario we'd ignore all these fields I guess, but I see nothing speaking against supporting this too.

@poettering poettering force-pushed the uapi16-file-manifest branch from f4a161c to 914c9f6 Compare April 22, 2026 14:54
@poettering
Collaborator Author

Posted a new version covering all comments, but not trying to address #207 (would prefer if we did that in a later follow-up PR)

@bluca
Member

bluca commented Apr 22, 2026

As mentioned elsewhere, it would be great if this could be embedded in existing json manifests, to avoid having to ship multiple ones, and consumers knew how to find it - essentially the existing mkosi manifest. If I understand correctly, the only thing needed for this to work is to establish an optional and well-known "key" under which this object can be found under a parent json object?

^^^ ?

@poettering
Collaborator Author

As mentioned elsewhere, it would be great if this could be embedded in existing json manifests, to avoid having to ship multiple ones, and consumers knew how to find it - essentially the existing mkosi manifest. If I understand correctly, the only thing needed for this to work is to establish an optional and well-known "key" under which this object can be found under a parent json object?

i don't grok this request?

it seems to me that the manifest format here could easily be embedded by mkosi's package manifests if it wants file-level information. But that's something to decide and define in mkosi's format, it's not something we could dictate here?

@bluca
Member

bluca commented Apr 22, 2026

As mentioned elsewhere, it would be great if this could be embedded in existing json manifests, to avoid having to ship multiple ones, and consumers knew how to find it - essentially the existing mkosi manifest. If I understand correctly, the only thing needed for this to work is to establish an optional and well-known "key" under which this object can be found under a parent json object?

i don't grok this request?

it seems to me that the manifest format here could easily be embedded by mkosi's package manifests if it wants file-level information. But that's something to decide and define in mkosi's format, it's not something we could dictate here?

Having a "suggested" key for the object would allow consumers to know what to search for, without having to come up with one for each case. Just like there's a "suggested" filename for the file in the spec?

@pothos
Member

pothos commented Apr 22, 2026

I wouldn't merge this into the mkosi manifest but treat it like the SHA256SUMS file mkosi can generate with --checksum= because this update manifest is supposed to have a specific file name that sysupdate can look for while the mkosi manifest includes the version in its name and thus can't be used as is. So --uapi-file-manifest= would write it and then --sign= would do the gpg signing for this file like it's currently done for SHA256SUMS.

@bluca
Member

bluca commented Apr 22, 2026

Nah it has to be in the same file, at least as an option, as I most definitely do not want to have to deal with having to publish yet another file

Comment thread specs/file-manifest.md Outdated
@AdrianVovk

Nah it has to be in the same file, at least as an option, as I most definitely do not want to have to deal with having to publish yet another file

Sorry I'm not grokking how you can avoid this. mkosi spits out a manifest listing the specific packages it included in a specific image that it built. This spec is about listing all the files in the directory, and in the sysupdate use case that means across version boundaries. So if anything, the mkosi manifest would have to be embedded in here (describing mkosi's output files) rather than the other way around. But that also doesn't make sense because even if mkosi spits out multiple files you'll get just one manifest, right?

Anyway, this isn't "yet another file" to publish since it'll replace SHA256SUMS

Comment thread specs/file-manifest.md Outdated
Comment thread specs/file_manifest.md
Comment thread specs/file-manifest.md Outdated
`sha256` is the SHA256 hash of the specified slice, formatted in 64 hexadecimal characters. Parsers should
parse this case-insensitively.

The `gpt*` fields encode fields that we need when placing these resources in a GPT partition table entry. The
Member

I am not sure that the gpt* fields belong to the file object. The gpt data of one object can contradict another one and there are not clear rules about how to resolve them.

Maybe it is part of the manifest root? Maybe part of a different UAPI 16 Container Manifest spec?

Collaborator Author

@poettering poettering Apr 23, 2026

hmm? not sure i follow? how would they contradict each other? these are gpt per-partition fields?

@bluca
Member

bluca commented Apr 23, 2026

Nah it has to be in the same file, at least as an option, as I most definitely do not want to have to deal with having to publish yet another file

Sorry I'm not grokking how you can avoid this. mkosi spits out a manifest listing the specific packages it included in a specific image that it built. This spec is about listing all the files in the directory, and in the sysupdate use case that means across version boundaries. So if anything, the mkosi manifest would have to be embedded in here (describing mkosi's output files) rather than the other way around. But that also doesn't make sense because even if mkosi spits out multiple files you'll get just one manifest, right?

The "top-level" build is mkosi, and it's a single mkosi build that produces these artifacts. Just like it right now includes extensions DDI metadata in the mkosi manifest, it can also include this metadata.
The other way around would break existing consumers, which would suddenly not find the objects they are looking for anymore. New consumers looking for the new manifest would have no problem though, as they are new.

Anyway, this isn't "yet another file" to publish since it'll replace SHA256SUMS

Well it is, as it's new

@poettering
Collaborator Author

Having a "suggested" key for the object would allow consumers to know what to search for, without having to come up with one for each case. Just like there's a "suggested" filename for the file in the spec?

I am not convinced that is necessary. For example, let's say we do something to address #207. The way I see this happen would be that for directories they'd contain an additional contents field in the file object that itself is again a manifest object. i.e. you'd nest them nicely. I think it would be really weird to make that field named "Uapi16ManifestFile" just because. It's the contents of the dir, and hence it should be named that way.

@poettering
Collaborator Author

so i wonder if we should maybe fold the stepping stone and revoked thing into a single "updatePolicy" field or so, given that a revoked item should never be a stepping stone. And given this state is relevant only when using these manifests for sysupdate-style updaters I think it makes sense to make that clear in the name. Hence:

updatePolicy would be an enum with values good, stepping-stone, revoked:

  • good → the default for entries that are listed but do not have the field updatePolicy set. An updater should consider this entry as a great, regular update target.

  • stepping-stone → almost like good, but an updater should never skip over such an entry.

  • revoked → this is a tainted version, please upgrade asap, and downgrade if there's no upgrade

  • and then we could define one additional value vanished → this is a pseudo-state for items that were downloaded before but no longer appear in the manifest. In most contexts it should be treated just like revoked.

With this we'd have a single field encoding the whole policy, and it would be self-contained inside the file object (which is a property I like very much)
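
To make the proposed semantics concrete, here is a minimal Python sketch of how an updater might act on `updatePolicy`. Everything beyond the four policy values is an assumption made for illustration: the `Entry` type, the integer `version` field (real versions would need a proper version comparison), and the `pick_target` helper are hypothetical and appear neither in the spec nor in any implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UpdatePolicy(Enum):
    GOOD = "good"
    STEPPING_STONE = "stepping-stone"
    REVOKED = "revoked"
    VANISHED = "vanished"  # pseudo-state, never written into a manifest


@dataclass
class Entry:
    version: int  # hypothetical stand-in for a real version string
    policy: UpdatePolicy = UpdatePolicy.GOOD  # default when the field is absent


def policy_of(entries: list[Entry], version: int) -> UpdatePolicy:
    """Look up the policy of a version; unlisted versions count as vanished."""
    for entry in entries:
        if entry.version == version:
            return entry.policy
    return UpdatePolicy.VANISHED


def pick_target(entries: list[Entry], installed: int) -> Optional[int]:
    """Return the version to switch to, or None to stay on the installed one."""
    usable = sorted(
        (e for e in entries if e.policy is not UpdatePolicy.REVOKED),
        key=lambda e: e.version,
    )
    newer = [e for e in usable if e.version > installed]
    if newer:
        # Never skip over a stepping stone: stop at the first one on the way up.
        for entry in newer:
            if entry.policy is UpdatePolicy.STEPPING_STONE:
                return entry.version
        return newer[-1].version
    # No upgrade available: downgrade only if the installed version is tainted.
    tainted = (UpdatePolicy.REVOKED, UpdatePolicy.VANISHED)
    if policy_of(entries, installed) in tainted:
        older = [e for e in usable if e.version < installed]
        return older[-1].version if older else None
    return None
```

In this reading, `vanished` is handled exactly like `revoked`, which matches the "in most contexts" wording above but may be too coarse for some updaters.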

Comment thread specs/file-manifest.md Outdated
@poettering poettering force-pushed the uapi16-file-manifest branch from 914c9f6 to c4f8c09 Compare May 7, 2026 15:35
@poettering
Collaborator Author

I pushed a new version, taking a lot of input into account.

I made three fundamental changes:

  1. Instead of putting everything in a giant JSON array, I changed the thing to be a JSON-SEQ sequence instead. This takes @brauner's point regarding scalability into account: by making this JSON-SEQ the format can be read and processed in a stream fashion. If we use a giant json array this is typically not possible, most parsers would read the whole thing into memory. While that might be fine for certain use cases it certainly doesn't scale to huge sizes. I opted for JSON-SEQ, since it's RFC specified and in contrast to other JSON sequence formats is fine with pretty printed JSON, and gives us clear markers where to put cut lines for cryptographic signatures.

  2. the thing can now be used as a manifest for deeply nested file hierarchies, with full POSIX properties. Instead of handwavingly claiming that we can extend the format that way (as I did above) I now made the spec actually cover this fully, because it's not as trivial as I originally thought.

  3. The contents stuff now supports alternative sources as proposed by @cyphar.

I think the outcome is quite nice and comprehensive. Of course JSON-SEQ means it is not just a text stream anymore, but I think the benefits outweigh the negatives here.

Anyway, please have a look.

I intend to put together some code in systemd to see how this feels when actually implementing this.
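
For readers who have not worked with JSON-SEQ (RFC 7464) before, the streaming property from point 1 above can be sketched in a few lines of Python. In that format every JSON text is preceded by an ASCII Record Separator (0x1E) and followed by a line feed. The function name `iter_json_seq` and the chunk size are assumptions of this sketch; it is not the systemd code, and it does not implement the RFC's recovery rules for truncated records.

```python
import io
import json
from typing import BinaryIO, Iterator

RS = b"\x1e"  # ASCII Record Separator, starts every record in RFC 7464


def iter_json_seq(stream: BinaryIO, chunk_size: int = 65536) -> Iterator[object]:
    """Yield the JSON texts of a JSON-SEQ stream one at a time."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last RS consists of complete records;
        # the remainder may still be cut off and stays in the buffer.
        *complete, buf = buf.split(RS)
        for record in complete:
            if record.strip():
                yield json.loads(record)
    if buf.strip():
        yield json.loads(buf)
```

Only one record plus one chunk is held in memory at a time, and pretty-printed records work because the parser never relies on line boundaries. Splitting on 0x1E is safe since JSON requires control characters inside strings to be escaped.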

@poettering poettering force-pushed the uapi16-file-manifest branch 3 times, most recently from 0f94586 to 9bb5013 Compare May 8, 2026 07:03
poettering added a commit to poettering/systemd that referenced this pull request May 8, 2026
@poettering poettering force-pushed the uapi16-file-manifest branch from 9bb5013 to 8dd1df6 Compare May 8, 2026 08:16
@poettering
Collaborator Author

I now prepped a patch for systemd's systemd-dissect tool (which already has --mtree), to generate a manifest from a directory tree. It looks really nice, and the patch is quite small actually:

systemd/systemd#41990

Note that the output it generates is not really intended for systemd-sysupdate consumption. It includes uid/gid info after all (just to match the --mtree call), and for sysupdate that's unlikely what we want.

@poettering
Collaborator Author

btw, in case you wonder, jq can process json-seq with the --seq switch

poettering added a commit to poettering/systemd that referenced this pull request May 10, 2026
@poettering
Collaborator Author

btw, for illustrative purposes, this is what a real-life /usr/include/ looks like in the currently described format:

https://paste.centos.org/view/raw/446a6f8e

it has dirs, regular files and symlinks.

I find that really readable with the naked human eye.

poettering added a commit to poettering/systemd that referenced this pull request May 11, 2026
poettering added a commit to poettering/systemd that referenced this pull request May 11, 2026
poettering added a commit to poettering/systemd that referenced this pull request May 11, 2026
Comment thread specs/file_manifest.md

The `mode` (unsigned integer) field encodes the UNIX access mode of the file object. It applies to all
inode types, except `lnk`. Note that while UNIX access modes are typically written in octal, this one is
encoded in a regular JSON number, i.e. decimal. The valid range is 0…4095 (i.e. `0o0000` to `0o7777`).
Member

That's also how Ignition does it and with some tooling it's okay but when writing/reading this still is very strange to deal with. I wonder if a 0o string wouldn't be the more natural embedding here.

Member

Maybe modeDec to indicate decimal? 🤔
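
As a small illustration of the readability concern, this is how the decimal `mode` value maps to the familiar octal notation in Python. The helper names `mode_to_octal_string` and `octal_string_to_mode` are made up for this sketch and are not part of the spec.

```python
def mode_to_octal_string(mode: int) -> str:
    """Render a decimal `mode` value from the manifest in 0o notation."""
    if not 0 <= mode <= 0o7777:  # valid range per the spec text: 0…4095
        raise ValueError(f"mode out of range: {mode}")
    return f"0o{mode:04o}"


def octal_string_to_mode(text: str) -> int:
    """Parse octal notation such as '0o0755' or '755' into the decimal value."""
    mode = int(text, 8)  # base 8 accepts an optional 0o prefix
    if not 0 <= mode <= 0o7777:
        raise ValueError(f"mode out of range: {text}")
    return mode
```

For example, the common mode `0o0755` is stored as the JSON number `493`, which is the kind of value a human reader has to convert back mentally.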

Comment thread specs/file_manifest.md
6. Otherwise, if `url` is set, the data from the URL should be acquired, continue in step 8.
7. Otherwise, if `file` is set, the data should be acquired from the same URL as the manifest itself, however with the last component of the URL replaced by the file name encoded in `file`, following the same semantics as HTML relative links. Continue in step 8.
8. If both `encodedSize` and `encoding` are set the size of the acquired data shall be checked against `encodedSize`.
9. If `encodedSize` is not set, but `originalSize` is: the size of the acquired data shall be checked against `originalSize`.
Member

Suggested change
9. If `encodedSize` is not set, but `originalSize` is: the size of the acquired data shall be checked against `originalSize`.
9. If `encodedSize` is not set, but `originalSize` is, the size of the acquired data shall be checked against `originalSize`.

Might match the previous sentence structure better
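
As an aside on step 7 quoted above: "the same semantics as HTML relative links" corresponds to what Python's `urllib.parse.urljoin` does. The sketch below is only an illustration; the function name `resolve_file`, the example URL, and the decision to reject names containing a slash are assumptions, not something the quoted text mandates.

```python
from urllib.parse import urljoin


def resolve_file(manifest_url: str, file: str) -> str:
    """Replace the last component of the manifest URL with the `file` name."""
    # Assumption: `file` is a plain file name, so path separators are refused
    # here to keep the result in the same directory as the manifest.
    if "/" in file or file in ("", ".", ".."):
        raise ValueError(f"not a plain file name: {file!r}")
    return urljoin(manifest_url, file)
```

With a manifest at `https://example.com/images/Uapi16Manifest`, a `file` value of `foo_1.2.raw.gz` resolves to `https://example.com/images/foo_1.2.raw.gz`.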

Comment thread specs/file_manifest.md
Comment on lines +358 to +359
10. If `encoding` is set: the acquired data shall be decoded according to the algorithm indicated in `encoding`.
11. If `encoding` and `originalSize` are set: the resulting decoded data shall be checked against `originalSize`.
Member

Suggested change
10. If `encoding` is set: the acquired data shall be decoded according to the algorithm indicated in `encoding`.
11. If `encoding` and `originalSize` are set: the resulting decoded data shall be checked against `originalSize`.
10. If `encoding` is set, the acquired data shall be decoded according to the algorithm indicated in `encoding`.
11. If `encoding` and `originalSize` are set, the resulting decoded data shall be checked against `originalSize`.
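
Steps 8 to 11 as quoted in this thread can be sketched roughly as follows. This is a hedged illustration, not the reference behaviour: the function name `check_and_decode` is hypothetical, only `gzip` is handled because it is in Python's standard library, and step 9 is applied only when `encoding` is unset, which is one reading of the text rather than something it states explicitly.

```python
import gzip
from typing import Optional


def check_and_decode(
    data: bytes,
    encoding: Optional[str] = None,
    encoded_size: Optional[int] = None,
    original_size: Optional[int] = None,
) -> bytes:
    """Apply the size checks and decoding of steps 8-11 to acquired data."""
    # Step 8: encodedSize and encoding are both set.
    if encoding is not None and encoded_size is not None:
        if len(data) != encoded_size:
            raise ValueError("acquired data does not match encodedSize")
    # Step 9 (assumption: only meaningful for data that is not encoded).
    if encoding is None and encoded_size is None and original_size is not None:
        if len(data) != original_size:
            raise ValueError("acquired data does not match originalSize")
    # Step 10: decode according to `encoding`.
    if encoding is not None:
        if encoding != "gzip":
            raise NotImplementedError(f"encoding not covered here: {encoding}")
        data = gzip.decompress(data)
        # Step 11: check the decoded size.
        if original_size is not None and len(data) != original_size:
            raise ValueError("decoded data does not match originalSize")
    return data
```

A production implementation would decode in a streaming fashion and abort as soon as the output exceeds `originalSize`, rather than decompressing everything first.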

Member

@keszybz keszybz left a comment

I think that the primary problem with the current text is that it aims for reproducibility, but leaves a bunch of implementation choices undefined:

  • sorting of entries
  • sorting within an entry
  • null or omitted fields
  • whitespace formatting of json
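
To illustrate what pinning down these four choices could look like, here is one possible canonical form in Python. The concrete rules are assumptions picked for the example and are not proposed spec text: entries are sorted by path component so that directory contents directly follow their directory, keys are sorted within an entry, `None` fields are omitted at the top level, and no insignificant whitespace is emitted. The function names are hypothetical.

```python
import json


def canonical_record(obj: dict) -> str:
    """Serialize one file object with sorted keys and no optional whitespace."""
    cleaned = {k: v for k, v in obj.items() if v is not None}
    return json.dumps(
        cleaned, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


def canonical_manifest(entries: list[dict]) -> str:
    """Serialize entries as JSON-SEQ in a deterministic order."""
    # An entry without `name` (the top-level object) sorts first.
    ordered = sorted(entries, key=lambda e: e.get("name", "").split("/"))
    return "".join("\x1e" + canonical_record(e) + "\n" for e in ordered)
```

Sorting by the list of path components rather than by the raw string matters: a plain string sort would place `a.txt` between `a` and `a/b`, since `.` sorts before `/`.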

Comment thread specs/file_manifest.md
Comment on lines +19 to +20
This format stores information about static, immutable data resources that potentially can be acquired over
the network or stored on a disk. It's in particular supposed to be able to reasonably represent the
Member

For the first sentence, "static" and "immutable" are both bogus. We have no control over those files, so whether they are static and immutable is completely out of scope.

The mtree man page says "The mtree format is a textual format that describes a collection of filesystem objects". That seems much more to the point. Maybe "This document describes a textual file manifest format that describes a collection of blobs. Those blobs can potentially be acquired over the network or stored on a disk. In particular, it can represent the properties of files arranged in a POSIX file system tree as well as partitions in a GPT partition table."

Comment thread specs/file_manifest.md
2. It allows extracting data "slices" from data sources, via offset and range.
3. Various fields of additional per-file metadata may be defined, including various fields for GPT partition table metadata.
4. It's an extensible format permitting vendors and projects to add their own per-file and per-manifest fields.
5. Inline cryptographic signing.
Member

1–4 are sentences with a subject, 5 should have one too.

Comment thread specs/file_manifest.md
Comment on lines +32 to +34
This format is designed with tools such as `mkosi` and `systemd-sysupdate` in mind, but it's intended to be
generally useful as a way to describe collections of files that may potentially be acquired over the network
or from local storage, and verified cryptographically.
Member

This repeats what is in the intro.

Comment thread specs/file_manifest.md
Comment on lines +50 to +52
`mediaType` field set (which identifies the sequence as a UAPI.16 manifest). File objects for directory
entries `Uapi16Manifest` (or any filename with a prefix of `Uapi16Manifest.`), `.`, `..` or any path ending
in `/.`, `/..` are not permitted.
Member

This second part should be a separate paragraph, since it describes a completely separate topic from the first part.

Comment thread specs/file_manifest.md
Comment on lines +54 to +55
When applied to a GPT partition table the first object may also describe fields in the overall GPT
partition table header of the disk.
Member

Hmm, so the preceding para says that the first object must describe "a … directory". But then this sentence here contradicts that, and says that it may describe a GPT header.

Member

I think this should be reworded to say that the first object identifies the file format as a whole and may describe either a directory or a GPT partition header.

Comment thread specs/file_manifest.md

The `name` must be a valid UTF-8 POSIX path (i.e. using the slash `/` as component separator), may not
contain control characters (ASCII 0…31, 127) and may not contain "`.`", "`..`", "``" as components of the
path. It may not start nor end with a slash, nor may it have multiple slashes in immediate sequence (or in other
Member

"or" (There are two valid forms: "It may neither start nor end with a slash" and "It may not start or end with a slash".)
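
The `name` rules quoted in this thread translate fairly directly into a validator. The following Python sketch covers only the rules visible in the excerpt above (UTF-8, no control characters, no `.`, `..` or empty components, which also rules out leading, trailing and doubled slashes); the function name `is_valid_name` is hypothetical, and any rules in the truncated remainder of the paragraph are not covered.

```python
def is_valid_name(name: str) -> bool:
    """Check a manifest `name` against the rules quoted above."""
    try:
        name.encode("utf-8")  # rejects strings that are not encodable as UTF-8
    except UnicodeEncodeError:
        return False
    # No control characters: ASCII 0…31 and 127.
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
        return False
    # An empty component appears for a leading slash, a trailing slash,
    # two slashes in a row, and for the empty string itself.
    return all(part not in ("", ".", "..") for part in name.split("/"))
```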

Comment thread specs/file_manifest.md
is the size of a UAPI.16 file manifest generated for the subdirectory (which only applies if the directory
contents are not provided in-line, but via a reference to a separate sub-manifest file, see above).

The `major` (unsigned integer) field only applies to file objects of type `blk` and `chr`, it encodes the
Member

"only applies to" → "can only be specified for object of type …

Comment thread specs/file_manifest.md
with either decoding vocabulary.

`encoding` specifies the encoding of the specified resource. It accepts the same encoding specifiers as
HTTP's `Content-Encoding:` field, i.e. `gzip` or `zstd`.
Member

"i.e." → "e.g." ?

Comment thread specs/file_manifest.md
Comment on lines +350 to +351
2. If `validAfterUSec` is set and greater than the current time, the download should immediately fail.
3. If `validBeforeUSec` is set and lower than the current time, the download should immediately fail.
Member

Hmm, why not "since" and "until"? "after" and "before" as described here doesn't match the description: "after"/"before" implies that the boundaries are excluded, while the description specifies that they are included in the valid range. In practice this doesn't matter much, but is inelegant.
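
The inclusive boundaries are easiest to see in code. Below is a minimal Python sketch of steps 2 and 3 as quoted, assuming the two fields hold microseconds since the UNIX epoch (suggested by the `USec` suffix, but an assumption here). The function name `check_validity` and the `now_usec` parameter are hypothetical.

```python
import time
from typing import Optional


def check_validity(entry: dict, now_usec: Optional[int] = None) -> None:
    """Raise ValueError if the entry is outside its validity window."""
    if now_usec is None:
        now_usec = time.time_ns() // 1000  # current time in microseconds
    valid_after = entry.get("validAfterUSec")
    valid_before = entry.get("validBeforeUSec")
    # Step 2: fail if validAfterUSec is greater than the current time.
    if valid_after is not None and valid_after > now_usec:
        raise ValueError("entry is not valid yet")
    # Step 3: fail if validBeforeUSec is lower than the current time.
    if valid_before is not None and valid_before < now_usec:
        raise ValueError("entry is no longer valid")
```

Because the comparisons are strict, a current time exactly equal to either boundary passes, which is the "since"/"until" behaviour pointed out above.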

Comment thread specs/file_manifest.md
directory object), and must have the value `"application/vnd.uapi.16.manifest"`. It should not be used on
any other file object in the sequence.

If `name` is not specified the record stores information about the top-level root file object. This file
Member

This says that name is first, but I don't see the order of other entries described anywhere.

Member

@Foxboron Foxboron left a comment

LGTM. But generally I think there is an over-reliance on parentheses for what should be properly part of the paragraphs.

Comment thread specs/file_manifest.md
Comment on lines +58 to +62
immediately following the file object of the subdirectory object. (Even if ordering within directories is
not mandated, it is definitely a good idea to sort the entries alphabetically, or by version
sort. Alternatively, historical ordering – i.e. add new additions purely at the end – might be a good
idea. Sorting entries ensures the manifests become reproducible as long as the same tool is used, which is
definitely a welcome property.)
Member

The parentheses here encompass too much text to be a note/sidenote thing. So preferably remove them or structure this another way?

Comment thread specs/file_manifest.md
idea. Sorting entries ensures the manifests become reproducible as long as the same tool is used, which is
definitely a welcome property.)

The `name` fields of file objects must be uniquely assigned within a manifest file.
Member

I think we can have singular here instead of plural? This harmonizes better with the next paragraph IMO.

Comment thread specs/file_manifest.md
without file objects of type `dir`), subdirectory listings may either be specified inline (typically
preferable) or be imported via a separate "sub-manifest" file covering just this subdirectory (and potential
sub-subdirectories, …), by referencing the separate UAPI.16 manifest file in the `contents` field of the file
object.
Member

I think all the parenthetical sections here can just be replaced by commas or rewritten to not require the parentheses.

Something liiiikkeee this maybe?

For manifests covering a directory tree with nested directories, as opposed to a flat directory listing without file
objects of type `dir`, it's preferable to specify the subdirectory listing inline, or it may be imported via a separate
"sub-manifest" file covering just this subdirectory, by referencing the separate UAPI.16 manifest file in the `contents`
field of the file object.

@Foxboron
Member

@keszybz The reproducible builds aspect is a good catch. The file listing should be ordered in some declared way. Probably alphabetical order?
