add a new UAPI.16 File Manifest spec (WIP)#213
|
(mostly posted here to start the discussion) |
54f9001 to
f4a161c
Compare
|
As mentioned elsewhere, it would be great if this could be embedded in existing JSON manifests, essentially the existing mkosi manifest, to avoid having to ship multiple ones, with consumers knowing how to find it. If I understand correctly, the only thing needed for this to work is to establish an optional and well-known "key" under which this object can be found in a parent JSON object? |
|
Is the intention of this spec to solve this issue, or is this trying to solve a different problem? |
|
@Foxboron Totally different problem, it's designed to replace SHA256SUMS for sysupdate to list remote resources |
|
@daandemeyer Hmm, should the title be |
keszybz
left a comment
I think it'd be nice to use semantic line breaks here. We agreed in general to do this in new documents.
I'd probably call the other spec an "Inode Spec", since that's more what it is. Or "File System Object" or so. This one here is purely about files. You are absolutely right, this might be confusing, but it will either way I am sure. I am against naming this after the software that likely implements this, it's supposed to be a generic spec, independent of any specific implementation. |
|
Hmm, we could also consider extending this spec to just cover what is requested in #207 too. I mean, if I grok this right it would just mean adding some more fields to encode UNIX inode properties in full: i.e. |
f4a161c to
914c9f6
Compare
|
Posted a new version covering all comments, but not trying to address #207 (would prefer if we did that in a later follow-up PR) |
^^^ ? |
i don't grok this request? it seems to me that the manifest format here could easily be embedded by mkosi's package manifests if it wants file-level information. But that's something to decide and define in mkosi's format, it's not something we could dictate here? |
Having a "suggested" key for the object would allow consumers to know what to search for, without having to come up with one for each case. Just like there's a "suggested" filename for the file in the spec? |
|
I wouldn't merge this into the mkosi manifest but treat it like the SHA256SUMS file mkosi can generate with |
|
Nah it has to be in the same file, at least as an option, as I most definitely do not want to have to deal with having to publish yet another file |
Sorry I'm not grokking how you can avoid this. Anyway, this isn't "yet another file" to publish since it'll replace SHA256SUMS
| `sha256` is the SHA256 hash of the specified slice, formatted in 64 hexadecimal characters. Parsers should | ||
| parse this case-insensitively. | ||
|
|
||
| The `gpt*` fields encode fields that we need when placing these resources in a GPT partition table entry. The |
I am not sure that the gpt* fields belong to the file object. The gpt data of one object can contradict another one and there are no clear rules about how to resolve them.
Maybe it is part of the manifest root? Maybe part of a different UAPI 16 Container Manifest spec?
hmm? not sure i follow? how would they contradict each other? these are gpt per-partition fields?
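As an aside on the `sha256` field quoted at the top of this thread, here is a minimal Python sketch of what "64 hexadecimal characters, parsed case-insensitively" could look like in a consumer. The function name `sha256_matches` is made up for illustration, and the data passed in is assumed to be the already-extracted slice.

```python
import hashlib
import string

def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA256 of `data` with a 64-character hex digest, ignoring case."""
    if len(expected_hex) != 64 or not all(c in string.hexdigits for c in expected_hex):
        raise ValueError("sha256 must be 64 hexadecimal characters")
    # hexdigest() is lowercase, so lowercase the manifest value before comparing.
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()
```

Lowercasing the manifest value once keeps the comparison itself a plain string equality.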
The "top-level" build is mkosi, and it's a single mkosi build that produces these artifacts. Just like it right now includes extensions DDI metadata in the mkosi manifest, it can also include this metadata.
Well it is, as it's new |
I am not convinced that is necessary. For example, let's say we do something to address #207. The way I see this happen would be that for directories they'd contain an additional |
|
so i wonder if we should maybe fold the stepping stone and revoked thing into a single "updatePolicy" field or so, given that a revoked item should never be a stepping stone. And given this state is relevant only when using these manifests for sysupdate-style updaters I think it makes sense to make that clear in the name. Hence:
With this we'd have a single field encoding the whole policy, and it would be self-contained inside the file object (which is a property I like very much) |
914c9f6 to
c4f8c09
Compare
|
I pushed a new version, taking a lot of input into account. I made three fundamental changes:
I think the outcome is quite nice and comprehensive. Of course JSON-SEQ means it is not just a text stream anymore, but I think the benefits outweigh the negatives here. Anyway, please have a look. I intend to put together some code in systemd to see how this feels when actually implementing this. |
0f94586 to
9bb5013
Compare
…om a directory tree The UAPI.16 is being discussed here: uapi-group/specifications#213
9bb5013 to
8dd1df6
Compare
|
I now prepped a patch for systemd's Note that the output it generates is not really intended for systemd-sysupdate consumption. It includes uid/gid info after all (just to match the --mtree call), and for sysupdate that's unlikely what we want. |
|
btw, in case you wonder, jq can process json-seq with the --seq switch |
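For readers unfamiliar with JSON-SEQ (RFC 7464): each record is prefixed with an ASCII Record Separator byte (0x1E) and terminated by a newline. A minimal Python sketch of a reader follows, assuming the whole manifest fits in memory. The function name `parse_json_seq` and the sample records are illustrative only, not taken from the spec text.

```python
import json

RS = "\x1e"  # ASCII Record Separator, which prefixes each JSON-SEQ record (RFC 7464)

def parse_json_seq(text: str) -> list:
    """Split a JSON-SEQ text into its records and decode each one as JSON."""
    records = []
    for chunk in text.split(RS):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece before the first separator
        records.append(json.loads(chunk))
    return records

# Hypothetical two-record manifest: a root object followed by one file object.
sample = (
    '\x1e{"mediaType": "application/vnd.uapi.16.manifest"}\n'
    '\x1e{"name": "foo", "mode": 420}\n'
)
for record in parse_json_seq(sample):
    print(record)
```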
|
btw, for illustrative purposes, this is what a real-life /usr/include/ looks like in the currently described format: https://paste.centos.org/view/raw/446a6f8e it has dirs, regular files and symlinks. I find that really readable with the naked human eye. |
…om a directory tree The UAPI.16 spec is being discussed here: uapi-group/specifications#213
|
|
||
| The `mode` (unsigned integer) field encodes the UNIX access mode of the file object. It applies to all | ||
| inode types, except `lnk`. Note that while UNIX access modes are typically written in octal, this one is | ||
| encoded in a regular JSON number, i.e. decimal. The valid range is 0…4095 (i.e. `0o0000` to `0o7777`). |
That's also how Ignition does it, and with some tooling it's okay, but when writing/reading this is still very strange to deal with. I wonder if a 0o string wouldn't be the more natural embedding here.
Maybe modeDec to indicate decimal? 🤔
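To make the trade-off concrete, here is a small Python sketch converting between the decimal JSON number and the familiar octal notation. The helper names are made up for illustration. The range check follows the 0…4095 limit in the quoted text.

```python
def mode_to_octal(mode: int) -> str:
    """Render a decimal `mode` value from the manifest in octal notation."""
    if not 0 <= mode <= 0o7777:
        raise ValueError(f"mode out of range: {mode}")
    return f"0o{mode:04o}"

def octal_to_mode(text: str) -> int:
    """Parse an octal string such as '0o755' or '755' into the decimal value."""
    mode = int(text, 8)  # base 8 accepts an optional '0o' prefix
    if not 0 <= mode <= 0o7777:
        raise ValueError(f"mode out of range: {text}")
    return mode

print(mode_to_octal(420))      # 0o0644
print(octal_to_mode("0o755"))  # 493
```

So a manifest entry with `"mode": 420` is what a human would normally write as `0644`.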
| 6. Otherwise, if `url` is set, the data from the URL should be acquired, continue in step 8. | ||
| 7. Otherwise, if `file` is set, the data should be acquired from the same URL as the manifest itself, however with the last component of the URL replaced by the file name encoded in `file`, following the same semantics as HTML relative links. Continue in step 8. | ||
| 8. If both `encodedSize` and `encoding` are set the size of the acquired data shall be checked against `encodedSize`. | ||
| 9. If `encodedSize` is not set, but `originalSize` is: the size of the acquired data shall be checked against `originalSize`. |
| 9. If `encodedSize` is not set, but `originalSize` is: the size of the acquired data shall be checked against `originalSize`. | |
| 9. If `encodedSize` is not set, but `originalSize` is, the size of the acquired data shall be checked against `originalSize`. |
Might match the previous sentence structure better
| 10. If `encoding` is set: the acquired data shall be decoded according to the algorithm indicated in `encoding`. | ||
| 11. If `encoding` and `originalSize` are set: the resulting decoded data shall be checked against `originalSize`. |
| 10. If `encoding` is set: the acquired data shall be decoded according to the algorithm indicated in `encoding`. | |
| 11. If `encoding` and `originalSize` are set: the resulting decoded data shall be checked against `originalSize`. | |
| 10. If `encoding` is set, the acquired data shall be decoded according to the algorithm indicated in `encoding`. | |
| 11. If `encoding` and `originalSize` are set, the resulting decoded data shall be checked against `originalSize`. |
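Steps 7 to 11 quoted above can be sketched roughly as follows in Python. This is a reading of the quoted text under stated assumptions, not a reference implementation. Only `gzip` is handled, the function names are hypothetical, and step 9 is applied only when no `encoding` is set, which is an interpretation since the quoted wording does not say so explicitly.

```python
import gzip
from typing import Optional
from urllib.parse import urljoin

def resolve_file_url(manifest_url: str, file: str) -> str:
    """Step 7: replace the last component of the manifest URL with `file`."""
    return urljoin(manifest_url, file)  # relative-link semantics

def check_and_decode(data: bytes, encoding: Optional[str] = None,
                     encoded_size: Optional[int] = None,
                     original_size: Optional[int] = None) -> bytes:
    # Step 8: both encodedSize and encoding set, check the acquired size.
    if encoding is not None and encoded_size is not None:
        if len(data) != encoded_size:
            raise ValueError("encodedSize mismatch")
    # Step 9 (assumption: only meaningful for unencoded data).
    elif encoding is None and encoded_size is None and original_size is not None:
        if len(data) != original_size:
            raise ValueError("originalSize mismatch")
    # Step 10: decode according to `encoding` (only gzip in this sketch).
    if encoding is not None:
        if encoding != "gzip":
            raise NotImplementedError(f"unsupported encoding: {encoding}")
        data = gzip.decompress(data)
        # Step 11: check the decoded size against originalSize.
        if original_size is not None and len(data) != original_size:
            raise ValueError("originalSize mismatch after decoding")
    return data
```

With a manifest at the hypothetical URL `https://example.com/images/Uapi16Manifest` and `file` set to `foo.raw`, `resolve_file_url` yields `https://example.com/images/foo.raw`.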
keszybz
left a comment
I think that the primary problem with the current text is that it aims for reproducibility, but leaves a bunch of implementation choices undefined:
- sorting of entries
- sorting within an entry
- null or omitted fields
- whitespace formatting of json
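To illustrate what pinning down these four choices could look like, here is a minimal Python sketch of a deterministic serializer. Every choice in it (sorting by path components, sorted keys, dropping null fields, no optional whitespace) is an assumption made for the example, not something the spec text currently mandates.

```python
import json

def canonical_record(obj: dict) -> str:
    """Serialize one file object deterministically."""
    # Null fields are omitted, so "null" and "absent" serialize identically.
    cleaned = {k: v for k, v in obj.items() if v is not None}
    # sort_keys fixes the order within an entry, separators fixes the whitespace.
    return json.dumps(cleaned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def canonical_manifest(entries: list) -> list:
    """Order entries by path components; the nameless root object sorts first."""
    def key(entry: dict) -> list:
        return entry["name"].split("/") if "name" in entry else []
    return [canonical_record(e) for e in sorted(entries, key=key)]
```

Sorting by path components rather than by the raw string keeps the contents of a subdirectory directly after the subdirectory itself: `a`, `a/b`, `a.txt` instead of `a`, `a.txt`, `a/b`.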
| This format stores information about static, immutable data resources that potentially can be acquired over | ||
| the network or stored on a disk. It's in particular supposed to be able to reasonably represent the |
For the first sentence, "static" and "immutable" are both bogus. We have no control over those files, so whether they are static and immutable is completely out of scope.
The mtree man page says "The mtree format is a textual format that describes a collection of filesystem objects". That seems much more to the point. Maybe "This document describes a textual file manifest format that describes a collection of blobs. Those blobs can potentially be acquired over the network or stored on a disk. In particular, it can represent the properties of files arranged in a POSIX file system tree as well as partitions in a GPT partition table."
| 2. It allows extracting data "slices" from data sources, via offset and range. | ||
| 3. Various fields of additional per-file metadata may be defined, including various fields for GPT partition table metadata. | ||
| 4. It's an extensible format permitting vendors and projects to add their own per-file and per-manifest fields. | ||
| 5. Inline cryptographic signing. |
1–4 are sentences with a subject, 5 should have one too.
| This format is designed with tools such as `mkosi` and `systemd-sysupdate` in mind, but it's intended to be | ||
| generally useful as a way to describe collections of files that may potentially be acquired over the network | ||
| or from local storage, and verified cryptographically. |
This repeats what is in the intro.
| `mediaType` field set (which identifies the sequence as a UAPI.16 manifest). File objects for directory | ||
| entries `Uapi16Manifest` (or any filename with a prefix of `Uapi16Manifest.`), `.`, `..` or any path ending | ||
| in `/.`, `/..` are not permitted. |
This second part should be a separate paragraph, since it describes a completely separate topic from the first part.
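For illustration, the restrictions on directory entries quoted here, combined with the general `name` rules quoted further down in this review (no control characters, no empty, `.` or `..` components), could be checked with a sketch like the following. The function name is hypothetical, the input is assumed to be already-decoded text, and applying the reserved-name rule to the final path component only is an assumption.

```python
RESERVED = "Uapi16Manifest"  # reserved entry name from the quoted text

def is_valid_name(name: str) -> bool:
    """Check a `name` value against the path rules quoted in this review."""
    # No control characters: ASCII 0…31 and 127.
    if any(ord(c) < 32 or ord(c) == 127 for c in name):
        return False
    parts = name.split("/")
    # An empty component means a leading slash, a trailing slash or a doubled
    # slash; "." and ".." components also cover paths ending in "/." or "/..".
    if any(part in ("", ".", "..") for part in parts):
        return False
    # Assumption: the reserved-name rule is applied to the final component only.
    last = parts[-1]
    return last != RESERVED and not last.startswith(RESERVED + ".")
```

Splitting on `/` once lets a single membership test cover several of the separately worded rules.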
| When applied to a GPT partition table the first object may also describe fields in the overall GPT | ||
| partition table header of the disk. |
Hmm, so the preceding para says that the first object must describe "a … directory". But then this sentence here contradicts that, and says that it may describe a GPT header.
I think this should be reworded to say that the first object identifies the file format as a whole and may describe either a directory or a GPT partition header.
|
|
||
| The `name` must be a valid UTF-8 POSIX path (i.e. using the slash `/` as component separator), may not | ||
| contain control characters (ASCII 0…31, 127) and may not contain "`.`", "`..`", "``" as components of the | ||
| path. It may not start nor end with a slash, nor may it have multiple slashes in immediate sequence (or in other |
"or" (There are two valid forms: "It may neither start nor end with a slash" and "It may not start or end with a slash".)
| is the size of a UAPI.16 file manifest generated for the subdirectory (which only applies if the directory | ||
| contents are not provided in-line, but via a reference to a separate sub-manifest file, see above). | ||
|
|
||
| The `major` (unsigned integer) field only applies to file objects of type `blk` and `chr`, it encodes the |
"only applies to" → "can only be specified for object of type …
| with either decoding vocabulary. | ||
|
|
||
| `encoding` specifies the encoding of the specified resource. It accepts the same encoding specifiers as | ||
| HTTP's `Content-Encoding:` field, i.e. `gzip` or `zstd`. |
| 2. If `validAfterUSec` is set and greater than the current time, the download should immediately fail. | ||
| 3. If `validBeforeUSec` is set and lower than the current time, the download should immediately fail. |
Hmm, why not "since" and "until"? "after" and "before" as described here doesn't match the description: "after"/"before" implies that the boundaries are excluded, while the description specifies that they are included in the valid range. In practice this doesn't matter much, but is inelegant.
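The inclusive boundaries are easy to see in a sketch of steps 2 and 3. This is Python, with a hypothetical function name, and it assumes the `USec` values are microseconds since the UNIX epoch, which the quoted text does not state.

```python
import time
from typing import Optional

def check_validity(valid_after_usec: Optional[int],
                   valid_before_usec: Optional[int],
                   now_usec: Optional[int] = None) -> None:
    """Raise ValueError if the current time is outside the validity window."""
    if now_usec is None:
        now_usec = time.time_ns() // 1000  # assumption: µs since the UNIX epoch
    # Step 2: validAfterUSec greater than the current time, fail.
    if valid_after_usec is not None and valid_after_usec > now_usec:
        raise ValueError("not valid yet")
    # Step 3: validBeforeUSec lower than the current time, fail.
    if valid_before_usec is not None and valid_before_usec < now_usec:
        raise ValueError("no longer valid")
```

A timestamp exactly equal to either boundary passes both checks, which is the "since"/"until" behaviour described in the comment above.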
| directory object), and must have the value `"application/vnd.uapi.16.manifest"`. It should not be used on | ||
| any other file object in the sequence. | ||
|
|
||
| If `name` is not specified the record stores information about the top-level root file object. This file |
This says that name is first, but I don't see the order of other entries described anywhere.
Foxboron
left a comment
LGTM. But generally I think there is an over-reliance on parentheses for what should properly be part of the paragraphs.
| immediately following the file object of the subdirectory object. (Even if ordering within directories is | ||
| not mandated, it is definitely a good idea to sort the entries alphabetically, or by version | ||
| sort. Alternatively, historical ordering – i.e. add new additions purely at the end – might be a good | ||
| idea. Sorting entries ensures the manifests become reproducible as long as the same tool is used, which is | ||
| definitely a welcome property.) |
The parentheses here encompass too much text to be a note/sidenote thing. So preferably remove them or structure this another way?
| idea. Sorting entries ensures the manifests become reproducible as long as the same tool is used, which is | ||
| definitely a welcome property.) | ||
|
|
||
| The `name` fields of file objects must be uniquely assigned within a manifest file. |
I think we can have singular here instead of plural? This harmonizes better with the next paragraph IMO.
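A consumer could enforce the uniqueness rule with something as small as the following Python sketch. The function name is made up, and records without a `name` (the root object) are simply skipped.

```python
from collections import Counter

def duplicate_names(entries: list) -> list:
    """Return every `name` that occurs more than once, sorted."""
    counts = Counter(e["name"] for e in entries if "name" in e)
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty result means the manifest satisfies the rule.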
| without file objects of type `dir`), subdirectory listings may either be specified inline (typically | ||
| preferable) or be imported via a separate "sub-manifest" file covering just this subdirectory (and potential | ||
| sub-subdirectories, …), by referencing the separate UAPI.16 manifest file in the `contents` field of the file | ||
| object. |
I think all the parenthesized sections here can just be replaced by commas or rewritten to not require the parentheses.
Something like this maybe?
For manifests covering a directory tree with nested directories, as opposed to a flat directory listing without file
objects of type `dir`, it's preferable to specify the subdirectory listing inline, or it may be imported via a separate
"sub-manifest" file covering just this subdirectory, by referencing the separate UAPI.16 manifest file in the `contents`
field of the file object.
|
@keszybz The reproducible builds aspect is a good catch. The file listing should be ordered in some declared way. Probably alphabetical order? |