Add support for opening structured dtypes as void for `zarr` driver #272

BrianMichell · 2026-01-05T19:02:10Z

Supersedes #264

Removes the `bool open_as_void = false` default argument from the C++ function declarations and definitions.

Uses a derived `DataCache` implementation, similar to the strategy the `neuroglancer_precomputed` driver uses for `UnshardedDataCache` and `ShardedDataCache`.

laramiel · 2026-01-06T02:14:10Z

Ok, I imported this locally and am looking at a few things.

Please make sure that all the tests build and pass. By removing the open_as_void = true, the spec tests fail to build.

BrianMichell · 2026-01-06T15:42:55Z

> Ok, I imported this locally and am looking at a few things.
>
> Please make sure that all the tests build and pass. By removing the open_as_void = true, the spec tests fail to build.

My bad, I don't know how I missed the compilation issue for the spec test. I've fixed the issue and resolved your comments.

Thanks for taking a look and getting back to this so fast!

BrianMichell · 2026-01-12T19:54:00Z

Sorry for the delay responding to the latest round of feedback. I believe everything raised so far has been addressed.

jbms · 2026-01-13T00:01:19Z

tensorstore/driver/zarr/driver.cc

+  // since we're treating the data as raw bytes regardless of the actual dtype.
+  // Shape is allowed to differ (handled by base class for resizing).
+  // Other fields like compressor, order, chunks must still match.
+  if (existing_metadata.dtype.bytes_per_outer_element !=


I think we could just rely on the normal validate but applied to the void metadata.

laramiel · 2026-01-16T18:10:43Z

I think that this is getting pretty close. I have a few minor edits in my import which I can just add in. Getting jeremy to look over it. Thanks.

Edits:

- Some include changes. The riegeli includes are no longer needed in driver.cc.
- Rename `VoidMetadataCache` -> `LazyVoidMetadata`. Same for `VoidFieldCache` -> `LazyVoidField`.
- If the `Lazy...` fields are mutable, the internals do not need to be.
- Convert `Lazy...` into classes with private members, with the enclosing struct as a friend.
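The `Lazy...` edits listed above could look roughly like the following sketch. It uses `std::call_once` from the standard library in place of `absl::call_once` so that it compiles without Abseil. `Metadata`, `MakeVoidMetadata`, and `DataCacheState` are hypothetical stand-ins for illustration, not the driver's actual types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for the driver's metadata type.
struct Metadata {
  std::string dtype;
  int bytes_per_element = 0;
};

// Hypothetical helper: derives the byte-oriented ("void") view of `original`.
inline std::shared_ptr<const Metadata> MakeVoidMetadata(
    const Metadata& original) {
  auto result = std::make_shared<Metadata>(original);
  result->dtype = "void";
  return result;
}

struct DataCacheState;

// Computes the void metadata at most once, even with concurrent callers.
class LazyVoidMetadata {
 private:
  friend struct DataCacheState;  // Only the enclosing struct may use it.

  const Metadata& Get(const Metadata& original) {
    std::call_once(once_, [&] { value_ = MakeVoidMetadata(original); });
    assert(value_ != nullptr);
    return *value_;
  }

  std::once_flag once_;
  std::shared_ptr<const Metadata> value_;
};

struct DataCacheState {
  Metadata original;
  // The field itself is mutable, so the members of LazyVoidMetadata are not.
  mutable LazyVoidMetadata void_metadata;

  const Metadata& GetVoidMetadata() const {
    return void_metadata.Get(original);
  }
};
```

Calling `GetVoidMetadata()` repeatedly on the same `DataCacheState` returns a reference to the same cached object; only the first call runs `MakeVoidMetadata`.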

Matches the pattern from zarr v2 driver (PR google#272). When both "field" and "open_as_void" are specified in the spec, return an error since these options are mutually exclusive - field selects a specific field from a structured array, while open_as_void provides raw byte access to the entire structure.
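As a minimal sketch of the mutual-exclusivity check described above, assuming a simplified spec with just the two options: `SpecOptions` and `ValidateFieldOptions` are hypothetical names, and the real driver reports errors through its own status type rather than `std::optional<std::string>`.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified spec holding only the two relevant options.
struct SpecOptions {
  std::string selected_field;  // Empty means no field was requested.
  bool open_as_void = false;
};

// Returns an error message if the options conflict, std::nullopt otherwise.
inline std::optional<std::string> ValidateFieldOptions(
    const SpecOptions& spec) {
  if (spec.open_as_void && !spec.selected_field.empty()) {
    return "\"field\" and \"open_as_void\" are mutually exclusive";
  }
  return std::nullopt;
}
```

Either option on its own passes validation; only the combination is rejected.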

The zarr3 URL syntax cannot represent field selection or void access mode. Following the pattern from zarr v2 driver (PR google#272), ToUrl() now returns an error when either of these options is specified instead of silently ignoring them.

…trip Following the pattern from zarr v2 driver (PR google#272), override GetBoundSpecData in ZarrDataCache to set spec.open_as_void from ChunkCacheImpl::open_as_void_. This ensures that when you open a store with open_as_void=true and then call spec(), the resulting spec correctly has open_as_void=true set. Without this fix, opening a store with open_as_void=true and then getting its spec would lose the open_as_void flag, causing incorrect behavior if the spec is used to re-open the store.
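The round-trip behavior described above can be illustrated with a small sketch. The `Spec`, `ChunkCache`, and `Open` definitions here are hypothetical simplifications; only the names `GetBoundSpecData` and `open_as_void_` come from the commit message, and their real signatures may differ.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified spec.
struct Spec {
  std::string path;
  bool open_as_void = false;
};

// Hypothetical, simplified cache that remembers how it was opened.
class ChunkCache {
 public:
  ChunkCache(std::string path, bool open_as_void)
      : path_(std::move(path)), open_as_void_(open_as_void) {}

  // Copies the state needed to re-open the store into `spec`, including the
  // void-access flag so that it survives a spec round trip.
  void GetBoundSpecData(Spec& spec) const {
    spec.path = path_;
    spec.open_as_void = open_as_void_;
  }

 private:
  std::string path_;
  bool open_as_void_;
};

// Hypothetical "open": builds a cache from a spec.
inline ChunkCache Open(const Spec& spec) {
  return ChunkCache(spec.path, spec.open_as_void);
}
```

If `GetBoundSpecData` omitted the `open_as_void` assignment, a spec obtained from a void-opened store would re-open it in regular mode.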

Add comprehensive tests for open_as_void functionality following the patterns from zarr v2 driver (PR google#272):

Tests that PASS:
- OpenAsVoidSimpleType: Verifies simple type arrays can be opened with open_as_void, gaining an extra dimension for bytes
- OpenAsVoidSpecRoundtrip: Verifies open_as_void preserved in spec JSON
- OpenAsVoidGetBoundSpecData: Verifies spec() on void store returns open_as_void=true (tests the GetBoundSpecData fix)
- OpenAsVoidCannotUseWithField: Verifies mutual exclusivity validation
- OpenAsVoidUrlNotSupported: Verifies ToUrl() rejects open_as_void
- FieldSelectionUrlNotSupported: Verifies ToUrl() rejects selected_field

Tests marked TODO (pending codec chain implementation):
- OpenAsVoidStructuredType
- OpenAsVoidWithCompression
- OpenAsVoidReadWrite
- OpenAsVoidWriteRoundtrip

Also fixes BUILD file: adds :metadata dependency to :chunk_cache target to provide the dtype.h header that chunk_cache.h includes.

The codec chain is prepared for the original dtype and chunk shape (without the extra bytes dimension). For void access:

DecodeChunk:
- Strip the bytes dimension from grid's chunk_shape to get original shape
- Decode using the original codec shape
- Reinterpret the decoded bytes as [chunk_shape..., bytes_per_elem]

EncodeChunk:
- Input has shape [chunk_shape..., bytes_per_elem] of byte_t
- Create a view with the original chunk shape and element_size
- Encode using the original codec

This follows the pattern from zarr v2 (PR google#272) where the void metadata has the chunk_layout computed to match encoded/decoded layouts.
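The shape handling in the DecodeChunk and EncodeChunk steps above can be sketched with plain standard-library types. This is only an illustration under simplifying assumptions: all helper names are hypothetical, the chunk is a flat C-order buffer, and the bytes are copied with `std::memcpy` where the driver would create a view over the existing buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Shape = std::vector<std::size_t>;

// [chunk_shape..., bytes_per_elem] -> [chunk_shape...]
inline Shape StripBytesDimension(const Shape& void_shape) {
  assert(!void_shape.empty());
  return Shape(void_shape.begin(), void_shape.end() - 1);
}

// [chunk_shape...] -> [chunk_shape..., bytes_per_elem]
inline Shape AppendBytesDimension(Shape shape, std::size_t bytes_per_elem) {
  shape.push_back(bytes_per_elem);
  return shape;
}

// Decode side: exposes decoded elements as raw bytes. In C order the bytes of
// one element are contiguous, so they form the innermost dimension.
template <typename T>
std::vector<std::uint8_t> CopyToBytes(const std::vector<T>& decoded) {
  std::vector<std::uint8_t> bytes(decoded.size() * sizeof(T));
  if (!bytes.empty()) std::memcpy(bytes.data(), decoded.data(), bytes.size());
  return bytes;
}

// Encode side: regroups raw bytes into elements of the original size.
template <typename T>
std::vector<T> CopyToElements(const std::vector<std::uint8_t>& bytes) {
  assert(bytes.size() % sizeof(T) == 0);
  std::vector<T> elements(bytes.size() / sizeof(T));
  if (!bytes.empty()) std::memcpy(elements.data(), bytes.data(), bytes.size());
  return elements;
}
```

For a chunk of `uint32_t` with shape `[2, 3]`, the void view has shape `[2, 3, 4]`, and converting to bytes and back yields the original elements.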

For void access, the codec handling differs between:
- Non-structured types: codec prepared for [chunk_shape] with original dtype. Need to decode/encode then reinterpret bytes.
- Structured types: codec already prepared for [chunk_shape, bytes_per_elem] with byte dtype. Just decode/encode directly.

Add original_is_structured parameter to cache constructors to properly distinguish these cases in DecodeChunk and EncodeChunk.

This follows the pattern from zarr v2 (PR google#272) where CreateVoidMetadata() creates a modified metadata for void access.
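The distinction between the two cases can be sketched as a small selection function. `CodecView` and `SelectCodecView` are hypothetical names used for illustration; only the `original_is_structured` flag comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical description of what the codec chain expects for one chunk.
struct CodecView {
  Shape shape;             // Shape passed to the codec.
  bool reinterpret_bytes;  // True if elements must be reinterpreted as bytes.
};

inline CodecView SelectCodecView(const Shape& void_chunk_shape,
                                 bool original_is_structured) {
  assert(!void_chunk_shape.empty());
  if (original_is_structured) {
    // Codec already prepared for [chunk_shape..., bytes_per_elem] with a byte
    // dtype: decode/encode directly.
    return {void_chunk_shape, false};
  }
  // Codec prepared for [chunk_shape...] with the original dtype: drop the
  // trailing bytes dimension and reinterpret after decoding.
  return {Shape(void_chunk_shape.begin(), void_chunk_shape.end() - 1), true};
}
```

Keeping the decision in one place means DecodeChunk and EncodeChunk would branch on the same flag and cannot disagree about which shape the codec sees.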

Apply changes based on feedback from google#272

BrianMichell added 3 commits January 5, 2026 16:08

Add open_as_void option to zarr v2 driver (#6)

59a6871

Remove default open_as_void from definitions

2aedabf

Use derived DataCache for open_as_void

46d9902

BrianMichell mentioned this pull request Jan 5, 2026

Add support for structured dtypes to zarr3 driver, open zarr 2 and 3 structs as void #264

Closed

laramiel reviewed Jan 6, 2026

tensorstore/driver/zarr/driver.cc (outdated)

laramiel reviewed Jan 6, 2026

tensorstore/driver/zarr/driver.cc

laramiel reviewed Jan 6, 2026

tensorstore/driver/zarr/driver.cc (outdated)

BrianMichell added 5 commits January 6, 2026 14:36

Fix compile issues for missing argument

ccc4bd7

Correct tests, add argument comment for open as void value

5d4a68f

Add test coverage for GetSpecInfo

c410f5e

Resolve feedback https://github.com/google/tensorstore/pull/272#disc…

d886c2f

…ussion_r2663351949` and extend test coverage

Resolve feedback https://github.com/google/tensorstore/pull/272#disc…

a42b6f5

…ussion_r2663353574` and extend test coverage

laramiel reviewed Jan 6, 2026

tensorstore/driver/zarr/schema.yml

laramiel reviewed Jan 6, 2026

tensorstore/driver/zarr/driver.cc (outdated)

BrianMichell added 2 commits January 7, 2026 14:57

Resolve feedback https://github.com/google/tensorstore/pull/272/chan…

e9c15da

…ges/BASE..a42b6f511375dc1bd402ad525519f57210546735#r2666111665` Enforce schema validation for one-of `field` and `open_as_void`

Resolve feedback https://github.com/google/tensorstore/pull/272#disc…

7fb91d7

…ussion_r2666195572` Use synthesized open_as_void field

laramiel reviewed Jan 7, 2026

tensorstore/driver/zarr/driver.cc

laramiel reviewed Jan 7, 2026

tensorstore/driver/zarr/spec.cc

laramiel reviewed Jan 7, 2026

tensorstore/driver/zarr/spec.cc (outdated)

laramiel reviewed Jan 7, 2026

tensorstore/driver/zarr/driver.cc

jbms reviewed Jan 7, 2026

tensorstore/driver/zarr/schema.yml (outdated)

BrianMichell added 4 commits January 7, 2026 21:00

Resolve feedback https://github.com/google/tensorstore/pull/272/chan…

389d6a9

…ges/BASE..7fb91d751f24a1dd479fd389378f9e6ad33c8112#r2669500148` Make the schema valid

Resolve https://github.com/google/tensorstore/pull/272/changes#r2669…

101011b

…223557` Compose ZarrMetadata with `open_as_void` field

Resolve https://github.com/google/tensorstore/pull/272/changes#r2669…

62fd8f9

…205910` Add error check for creating a structured Store but not providing the dtype

Resolve https://github.com/google/tensorstore/pull/272/changes#r2669…

9735318

…216841` Simplify field index logic for `open_as_void` case

laramiel reviewed Jan 12, 2026

tensorstore/driver/zarr/spec.h (outdated)

laramiel reviewed Jan 12, 2026

tensorstore/driver/zarr/driver.cc (outdated)

laramiel reviewed Jan 12, 2026

tensorstore/driver/zarr/driver.cc

laramiel reviewed Jan 12, 2026

tensorstore/driver/zarr/dtype.h

laramiel reviewed Jan 12, 2026

tensorstore/driver/zarr/dtype.h (outdated)

jbms reviewed Jan 13, 2026

tensorstore/driver/zarr/driver.cc (outdated)

jbms reviewed Jan 13, 2026

tensorstore/driver/zarr/driver.cc (outdated)

jbms reviewed Jan 13, 2026

tensorstore/driver/zarr/metadata.cc

jbms reviewed Jan 13, 2026

tensorstore/driver/zarr/dtype.cc (outdated)

BrianMichell added 7 commits January 13, 2026 16:08

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

a0efd69

…684301114` and `https://github.com/google/tensorstore/pull/272#discussion_r2684315692` Address fill value issues.

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

5775f0c

…684331789` and `https://github.com/google/tensorstore/pull/272#discussion_r2684406823` Guard cache with `absl::call_once` for thread safety

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

eb169a0

…683758437` Document and enforce `selected_field` and `open_as_void` exclusivity

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

c3fb8c0

…683807341` and `https://github.com/google/tensorstore/pull/272#discussion_r2683823803` IWYU

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

23bff85

…684312496` Validate schema against `open_as_void` metadata but `partial_metadata` against regular metadata

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

7d187e5

…684307471` Allow original metadata to cache pointer on first access

Resolve https://github.com/google/tensorstore/pull/272#discussion_r2…

5b90443

…684302699` Rely on normal metadata validation for void metadata

copybara-service bot merged commit b4f899d into google:master Jan 22, 2026
1 check passed

BrianMichell added a commit to BrianMichell/tensorstore that referenced this pull request Jan 26, 2026

Merge pull request #7 from BrianMichell/v3_open_as_void_validation

e197bc6

Apply changes based on feedback from google#272
