perf: remove dtype + fill val handling per chunk #124

ilan-gold · 2025-11-03T14:49:37Z

Calling `to_native_dtype` + `__str__` came up as one of the only Python-CPU-bound things when doing some benchmarking. My use case is quite contrived (generating thousands of `WithSubset` objects), but I think it's probably worth investigating getting rid of these calls. Some observations:

1. I wonder if getting the dtype and fill_val could all be wrapped up by just relying on https://docs.rs/zarrs/latest/zarrs/array/struct.Array.html#method.open and then using the values directly (there are probably other benefits of doing this), but I think this is a separate PR. So it turns out the array actually doesn't have to be created yet when the pipeline is generated, so I ended up doing something similar with the metadata (making it an actual struct and then working with that to get the fill value and dtype).
2. Regardless, most of this refactor is around removing `Basic` anyway so that chunk handling is independent of the ability. I noticed that `ChunkRepresentation` requires ownership over its arguments, which means we copy per chunk. Not sure what would go into making that a reference, but it's no worse than the previous situation, where I think we were generating copies repeatedly, but from PyO3 calling Python --> Hooray! No longer!

The benefit wasn't crazy (~5%), but I think going in this direction is good (see point 1).
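To illustrate the general idea of resolving dtype and fill value once per array instead of once per chunk, here is a minimal sketch. It is not the code in this PR and does not use the zarrs API: `ArrayMeta`, `ChunkItem`, `parse_meta`, and `make_chunks` are hypothetical names, the dtype parser is a toy that only knows two types, and sharing through `Arc` is just one possible way to avoid per-chunk copies.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for array-level metadata (not a zarrs type).
#[derive(Debug, Clone, PartialEq)]
pub struct ArrayMeta {
    pub dtype: String,
    /// Fill value encoded as little-endian bytes.
    pub fill_value: Vec<u8>,
}

/// Hypothetical per-chunk item that shares the array-level metadata.
#[derive(Debug, Clone)]
pub struct ChunkItem {
    pub key: String,
    pub shape: Vec<u64>,
    pub meta: Arc<ArrayMeta>,
}

/// Resolve dtype and fill value once per array.
/// Toy parser: only "uint8" and "uint16" are supported here.
pub fn parse_meta(dtype: &str, fill: u64) -> Result<ArrayMeta, String> {
    let fill_value = match dtype {
        "uint8" => vec![u8::try_from(fill).map_err(|e| e.to_string())?],
        "uint16" => u16::try_from(fill)
            .map_err(|e| e.to_string())?
            .to_le_bytes()
            .to_vec(),
        other => return Err(format!("unsupported dtype: {other}")),
    };
    Ok(ArrayMeta {
        dtype: dtype.to_string(),
        fill_value,
    })
}

/// Build `n` chunk items. Each one clones the `Arc` (a reference-count
/// bump), so the metadata itself is neither re-parsed nor copied.
pub fn make_chunks(meta: &Arc<ArrayMeta>, n: usize, shape: &[u64]) -> Vec<ChunkItem> {
    (0..n)
        .map(|i| ChunkItem {
            key: format!("c/{i}"),
            shape: shape.to_vec(),
            meta: Arc::clone(meta),
        })
        .collect()
}
```

With this shape, `parse_meta` runs once when the pipeline is set up, and generating thousands of chunk items only costs a reference-count increment per item for the metadata.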

ilan-gold · 2026-02-02T10:30:22Z

@LDeakin Up to you if we wait for 0.23.1. Thanks for the help here!

ilan-gold · 2026-02-02T12:11:02Z

Locally I'm getting "DataTypeMetadataV2 doesn't implement std::fmt::Display" which seems incorrect based on https://github.com/zarrs/zarrs/blob/0b36a73f861bc8189ba4cde315630d828db3de80/zarrs_metadata/src/v2/array.rs#L190. Sorry about that

LDeakin · 2026-02-02T12:12:34Z

> Locally I'm getting "DataTypeMetadataV2 doesn't implement std::fmt::Display" which seems incorrect based on https://github.com/zarrs/zarrs/blob/0b36a73f861bc8189ba4cde315630d828db3de80/zarrs_metadata/src/v2/array.rs#L190. Sorry about that

Just run a cargo update and you'll get the latest release of zarrs_metadata

LDeakin

Nice one Ilan! I might rerun some perf benchmarks with this after we release

ilan-gold · 2026-02-02T12:16:36Z

> Just run a cargo update and you'll get the latest release of zarrs_metadata

Tried that out, but oddly enough it didn't work. I'm cycling through the other options available, but at least I see this is a local-only problem now, so I'll handle it.

(feat): first pass remove dtype + fill val handling per chunk

697eb52

ilan-gold marked this pull request as draft November 3, 2025 14:49

LDeakin and others added 7 commits December 27, 2025 01:20

chore: bump zarrs to 0.23.0-beta.1

1ad5cef

chore: bump zarrs to 0.23.0-beta.2

57e2e8f

chore: incr to 0.2.2-dev

95f2886

chore: minimise diff

7f9b244

chore: bump zarrs to 0.23.0-beta.3

10b1f4c

Merge branch 'main' into ig/refactor_chunk_handling

ea9c3e5

feat: upgrade zarr v3

9c43e81

LDeakin mentioned this pull request Jan 1, 2026

Upgrade to zarrs 0.23 #135

Open

LDeakin and others added 6 commits January 3, 2026 20:28

Revert "chore: incr to 0.2.2-dev"

c6a5839

This reverts commit 95f2886.

fix: unsupported data type tests

3cbe4be

fix: give a real title to zarr store

50c3560

fix: don't pass in any metadata

faf922b

Merge branch 'ld/zarrs_0.23.0' into ig/refactor_chunk_handling

fc8c057

fix: warning

c922dd0

ilan-gold changed the base branch from main to ld/zarrs_0.23.0 January 3, 2026 19:08

ilan-gold added 2 commits January 3, 2026 21:13

fix: cleanups

66096a5

chore: small cleanups

dc5e60d

ilan-gold changed the title ~~(feat): remove dtype + fill val handling per chunk~~ perf: remove dtype + fill val handling per chunk Jan 5, 2026

chore: use is_whole_chunk more

613033f

ilan-gold commented Jan 5, 2026

src/lib.rs (resolved)

ilan-gold marked this pull request as ready for review January 5, 2026 11:28

flying-sheep reviewed Jan 5, 2026

tests/test_v2.py (outdated, resolved)

flying-sheep reviewed Jan 5, 2026

src/chunk_item.rs (outdated, resolved)

chore: bump zarrs to 0.23.0-beta.4

d25b2f9

ilan-gold added the performance label Jan 9, 2026

ilan-gold and others added 3 commits January 9, 2026 11:36

Merge branch 'ld/zarrs_0.23.0' into ig/refactor_chunk_handling

53b3af6

chore: bump zarrs to 0.23.0-beta.5

5fe3132

chore: bump zarrs to 0.23.0-beta.6

efd4c38

LDeakin and others added 6 commits February 2, 2026 02:29

chore: bump zarrs to 0.23.0

5dfa55d

Merge remote-tracking branch 'origin/main' into ld/zarrs_0.23.0

2db97c2

fix: use map_py_err in WithSubset::new

f1ee1b7

Merge branch 'ld/zarrs_0.23.0' into ig/refactor_chunk_handling

ea833f9

rename: WithSubset

536f5dc

run on ci while waiting for rustfmt to install

4eca1b8

ilan-gold force-pushed the ig/refactor_chunk_handling branch from c7fe896 to 4eca1b8 Compare February 1, 2026 16:53

ilan-gold added 5 commits February 1, 2026 17:54

fix: import

e262eca

remove unused import

6101bd3

fix: no fill warning

be5b36a

key/shape

5c279d9

fix: pyi

6c8c7a5

ilan-gold commented Feb 1, 2026

src/lib.rs (outdated, resolved)

tests/test_v2.py (resolved)

remove old ValueError

4c0544f

Base automatically changed from ld/zarrs_0.23.0 to main February 1, 2026 22:12

LDeakin and others added 3 commits February 2, 2026 15:09

feat: improve data type / fill value incompatibility error

de1b2e3

`UserWarning: Array is unsupported by ZarrsCodecPipeline: incompatible fill value metadata: dtype=|V7, fill_value=null`

merge

68e0881

v2

86b8118

ilan-gold requested a review from LDeakin February 2, 2026 10:30

LDeakin added 2 commits February 2, 2026 22:52

Revert "v2"

3500c61

This reverts commit 86b8118.

Merge remote-tracking branch 'origin/main' into ig/refactor_chunk_handling

0812f0a

LDeakin approved these changes Feb 2, 2026

ilan-gold merged commit 48ffaa4 into main Feb 2, 2026
17 checks passed

ilan-gold deleted the ig/refactor_chunk_handling branch February 2, 2026 12:16


4 participants
