-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Optimize arrow bytes view map #20055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
51768bf
cff1f46
e3cd6d4
b9f7d68
52b6f78
32bd8b1
49de0e9
d2929e1
5996423
9c06c58
9302c3b
c9be2af
0fc43a4
6a73668
102855a
f105b53
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -291,8 +291,8 @@ where | |||||||||||||
|
|
||||||||||||||
| // Extract length from the view (first 4 bytes of u128 in little-endian) | ||||||||||||||
| let len = view_u128 as u32; | ||||||||||||||
|
|
||||||||||||||
| // Check if value already exists | ||||||||||||||
| let mut value: Option<&[u8]> = None; | ||||||||||||||
| let maybe_payload = { | ||||||||||||||
| // Borrow completed and in_progress for comparison | ||||||||||||||
| let completed = &self.completed; | ||||||||||||||
|
|
@@ -328,8 +328,9 @@ where | |||||||||||||
| } else { | ||||||||||||||
| &in_progress[offset..offset + stored_len] | ||||||||||||||
|
Comment on lines
328
to
329
|
||||||||||||||
| } else { | |
| &in_progress[offset..offset + stored_len] | |
| } else if buffer_index == completed.len() { | |
| &in_progress[offset..offset + stored_len] | |
| } else { | |
| panic!("Invalid buffer_index: {}", buffer_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still accesses it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you’re absolutely right — this still ends up calling values.value(i) in the insert path.
That contradicts the goal of fetching the value once per row. I’ll rework this to eagerly bind the slice to a local variable and thread that through both the lookup and insert branches so values.value(i) is only accessed a single time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch — you’re right to flag this.
The intent here was to defer materializing the byte slice until we know it’s needed, but the current structure makes that unclear and introduces unnecessary indirection.
I’ll refactor this so the slice is explicitly materialized once (at the top of the loop) and then reused consistently for both lookup and insertion paths. That should make the logic clearer and align better with the original optimization goal.