feat: support canonical Bitmap index segments#6869
Conversation
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
westonpace
left a comment
There was a problem hiding this comment.
I have some questions, I probably just need to get up to speed on these concepts.
| return self._ds.create_index( | ||
| [column], index_type, name, replace, train, None, kwargs | ||
| ) |
There was a problem hiding this comment.
Should you document what is being returned?
| IndexConfig(index_type="bitmap", parameters={}) | ||
| if isinstance(index_type, str) and index_type.upper() == "BITMAP" |
There was a problem hiding this comment.
I'm a little confused. If index_type == "BITMAP" you are switching to IndexConfig here (with no parameters) and then calling create_scalar_index but in create_scalar_index you also look for index_type == "BITMAP" but how would that ever be true?
| let mut frag_ids = RoaringBitmap::new(); | ||
| for key in self.index_map.keys() { | ||
| let bitmap = self.load_bitmap(key, None).await?; | ||
| frag_ids.extend(bitmap.iter().map(|(fragment_id, _)| *fragment_id)); | ||
| } | ||
| frag_ids.extend(self.null_map.iter().map(|(fragment_id, _)| *fragment_id)); | ||
| Ok(frag_ids) |
There was a problem hiding this comment.
Why implement this method? I thought it was only a utility method for an old migration before we had fragment bitmaps?
| raise NotImplementedError( | ||
| "Scalar indices currently only support a single column" | ||
| ) | ||
| return self.create_scalar_index( |
There was a problem hiding this comment.
The method is called create_index_uncommitted but if you call create_scalar_index isn't it going to commit the index?
Implements canonical Bitmap segment builds for
create_index_uncommitted(fragment_ids=...)by using the normal bitmap layout when noshard_idis requested, while preserving the legacyshard_id/part_*path.Adds committed multi-segment Bitmap query support so Python users can build segments, commit them, inspect them through
describe_indices(), and scan through one logical scalar index.Relates to OSS-971 and OSS-972.