perf: Batch metadata query to avoid N+1 across 5 call sites #232
KRRT7 wants to merge 8 commits into microsoft:main
Conversation
Force-pushed from 0fba38f to f9a489b
Add add_terms_batch / add_properties_batch to the index interfaces with executemany-based SQLite implementations. Restructure add_metadata_to_index_from_list and add_to_property_index to collect all items first, then batch-insert via extend() and the new batch methods. Eliminates ~1000 individual INSERT round-trips during indexing.
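The collect-then-batch-insert pattern described above can be sketched as follows. This is a minimal illustration using Python's `sqlite3` with an assumed table name and columns, not the project's actual schema:

```python
import sqlite3

# Illustrative schema; the real index tables differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_index (term TEXT, semref_id INTEGER)")

def add_terms_batch(rows):
    # One executemany round-trip for the whole batch instead of one
    # INSERT per row.
    conn.executemany(
        "INSERT INTO term_index (term, semref_id) VALUES (?, ?)", rows
    )
    conn.commit()

collected: list[tuple[str, int]] = []
collected.extend([("alice", 1), ("bob", 2), ("alice", 3)])  # collect first...
add_terms_batch(collected)                                  # ...then insert once

print(conn.execute("SELECT COUNT(*) FROM term_index").fetchone()[0])  # 3
```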
Rename _collect_{facet,entity,action}_{terms,properties} to drop the
leading underscore in propindex.py and semrefindex.py.
Change list to Sequence in add_terms_batch and add_properties_batch interfaces and implementations to satisfy covariance. Add missing add_terms_batch to FakeTermIndex in conftest.py.
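The covariance issue behind this change can be shown with a small sketch (`Term`/`ScoredTerm` are hypothetical stand-ins, not the project's types). `list[T]` is invariant in static type checkers, so a `list[Subtype]` is rejected where `list[T]` is expected; read-only `Sequence[T]` is covariant and accepts it:

```python
from typing import Sequence

class Term: ...
class ScoredTerm(Term): ...

def add_terms_batch_list(terms: list[Term]) -> int:
    return len(terms)

def add_terms_batch_seq(terms: Sequence[Term]) -> int:
    return len(terms)

scored: list[ScoredTerm] = [ScoredTerm(), ScoredTerm()]

# A type checker rejects add_terms_batch_list(scored): list is invariant,
# so list[ScoredTerm] is not a list[Term]. Sequence is covariant (and
# read-only), so Sequence[Term] accepts list[ScoredTerm].
print(add_terms_batch_seq(scored))  # 2
```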
lookup_term_filtered called get_item() per scored ref — one SELECT and full deserialization per match. The filter only needs knowledge_type (a plain column) and range (json.loads of range_json), never the expensive knowledge_json deserialization (64% of per-row cost). Add get_metadata_multiple to ISemanticRefCollection that fetches only semref_id, range_json, knowledge_type in a single batch query. Replace the N+1 loop in lookup_term_filtered with one get_metadata_multiple call. Benchmark (200 matches, 200 rounds): 4.38ms → 1.32ms (3.3x speedup).
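The shape of the batch metadata query can be sketched like this. Column and table names are assumptions for illustration; the key points are that all ordinals go into one `IN (...)` query and that the expensive `knowledge_json` column is never selected, so it is never deserialized:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SemanticRefs (semref_id INTEGER PRIMARY KEY, "
    "range_json TEXT, knowledge_type TEXT, knowledge_json TEXT)"
)
conn.executemany(
    "INSERT INTO SemanticRefs VALUES (?, ?, ?, ?)",
    [
        (1, '{"start": 0, "end": 5}', "entity", '{"big": "payload"}'),
        (2, '{"start": 5, "end": 9}', "action", '{"big": "payload"}'),
        (3, '{"start": 9, "end": 12}', "entity", '{"big": "payload"}'),
    ],
)

def get_metadata_multiple(ordinals):
    # One SELECT for all ordinals; knowledge_json is never read, so the
    # expensive deserialization is skipped entirely.
    placeholders = ",".join("?" * len(ordinals))
    rows = conn.execute(
        "SELECT semref_id, range_json, knowledge_type FROM SemanticRefs "
        f"WHERE semref_id IN ({placeholders})",
        list(ordinals),
    ).fetchall()
    return [(rid, json.loads(rj), kt) for rid, rj, kt in rows]

print([kt for _, _, kt in get_metadata_multiple([1, 3])])  # ['entity', 'entity']
```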
Apply the same get_metadata_multiple pattern from lookup_term_filtered to four more sites that called get_item() in a loop:

- propindex.lookup_property_in_property_index: filter by .range
- SemanticRefAccumulator.group_matches_by_type: group by .knowledge_type
- SemanticRefAccumulator.get_matches_in_scope: filter by .range
- answers.get_scored_semantic_refs_from_ordinals_iter: two-phase metadata filter, then batch get_multiple for matching full objects

All sites now use a single batch query instead of N individual SELECTs, skipping knowledge_json deserialization where only range or knowledge_type is needed.
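The two-phase pattern used in the last site can be sketched in isolation (all names here are hypothetical stand-ins; dicts stand in for the metadata query and the full-object store):

```python
# Phase 1 data: what a metadata-only batch query would return
# (ordinal -> knowledge_type), with no knowledge_json deserialization.
metadata = {1: "entity", 2: "topic", 3: "entity", 4: "action"}

# Stand-in for the collection backing get_multiple.
full_objects = {o: f"SemanticRef({o})" for o in metadata}

def get_scored_refs(ordinals, wanted_type):
    # Phase 1: filter on cheap metadata from one batch query.
    surviving = [o for o in ordinals if metadata[o] == wanted_type]
    # Phase 2: one batch get_multiple fetch, only for the survivors.
    return [full_objects[o] for o in surviving]

print(get_scored_refs([1, 2, 3, 4], "entity"))
# ['SemanticRef(1)', 'SemanticRef(3)']
```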
…arisons

- Use bisect_right with key=start in TextRangeCollection.contains_range to skip the O(n) linear scan (O(log n) for non-overlapping point ranges)
- Replace TextLocation allocations in TextRange __eq__/__lt__/__contains__ with a shared _effective_end returning tuples
- Skip pydantic validation in get_metadata_multiple by constructing TextLocation/TextRange directly from JSON
Force-pushed from f9a489b to 743d8db
```python
    assert set(rowdict) == set(arg)
    return [self._deserialize_semantic_ref_from_row(rowdict[ordl]) for ordl in arg]

async def get_metadata_multiple(
```
Missing bounds check in get_metadata_multiple
get_multiple validates that ordinals are in bounds before querying. get_metadata_multiple in collections.py:344 does not: if an ordinal is missing from the DB, rowdict[o] raises KeyError. It should either validate up front or handle missing keys gracefully, mirroring the existing get_multiple pattern.
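One way to implement the suggested validation, as a minimal sketch (a plain dict stands in for the row lookup; the error type and message are assumptions, not the project's actual convention):

```python
def get_metadata_multiple(rowdict, ordinals):
    # Validate up front so a missing row fails with a clear error,
    # instead of a bare KeyError deep inside the list comprehension.
    missing = [o for o in ordinals if o not in rowdict]
    if missing:
        raise ValueError(f"semantic ref ordinals not found: {missing}")
    return [rowdict[o] for o in ordinals]

rows = {0: "meta0", 1: "meta1"}
print(get_metadata_multiple(rows, [1, 0]))  # ['meta1', 'meta0']
try:
    get_metadata_multiple(rows, [0, 7])
except ValueError as e:
    print(e)  # semantic ref ordinals not found: [7]
```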
```python
    semantic_ref_ordinal: SemanticRefOrdinal | ScoredSemanticRefOrdinal,
) -> None: ...

async def add_properties_batch(
```
In propindex.py:85 the SQLite implementation imports make_property_term_text / split_property_term_text from storage.memory.propindex. This creates a cross-layer dependency (sqlite → memory). These helpers should be extracted to a shared location (e.g., the base knowpro/propindex or a utils module).
```python
await add_to_property_index(conversation, 0)
```

```python
def collect_facet_properties(
```
collect_* functions duplicate logic from existing add_*_to_index functions
propindex.py adds collect_entity_properties, collect_action_properties, collect_facet_properties that duplicate the logic of the existing add_entity_properties_to_index, add_action_properties_to_index, add_facet_to_index. Same duplication occurs in semrefindex.py with collect_entity_terms / collect_action_terms / collect_facet_terms. The old add_* functions should be refactored to use the new collect_* functions + batch add, rather than keeping both versions.
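The refactor the reviewer is asking for can be sketched like this (all names and bodies are hypothetical stand-ins for the real propindex functions): the old single-item `add_*` function becomes a thin wrapper over the new `collect_*` plus the batch add, so the extraction logic lives in exactly one place.

```python
def collect_entity_properties(entity):
    # Pure function: turn one entity into (name, value) property rows.
    return [("name", entity["name"]), ("type", entity["type"])]

def add_properties_batch(index, rows):
    # Stands in for the executemany-backed batch insert.
    index.extend(rows)

def add_entity_properties_to_index(index, entity):
    # Old single-item API, now delegating instead of duplicating the logic.
    add_properties_batch(index, collect_entity_properties(entity))

index: list = []
add_entity_properties_to_index(index, {"name": "Alice", "type": "person"})
print(index)  # [('name', 'Alice'), ('type', 'person')]
```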
```python
async def test_benchmark_lookup_term_filtered(async_benchmark):
    """Benchmark lookup_term_filtered with N+1 get_item pattern."""
    settings = make_settings()
    tmpdir = tempfile.mkdtemp()
```
The temp directory is created manually and only cleaned up in a finally block inside the benchmark, so if create_indexed_transcript raises, the cleanup never runs. Use pytest's tmp_path fixture instead.
Stack: 4/4 — depends on #230. Merge #231, #229, #230, then this PR.
- `get_item()` per scored ref — one SELECT and full deserialization per match (N+1 pattern)
- Add `get_metadata_multiple` to `ISemanticRefCollection` that fetches only `semref_id`, `range_json`, `knowledge_type` in a single batch query
- Replace the N+1 loop with one `get_metadata_multiple` call at each site
- Binary search in `contains_range`, inline tuple comparisons in `TextRange`, skip pydantic validation in `get_metadata_multiple`

Call sites optimized

- `lookup_term_filtered` — batch metadata, filter by knowledge_type/range
- `lookup_property_in_property_index` — batch metadata, filter by range scope
- `SemanticRefAccumulator.group_matches_by_type` — batch metadata, group by knowledge_type
- `SemanticRefAccumulator.get_matches_in_scope` — batch metadata, filter by range scope
- `get_scored_semantic_refs_from_ordinals_iter` — two-phase: metadata filter then batch fetch

Additional optimizations

- `TextRangeCollection.contains_range`: replaced O(n) linear scan with `bisect_right` keyed on `start`, reducing scope filtering from ~25ms to ~9ms
- `TextRange`: replaced `TextLocation` allocations in `__eq__`/`__lt__`/`__contains__` with a shared `_effective_end` returning tuples
- `get_metadata_multiple`: construct `TextLocation`/`TextRange` directly from JSON instead of going through `__pydantic_validator__`

Benchmark
Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13
Query (pytest-async-benchmark pedantic, 200 rounds)
200 matches against a 200-message indexed SQLite transcript. Only the function under test is timed.
- `lookup_term_filtered`
- `group_matches_by_type`
- `get_scored_semantic_refs_from_ordinals_iter`
- `lookup_property_in_property_index`
- `get_matches_in_scope`

Reproduce the benchmark locally
```shell
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio
python -m pytest tests/benchmarks/test_benchmark_query.py -v -s
```

Generated by codeflash optimization agent