Add query EXPLAIN capture to the database adapters#883
Conversation
- Add rowsReturned/executionTime to the normalized plan, measured from the real find/count/sum the explain scope already runs (no extra EXPLAIN pass) - Report a precise per-adapter engine label (mysql/mariadb/postgres/sqlite) via SQL::getExplainEngine() instead of a generic 'sql' - Fix Postgres plan extraction to recurse into child Plans so pgvector index scans under a Limit are reported (indexUsed/rowsScanned) - Capture actual nReturned/executionTimeMillis from Mongo executionStats - Document the display-only identifier substring rewrite caveat
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds explain-capture lifecycle and sanitization to Adapter, implements engine-specific explain execution (SQL, Postgres, MariaDB, MySQL, SQLite, Mongo), wires capture into find/count/sum and Pool, exposes Database::withExplain(), and adds unit and e2e tests. ChangesQuery Explain Capture System
Sequence Diagram(s)sequenceDiagram
participant Client
participant Database
participant Adapter
participant Engine
Client->>Database: withExplain(callback)
Database->>Adapter: startExplainCapture()
Adapter->>Adapter: init $explainBuffer
Client->>Adapter: find/count/sum (captured)
Adapter->>Adapter: capturePlan(payload)
Adapter->>Engine: EXPLAIN / explain command
Engine-->>Adapter: explain result
Adapter->>Adapter: sanitizePlan()
Adapter->>Adapter: buffer.push(plan)
Adapter->>Adapter: execute real query
Adapter->>Adapter: recordPlanActuals(rows, time)
Adapter->>Adapter: buffer[last].actuals set
Database->>Adapter: stopExplainCapture()
Adapter-->>Database: captured queries
Database-->>Client: Document(queries)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds an opt-in scoped explain-capture feature to all supported adapters (MySQL, MariaDB, Postgres, SQLite, MongoDB). Callers wrap reads in
Confidence Score: 5/5Safe to merge; all three previously-flagged issues (return-value discarded, Pool re-entrancy throw, MongoDB double-execution) have been correctly addressed in this revision. The implementation is well-structured and the common non-explain path is protected by a single null check. The Pool delegation correctly guards against re-entrant capture starts, MongoDB switched from executionStats to queryPlanner avoiding double query execution, and withExplain now properly returns the callback result. The only remaining item is the SQLite indexUsed field always being null despite available data in the plan steps, which is a minor quality gap rather than a correctness issue. No files require special attention. Important Files Changed
Reviews (9): Last reviewed commit: "Sanitize embedded perms/metadata table t..." | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/unit/ExplainTest.php (1)
98-116: ⚡ Quick winExercise the “last entry” contract with more than one buffered plan.
This fixture only has one buffered query, so it still passes if
recordPlanActuals()updates the first entry or every entry. Seed two plans and assert only the tail entry getsrowsReturned/executionTime.Suggested test tightening
- $buffer->setValue($adapter, [[ - 'purpose' => 'find', - 'context' => ['collection' => 'movies'], - 'plan' => ['engine' => 'mariadb', 'rowsScanned' => 10], - ]]); + $buffer->setValue($adapter, [ + [ + 'purpose' => 'find', + 'context' => ['collection' => 'movies'], + 'plan' => ['engine' => 'mariadb', 'rowsScanned' => 10], + ], + [ + 'purpose' => 'count', + 'context' => ['collection' => 'movies'], + 'plan' => ['engine' => 'mariadb', 'rowsScanned' => 3], + ], + ]); @@ - $this->assertSame(7, $captured[0]['plan']['rowsReturned']); - $this->assertSame(1.5, $captured[0]['plan']['executionTime']); + $this->assertArrayNotHasKey('rowsReturned', $captured[0]['plan']); + $this->assertArrayNotHasKey('executionTime', $captured[0]['plan']); + $this->assertSame(7, $captured[1]['plan']['rowsReturned']); + $this->assertSame(1.5, $captured[1]['plan']['executionTime']);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/ExplainTest.php` around lines 98 - 116, The test testRecordPlanActualsFillsLastEntry should seed Adapter::explainBuffer with two plan entries (instead of one) and then call Adapter::recordPlanActuals(...) to verify it only updates the tail entry: set two arrays into the ReflectionProperty('explainBuffer') for Adapter, invoke the ReflectionMethod('recordPlanActuals') with the rowsReturned and executionTime values, then assert that the first buffered plan's plan['rowsReturned'] and plan['executionTime'] remain unchanged while the second (last) plan's fields equal the provided values; this ensures recordPlanActuals updates only the last entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Adapter/Mongo.php`:
- Around line 3782-3799: PHPStan cannot prove mutation of $found through the
recursive anonymous closure $walk; replace the closure with a pure-returning
recursive routine so control flow is explicit: either convert the closure into a
named private/static method (e.g., findIndexNameInPlan(array $node): ?string) or
change $walk to accept/return ?string (return the discovered indexName rather
than mutating $found) and then set $found = $walk($tree); ensure references to
$walk, $found and $tree are updated accordingly so the function return type lets
PHPStan infer the mutation.
In `@src/Database/Adapter/Postgres.php`:
- Around line 1590-1602: The explain block currently prepares and runs the
statement directly via $this->getPDO()->prepare(...)->execute(), bypassing
Postgres::execute() and thus skipping statement_timeout and wrapped error
handling; replace the direct prepare/execute flow so the EXPLAIN SQL is executed
through the adapter's execution path (call Postgres::execute() or the adapter's
equivalent method) passing the same $explainSql and $binds (convert double
values with getFloatPrecision where needed before passing binds), then obtain
the result (e.g. fetchColumn from the returned statement/result) and close the
cursor via the adapter-returned statement; update references to $stmt,
getPDO()->prepare, ->execute(), ->fetchColumn() and ->closeCursor() accordingly
so the EXPLAIN inherits the same timeout and error behavior as normal
statements.
In `@tests/e2e/Adapter/Scopes/ExplainTests.php`:
- Around line 84-85: json_encode() can return false on failure, causing a type
error when passed to assertStringNotContainsString; update the test to ensure
encoding succeeded before asserting: call json_encode($entry['plan']['tree'])
into $rawTree, assert that $rawTree is not false (or use
$this->assertNotFalse(json_encode(...))) and/or check json_last_error() for
JSON_ERROR_NONE, then call $this->assertStringNotContainsString('_perms',
(string)$rawTree); reference json_encode(), $rawTree, $entry['plan']['tree'],
and assertStringNotContainsString when making the change.
---
Nitpick comments:
In `@tests/unit/ExplainTest.php`:
- Around line 98-116: The test testRecordPlanActualsFillsLastEntry should seed
Adapter::explainBuffer with two plan entries (instead of one) and then call
Adapter::recordPlanActuals(...) to verify it only updates the tail entry: set
two arrays into the ReflectionProperty('explainBuffer') for Adapter, invoke the
ReflectionMethod('recordPlanActuals') with the rowsReturned and executionTime
values, then assert that the first buffered plan's plan['rowsReturned'] and
plan['executionTime'] remain unchanged while the second (last) plan's fields
equal the provided values; this ensures recordPlanActuals updates only the last
entry.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e29d09ca-e116-4d51-b618-cdbca58d2ab3
📒 Files selected for processing (12)
src/Database/Adapter.phpsrc/Database/Adapter/MariaDB.phpsrc/Database/Adapter/Mongo.phpsrc/Database/Adapter/MySQL.phpsrc/Database/Adapter/Pool.phpsrc/Database/Adapter/Postgres.phpsrc/Database/Adapter/SQL.phpsrc/Database/Adapter/SQLite.phpsrc/Database/Database.phptests/e2e/Adapter/Base.phptests/e2e/Adapter/Scopes/ExplainTests.phptests/unit/ExplainTest.php
- Mongo explainSQL: guard json_encode result against false before decode - Rewrite extractMongoIndexUsed as a returning recursion so the index walk has no dead null-comparison and is easier to follow - Guard json_encode in the explain e2e assertion - Document that withExplain returns the plan, not the callback result
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Adapter/Mongo.php`:
- Around line 3786-3794: The current traversal over $tree in
extractMongoIndexUsed() walks into all nested plans (including
queryPlanner.rejectedPlans) and can pick up IXSCAN stages from non-winning
branches; change the logic so it only follows the winning plan/execution path
instead of a blind foreach of $tree: locate the loop that iterates $tree and
instead walk the queryPlanner.winningPlan and the corresponding
executionStats.executionStages path (avoid traversing rejectedPlans or other
sibling branches) so that extractMongoIndexUsed() reports indexes only from the
winning plan (and therefore will not report IXSCAN when the winning plan is
COLLSCAN).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8a0bba38-80ad-448f-90d6-08cdf3e62c68
📒 Files selected for processing (3)
src/Database/Adapter/Mongo.phpsrc/Database/Database.phptests/e2e/Adapter/Scopes/ExplainTests.php
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/e2e/Adapter/Scopes/ExplainTests.php
- src/Database/Database.php
Drop the DTO field-table duplicated in the explainSQL docblock and the verbose recordPlanActuals/rename caveats; keep the design rationale and gotchas.
- withExplain now returns the callback's own result (matching withTenant/ withPreserveSequence); the plan is exposed via a by-ref out-param, so one call yields both rows and plan - Mongo explain uses queryPlanner verbosity instead of executionStats so it no longer executes every captured find/count/sum a second time; real rowsReturned/executionTime come from timing the actual read, as in SQL - Restrict Mongo index extraction to the winning plan so a rejected-plan IXSCAN is never reported when the query runs a COLLSCAN - Route the Postgres EXPLAIN through execute() so it inherits statement_timeout and the adapter's PDO error handling
…andling Align the base SQL and SQLite explainSQL cursor handling with the Postgres override: run the EXPLAIN via execute() so it inherits the timeout wrapper and processException mapping, and close the cursor in a finally.
Keep the result-processing inside the try and just add the finally for recordPlanActuals, instead of restructuring the method — minimises the diff.
Explain capture is wired in both SQL and Mongo, but the e2e tests gated on instanceof SQL with a stale 'only SQL' comment, so the engine-agnostic tests never ran on Mongo. Add getSupportForExplain() (false on base/Memory/Redis, true on SQL and Mongo, delegated by Pool) and gate the engine-agnostic tests on it. The find test keeps an instanceof SQL guard since it asserts the SQL-specific engine label and plan tree.
While the pool is capturing, a nested delegate() on the pinned transaction adapter — e.g. a before(find) transformation issuing its own query — re-called startExplainCapture() on the already-capturing adapter and threw 'cannot be nested', breaking withExplain + withTransaction for apps with such listeners. Only start (and own the stop/drain of) capture when the adapter isn't already capturing; nested calls accumulate into the same buffer. Adds a unit test that fails on the old behavior.
The sanitizer renamed standalone _perms/__metadata identifiers but missed them when embedded inside plan strings (e.g. a MariaDB attached_condition like "db_x_collection_y_perms.`_permission` = 'read'"), leaking the internal permission/metadata table names once the raw tree is exposed. Substring-rewrite those embedded physical-table tokens too.
What
Adds an opt-in query-plan ("explain") capability to the database adapters. A caller wraps a read in
Database::withExplain(callable); everyfind/count/sumissued inside the scope captures a normalized query plan, and the call returns aDocumentwith one entry per physical query.Design
A capture-buffer on
Adapter, not a standaloneexplain()method:startExplainCapture()/stopExplainCapture()/isExplainCapturing()toggle a nullable buffer (nesting throws).find/count/sumchecksexplainBuffer !== nulland callscapturePlan(), which runs an engine-native EXPLAIN and normalizes it.Pool::delegate()forwards the scope to the inner adapter and aggregates entries back, covering the pinned-transaction and pooled paths.The common (capture-off) path pays only a single null check.
Normalized plan
Every adapter returns the same fixed shape so a consumer DTO can stay typed:
engine,rowsScanned,indexUsed,estimatedCost,rowsReturned,executionTime, and the rawtree.rowsReturned,executionTime) are measured from the read that already runs inside the explain scope — no second EXPLAIN ANALYZE pass.mysql/mariadb/postgres/sqlite/mongo) viaSQL::getExplainEngine().Plansso an index scan under aLimit(e.g. pgvectorORDER BY emb <-> $1 LIMIT k) is reported instead of just theLimitnode.nReturned/executionTimeMillisstraight fromexecutionStats._uid,_perms,__metadata, …) are stripped from the plan for display.Tests
recordPlanActuals(fill / error-skip / no-op), Postgres child-plan recursion.Summary by CodeRabbit
New Features
Tests