-
Notifications
You must be signed in to change notification settings - Fork 16.3k
refactor: Migrates the MCP execute_sql tool to use the SQL execution API
#36739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
refactor: Migrates the MCP execute_sql tool to use the SQL execution API
#36739
Conversation
…abase.execute()` API
|
@aminghadersohi @betodealmeida @villebro @Antonio-RiveroMartnez I'll open separate threads for some questions that I have regarding this consolidation. Please answer in each thread so we don't lose track 😉 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #36739 +/- ##
===========================================
+ Coverage 0 68.26% +68.26%
===========================================
Files 0 638 +638
Lines 0 47520 +47520
Branches 0 5180 +5180
===========================================
+ Hits 0 32438 +32438
- Misses 0 13804 +13804
- Partials 0 1278 +1278
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| ) | ||
|
|
||
| # 2. Build QueryOptions | ||
| options = QueryOptions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming we want all queries from MCP tools to be audited so I kept the default behavior of adding entries to the query table. Let me know if there's any case where we don't want this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auditing by default makes sense for MCP queries. good for compliance and debugging AI agent behavior. we can add a config flag later if performance becomes a concern at high volume.
| timeout_seconds=request.timeout, | ||
| template_params=request.template_params, | ||
| dry_run=request.dry_run, | ||
| cache=CacheOptions(force_refresh=True), # No caching for MCP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, query caching is disabled. Is this what we want for MCP tools or should queries be cached? I know that MCP has a middleware that caches tool calls but this is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given the rate at which some clients can call the mcp with the same "list datasets" call i would say we need caching at this layer to save the database from melting. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. We have the following options:
- Pass a specific timeout for MCP tools with
CacheOptions.timeout - Use the per-database setting with
database.cache_timeout - Use the default config
CACHE_DEFAULT_TIMEOUT= 300 seconds (5 minutes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe default to a short TTL (30-60s) as a middle ground
can tools op out if they need?
the MCP middleware caching operates at a different layer (request/response), so DB query caching still adds value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could:
- Enable caching by default in the tool
- Add a
force_refreshfield toExecuteSqlRequestwith clear instructions to LLMs:
force_refresh: bool = Field(
default=False,
description=(
"Bypass cache and re-execute query. "
"IMPORTANT: Only set to true when the user EXPLICITLY requests "
"fresh/updated data (e.g., 'refresh', 'get latest', 're-run'). "
"Default to false to reduce database load."
),
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I applied the above changes on 0d68701
| ) | ||
| parameters: dict[str, Any] | None = Field( | ||
| None, description="Parameters for query substitution" | ||
| template_params: dict[str, Any] | None = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced the { parameter } syntax with Jinja {{ parameter }} as I'm assuming that would be ok. Let me know if you see a problem with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just leave a note about the breaking change in UPDATING.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aminghadersohi The MCP is not part of an official release yet and UPDATING.md is reserved for when you are upgrading between versions. By the way, we should remove this documentation from UPDATING.md. This is more appropriate for Release Notes or a page in the Developer Portal. @aminghadersohi Could you open a PR to remove those docs from UPDATING.md?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @sadpandajoe as the release manager.
execute_sql tool to use the unified Database.execute() APIexecute_sql tool to use the SQL execution API
villebro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Noticed what's likely minor artifact from earlier iterations, but other than that looks good to go.
| if TYPE_CHECKING: | ||
| pass | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if TYPE_CHECKING: | |
| pass |
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| from typing import Any, TYPE_CHECKING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| from typing import Any, TYPE_CHECKING | |
| from typing import Any |
Summary
Migrates the MCP
execute_sqltool to use the unifiedDatabase.execute()API introduced in #36529. This consolidates SQL execution code and fixes a critical security gap where Row-Level Security (RLS) was not being applied to MCP queries.Key Changes
sql_lab_utils.pyandexecute_sql_core.py(~300 lines), replaced with directDatabase.execute()call{placeholder}syntax is removed. Use Jinja2{{ placeholder }}syntax withtemplate_params.cataloganddry_runparametersFeature Comparison
allow_dmlsettingtemplate_paramsfieldstatementsfield in responseforce_refreshdatabase_idsqlschemacataloglimittimeouttemplate_paramsdry_runTESTING INSTRUCTIONS
execute_sqlunit tests passADDITIONAL INFORMATION