feat(query): pluggable family-based plan rendering — registry, routing, VictoriaMetrics + ClickHouse#67
feat(query): pluggable family-based plan rendering — registry, routing, VictoriaMetrics + ClickHouse#67qiansheng91 wants to merge 4 commits into
Conversation
Replace the hardcoded switch(storage.Kind) in build{Metric,Log}StorageQuery with
a renderer registry, mirroring how GraphStore providers sit behind a contract —
the one part of the query path that was not yet provider-ised.
- New internal/query/planrender package: Renderer interface, a normalized Request
(superset of get_metrics/get_logs inputs), and a Registry keyed by
(storage kind, method) with later-registration-wins override semantics.
- The existing prometheus and elasticsearch rendering moves behind two built-in
family renderers (label-timeseries, document-search) that wrap the existing
functions unchanged; the Executor holds an injected registry.
- Unknown storage kinds keep the existing passthrough, byte-identical.
Zero behavior change: the plan envelope and rendered query block are unchanged
(the full existing suite passes unmodified). Renderers are pure string builders
with no storage dependencies, so no build tags.
Phase 2 (separate PR): shared constraint IR + ResolveConstraints, an optional
storage family field for config-driven backends, and a new backend to prove
zero-code extension.
…ase 2a) Make storage->renderer routing family-based so a new backend that shares an existing query family needs no Go code, only configuration. - planrender.Renderer drops Handles(kind, method) for Family() + SupportsMethod: a renderer declares its query-model archetype and the methods it serves, not the storage products it knows about. - Registry.Find is keyed by (family, method); an empty family never matches. - The query package owns the kind->family bridge: defaultFamilyForKind maps the built-in kinds (prometheus/aliyun_prometheus -> label-timeseries, elasticsearch -> document-search). A storage overrides it via spec.family. - Both executor dispatches resolve family = spec.family || defaultFamilyForKind(kind) then Find(family, method); unknown families keep the existing passthrough. Zero behavior change for existing kinds (resolved family is identical; full suite passes). New: a victoriametrics-style storage renders a prometheus_promql plan via the label-timeseries renderer purely by declaring spec.family.
Demonstrate the family-routing extension point with a real, validated backend: VictoriaMetrics is PromQL-compatible, so it renders through the existing label-timeseries family by declaring spec.family — no Go renderer code. - schemas/core/storage/victoriametrics.schema.yaml: the new kind (PromQL-style spec + a family selector). Go/Python/Java SDK types and the expanded/embedded schemas are generated by 'make expand'; schema docs by 'make docs-schema'. - examples/extensible-storage/: a validated VM storage definition (spec.family: label-timeseries) + README walking the zero-code path. - registry_test.go: the manifest now carries 24 kinds.
… constraint IR (Phase 2b)
Prove the family registry against a genuinely different query syntax: SQL.
- ResolveConstraints (internal/query/constraints.go) extracts entity binding +
filter parsing into a syntax-neutral Constraint IR {Field, Op, Values}, reusing
the existing filter parser and field mappers. The sql-table renderer consumes
it; the Prometheus/Elasticsearch renderers are untouched (they can migrate to
the IR in a behavior-preserving follow-up).
- sqlTableRenderer (internal/query/sql_renderer.go) renders get_metrics as
ClickHouse SQL: a SELECT with the IR formatted as a WHERE clause, downsampled
with toStartOfInterval + avg, and {from}/{to} placeholders the caller fills
from the plan's time_range.
- clickhouse kind (schema-only; SDKs and docs generated) declares family: sql-table.
- examples/extensible-storage/ now shows both paths: VictoriaMetrics (same family,
pure config) and ClickHouse (new family, one renderer then config).
Existing behavior unchanged (full suite passes); new tests cover the IR, SQL
formatting, and end-to-end config-only routing to the new family.
Mr-Xzz
left a comment
There was a problem hiding this comment.
Reviewed the plan-renderer registry and SQL-table renderer. The registry direction looks good, but I found two blocking issues before this is safe to merge:
-
Unsupported/raw filters are silently omitted from the generated SQL.
ResolveConstraintsreturns unsupported filters inraw, butsqlTableRenderer.Renderonly echoes them asraw_filters;buildSQLMetricQueryreceives onlywhere, so the actualqueries[].sqlcan drop parts ofsearch_filter,data_filter,entity_query, or methodquery. That can make downstream execution much broader than requested. Please either render all accepted filter shapes into SQL or fail the SQL renderer whenrawis non-empty. If returning an error, the executor should not silently fall back to passthrough, because that also hides an unsafe plan-render failure. -
SQL identifiers and interval tokens are interpolated without validation/quoting.
table,time_column,value_column,metric_column, and constraint field names come from schema/mapping config, whilestepcan come from the query parameter. Values are quoted, but identifiers andINTERVALtext are not, so a malformed mapping or step can produce injectable SQL in the generated plan. Please validate/quote identifiers for the ClickHouse dialect and parsestepinto a strict numeric duration + allowlisted unit before formatting it.
Once those are fixed, I am happy to re-review the updated patch.
Makes EntitySet plan generation pluggable, so the storage / query-syntax layer is no longer a fixed switch. External stores and query languages are many; this layer should extend without a code change for each one. Four commits build the mechanism and prove it against two new backends.
1. Family registry
Replace the hardcoded
switch (storage.Kind)in the executor with aplanrender.Registry, mirroring how GraphStore providers sit behind a contract — the one part of the query path that wasn't yet provider-ised. Prometheus and Elasticsearch rendering move behind two built-in family renderers (label-timeseries,document-search) that wrap the existing functions unchanged. Renderers are pure string builders (the OSS service returns a plan, never connecting to storage), so there are no heavy deps and no build tags.2. Family-based routing
Route by family, not storage kind.
planrender.RendererdropsHandles(kind, method)forFamily()+SupportsMethod(method).Registry.Findis keyed by(family, method). The query package owns the kind→family bridge (defaultFamilyForKind), and a storage overrides it by declaringspec.family.3. VictoriaMetrics — same family, configuration only
VictoriaMetrics speaks the Prometheus query API, so it shares
label-timeseries. Declaringkind: victoriametrics+spec.family: label-timeseriesrenders it through the existing renderer with no Go code.4. A
sql-tablefamily + ClickHouse — a different query syntaxClickHouse uses SQL, a query model entirely different from PromQL. This is where the shared IR earns its keep:
ResolveConstraintsextracts entity binding + filter parsing into a syntax-neutralConstraint{Field, Op, Values}IR, reusing the existing filter parser and field mappers. Each family just formats the resolved constraints.sql-tablerenderer formats them as a SQLWHEREclause and rendersget_metricsasSELECT toStartOfInterval(…), avg(…) … GROUP BY …, with{from}/{to}placeholders the caller fills from the plan'stime_range(mirroring how the PromQL plan pairs a query with a separate time range).ResolveConstraintsin a behavior-preserving follow-up.Adding the
victoriametricsandclickhousekinds was schema-only — the Go / Python / Java SDK types and reference docs are generated bymake expandandmake docs-schema.examples/extensible-storage/is a validated walkthrough of both paths (same-family config, and new-family one-renderer-then-config).Zero behavior change
Existing kinds resolve to the same family and render byte-identical output — the full pre-existing suite passes unmodified. New tests cover the registry, family resolution, the constraint IR, SQL formatting, and end-to-end config-only routing to both new families (
TestFamilyOverrideRoutesNewBackendWithoutCode,TestClickHouseRoutesViaSQLFamilyWithoutCode).Verified:
go test ./...(21 pkgs, 0 fail) ·make guard verify-go test-service example-validate check-manifest schemas-embed-check docs-schema-check build-service.Scope
The executor side is untouched — the
dialect/queryshape is byte-identical, so existing plan consumers are unaffected. Migrating the Prometheus/Elasticsearch renderers ontoResolveConstraintsis a natural behavior-preserving follow-up.