Skip to content

feat(query): pluggable family-based plan rendering — registry, routing, VictoriaMetrics + ClickHouse#67

Open
qiansheng91 wants to merge 4 commits into
mainfrom
codex/plan-renderer-registry
Open

feat(query): pluggable family-based plan rendering — registry, routing, VictoriaMetrics + ClickHouse#67
qiansheng91 wants to merge 4 commits into
mainfrom
codex/plan-renderer-registry

Conversation

@qiansheng91

@qiansheng91 qiansheng91 commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Makes EntitySet plan generation pluggable, so the storage / query-syntax layer is no longer a fixed switch. External stores and query languages are many; this layer should extend without a code change for each one. Four commits build the mechanism and prove it against two new backends.

1. Family registry

Replace the hardcoded switch (storage.Kind) in the executor with a planrender.Registry, mirroring how GraphStore providers sit behind a contract — the one part of the query path that wasn't yet provider-ised. Prometheus and Elasticsearch rendering move behind two built-in family renderers (label-timeseries, document-search) that wrap the existing functions unchanged. Renderers are pure string builders (the OSS service returns a plan, never connecting to storage), so there are no heavy deps and no build tags.

2. Family-based routing

Route by family, not storage kind. planrender.Renderer drops Handles(kind, method) for Family() + SupportsMethod(method). Registry.Find is keyed by (family, method). The query package owns the kind→family bridge (defaultFamilyForKind), and a storage overrides it by declaring spec.family.

3. VictoriaMetrics — same family, configuration only

VictoriaMetrics speaks the Prometheus query API, so it shares label-timeseries. Declaring kind: victoriametrics + spec.family: label-timeseries renders it through the existing renderer with no Go code.

4. A sql-table family + ClickHouse — a different query syntax

ClickHouse uses SQL, a query model entirely different from PromQL. This is where the shared IR earns its keep:

  • ResolveConstraints extracts entity binding + filter parsing into a syntax-neutral Constraint{Field, Op, Values} IR, reusing the existing filter parser and field mappers. Each family just formats the resolved constraints.
  • The sql-table renderer formats them as a SQL WHERE clause and renders get_metrics as SELECT toStartOfInterval(…), avg(…) … GROUP BY …, with {from}/{to} placeholders the caller fills from the plan's time_range (mirroring how the PromQL plan pairs a query with a separate time range).
  • The Prometheus/Elasticsearch renderers are untouched — they can migrate to ResolveConstraints in a behavior-preserving follow-up.

Adding the victoriametrics and clickhouse kinds was schema-only — the Go / Python / Java SDK types and reference docs are generated by make expand and make docs-schema. examples/extensible-storage/ is a validated walkthrough of both paths (same-family config, and new-family one-renderer-then-config).

Zero behavior change

Existing kinds resolve to the same family and render byte-identical output — the full pre-existing suite passes unmodified. New tests cover the registry, family resolution, the constraint IR, SQL formatting, and end-to-end config-only routing to both new families (TestFamilyOverrideRoutesNewBackendWithoutCode, TestClickHouseRoutesViaSQLFamilyWithoutCode).

Verified: go test ./... (21 pkgs, 0 fail) · make guard verify-go test-service example-validate check-manifest schemas-embed-check docs-schema-check build-service.

Scope

The executor side is untouched — the dialect / query shape is byte-identical, so existing plan consumers are unaffected. Migrating the Prometheus/Elasticsearch renderers onto ResolveConstraints is a natural behavior-preserving follow-up.

Replace the hardcoded switch(storage.Kind) in build{Metric,Log}StorageQuery with
a renderer registry, mirroring how GraphStore providers sit behind a contract —
the one part of the query path that was not yet provider-ised.

- New internal/query/planrender package: Renderer interface, a normalized Request
  (superset of get_metrics/get_logs inputs), and a Registry keyed by
  (storage kind, method) with later-registration-wins override semantics.
- The existing prometheus and elasticsearch rendering moves behind two built-in
  family renderers (label-timeseries, document-search) that wrap the existing
  functions unchanged; the Executor holds an injected registry.
- Unknown storage kinds keep the existing passthrough, byte-identical.

Zero behavior change: the plan envelope and rendered query block are unchanged
(the full existing suite passes unmodified). Renderers are pure string builders
with no storage dependencies, so no build tags.

Phase 2 (separate PR): shared constraint IR + ResolveConstraints, an optional
storage family field for config-driven backends, and a new backend to prove
zero-code extension.
@qiansheng91 qiansheng91 added the enhancement New feature or request label Jun 17, 2026
…ase 2a)

Make storage->renderer routing family-based so a new backend that shares an
existing query family needs no Go code, only configuration.

- planrender.Renderer drops Handles(kind, method) for Family() + SupportsMethod:
  a renderer declares its query-model archetype and the methods it serves, not
  the storage products it knows about.
- Registry.Find is keyed by (family, method); an empty family never matches.
- The query package owns the kind->family bridge: defaultFamilyForKind maps the
  built-in kinds (prometheus/aliyun_prometheus -> label-timeseries, elasticsearch
  -> document-search). A storage overrides it via spec.family.
- Both executor dispatches resolve family = spec.family || defaultFamilyForKind(kind)
  then Find(family, method); unknown families keep the existing passthrough.

Zero behavior change for existing kinds (resolved family is identical; full suite
passes). New: a victoriametrics-style storage renders a prometheus_promql plan via
the label-timeseries renderer purely by declaring spec.family.
Demonstrate the family-routing extension point with a real, validated backend:
VictoriaMetrics is PromQL-compatible, so it renders through the existing
label-timeseries family by declaring spec.family — no Go renderer code.

- schemas/core/storage/victoriametrics.schema.yaml: the new kind (PromQL-style
  spec + a family selector). Go/Python/Java SDK types and the expanded/embedded
  schemas are generated by 'make expand'; schema docs by 'make docs-schema'.
- examples/extensible-storage/: a validated VM storage definition
  (spec.family: label-timeseries) + README walking the zero-code path.
- registry_test.go: the manifest now carries 24 kinds.
@qiansheng91 qiansheng91 changed the title refactor(query): pluggable plan-renderer family registry (Phase 1, zero behavior change) feat(query): pluggable plan rendering — family registry + routing + a config-only VictoriaMetrics backend Jun 17, 2026
… constraint IR (Phase 2b)

Prove the family registry against a genuinely different query syntax: SQL.

- ResolveConstraints (internal/query/constraints.go) extracts entity binding +
  filter parsing into a syntax-neutral Constraint IR {Field, Op, Values}, reusing
  the existing filter parser and field mappers. The sql-table renderer consumes
  it; the Prometheus/Elasticsearch renderers are untouched (they can migrate to
  the IR in a behavior-preserving follow-up).
- sqlTableRenderer (internal/query/sql_renderer.go) renders get_metrics as
  ClickHouse SQL: a SELECT with the IR formatted as a WHERE clause, downsampled
  with toStartOfInterval + avg, and {from}/{to} placeholders the caller fills
  from the plan's time_range.
- clickhouse kind (schema-only; SDKs and docs generated) declares family: sql-table.
- examples/extensible-storage/ now shows both paths: VictoriaMetrics (same family,
  pure config) and ClickHouse (new family, one renderer then config).

Existing behavior unchanged (full suite passes); new tests cover the IR, SQL
formatting, and end-to-end config-only routing to the new family.
@qiansheng91 qiansheng91 changed the title feat(query): pluggable plan rendering — family registry + routing + a config-only VictoriaMetrics backend feat(query): pluggable family-based plan rendering — registry, routing, VictoriaMetrics + ClickHouse Jun 17, 2026
@qiansheng91 qiansheng91 self-assigned this Jun 18, 2026
@qiansheng91 qiansheng91 requested a review from Mr-Xzz June 18, 2026 02:26

@Mr-Xzz Mr-Xzz left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the plan-renderer registry and SQL-table renderer. The registry direction looks good, but I found two blocking issues before this is safe to merge:

  1. Unsupported/raw filters are silently omitted from the generated SQL. ResolveConstraints returns unsupported filters in raw, but sqlTableRenderer.Render only echoes them as raw_filters; buildSQLMetricQuery receives only where, so the actual queries[].sql can drop parts of search_filter, data_filter, entity_query, or method query. That can make downstream execution much broader than requested. Please either render all accepted filter shapes into SQL or fail the SQL renderer when raw is non-empty. If returning an error, the executor should not silently fall back to passthrough, because that also hides an unsafe plan-render failure.

  2. SQL identifiers and interval tokens are interpolated without validation/quoting. table, time_column, value_column, metric_column, and constraint field names come from schema/mapping config, while step can come from the query parameter. Values are quoted, but identifiers and INTERVAL text are not, so a malformed mapping or step can produce injectable SQL in the generated plan. Please validate/quote identifiers for the ClickHouse dialect and parse step into a strict numeric duration + allowlisted unit before formatting it.

Once those are fixed, I am happy to re-review the updated patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants