Skip to content

fix(query): enforce ClickHouse resource caps server-side#335

Merged
EricAndrechek merged 13 commits into
mainfrom
ch-exec-limits
Jun 12, 2026
Merged

fix(query): enforce ClickHouse resource caps server-side#335
EricAndrechek merged 13 commits into
mainfrom
ch-exec-limits

Conversation

@EricAndrechek

@EricAndrechek EricAndrechek commented Jun 10, 2026

Copy link
Copy Markdown
Member

Summary

Closes #316. The native ClickHouse connection opened with no per-query Settings, so a public read's policy resource caps were only ever enforced client-side — a Go context deadline (which cancels only while the client is reading result blocks) plus a SQL LIMIT. Memory and rows scanned were never bounded server-side, so a heavy aggregation could allocate gigabytes of state or scan an entire table within the time budget; and clickhouse-go only derives max_execution_time from a context deadline > 1s, so a sub-second time cap reached ClickHouse with no server-side bound at all.

This attaches per-query clickhouse.Settings derived from the role's caps so a query can't outrun its budget during a server-side scan / merge / aggregation phase.

Where the line sits (WaveHouse vs ClickHouse)

The design draws a deliberate boundary:

  • WaveHouse owns the dynamic, per-role caps — they're derived from the JWT-resolved policy, which ClickHouse RBAC can't express without a credential per role, so WaveHouse sends them as per-query settings on its single connection.
  • ClickHouse owns the static, server-wide backstop — the limit that applies to every query regardless of role (named pipes and raw admin SQL included) lives in ClickHouse's own settings profiles and quotas. It enforces uniformly, composes with the per-role caps, and holds even against a WaveHouse bug. The docs explain how to configure it (with users.xml examples) and how the two layers compose (a readonly/<constraints> lock on WaveHouse's CH user will reject its per-query overrides).

So there's no WaveHouse global resource knob — just the per-role policy caps plus one query-shaping default (query.default_max_rows, the formerly hard-coded DefaultMaxRows = 10000, now tunable).

Enforcement

internal/api/ch_settings.go builds the per-query settings; wired into the structured-query handler:

Per-role cap ClickHouse setting
max_execution_time max_execution_time (fractional seconds — emitted explicitly so the sub-second case is enforced)
max_rows max_result_rows + result_overflow_mode=throw (defense-in-depth behind the SQL LIMIT)
max_rows_to_read max_rows_to_read + read_overflow_mode=throw
max_memory_usage max_memory_usage

A query that exceeds its budget is rejected by the server (ClickHouse codes 158 / 241). Admin resolves to no caps → no settings sent (bounded only by ClickHouse's own config).

Policy field ergonomics (human-in / number-out)

The two human-scale fields accept a friendly string or a number on input and always return the canonical number on read-back, so SDKs never reimplement humanization:

  • max_execution_time — set as "5s"/"500ms" or a bare number of milliseconds; returned as ms.
  • max_memory_usage — set as "4GiB"/"512MiB" (IEC vs SI respected via dustin/go-humanize, so 4GB = 4×10⁹ ≠ 4GiB = 4×2³⁰) or a bare number of bytes; returned as bytes.

This is backed by two small Millis/ByteSize types in the policy package — no new package.

Tests

  • Unit: scalar round-trip (string|number in, number out, SI/IEC, sub-ms fail-closed); the cap→setting mapping; config parse + validation; the configurable builder default; policy Evaluate carry-through + negative validation.
  • Integration (real ClickHouse): a non-admin viewer with max_rows_to_read=1 / max_memory_usage=1 is rejected server-side (CH codes 158 / 241); the same-shape uncapped control returns all rows. Verified RED before the fix.
  • E2E (full SDK → WaveHouse → ClickHouse): the same enforcement on the public path.

Docs

access-control.mdx (per-role caps, human-in/number-out, "server-wide limits live in ClickHouse") and configuration.mdx (new "Server-side resource limits" section with ClickHouse profile/quota XML + the composition caution); config.yaml, the SDK RolePermissions type, and the CHANGELOG updated.

make ci green locally (unit + integration + e2e + coverage); pre-push code + docs reviewers: ship_it.

🤖 Generated with Claude Code

…-query Settings

The native ClickHouse connection sent no per-query Settings, so a public
structured read's policy caps only ever bound it client-side: a Go context
deadline (which cancels while the client reads result blocks) plus a SQL
LIMIT. Memory and rows scanned were never bounded server-side, so a heavy
aggregation could allocate GBs of state or scan an entire table well within
the time budget — and a sub-second max_execution_time_ms cap reached
ClickHouse with no server-side bound at all (clickhouse-go only derives
max_execution_time from context deadlines > 1s).

Attach per-query Settings derived from the role's resolved permissions on the
structured-query path so a query can't outrun its budget during a server-side
scan / merge / aggregation phase:

  - max_execution_time    <- max_execution_time_ms (fractional seconds; closes
                             the sub-second gap the driver heuristic misses)
  - max_result_rows+throw <- max_rows (defense-in-depth behind the SQL LIMIT)
  - max_rows_to_read+throw<- new max_rows_to_read policy field (rows scanned)
  - max_memory_usage      <- new max_memory_usage_bytes policy field (peak mem)

A 0 cap means "no limit" and is omitted. The pipe path carries no per-role
ResolvedPermissions and is left unchanged (separately tracked).

Tests:
  - internal/api: unit test of the cap->setting mapping (chReadSettings)
  - internal/policy: Evaluate carry-through + negative validation
  - tests/integration: handler-level proof vs real ClickHouse — a viewer with
    max_rows_to_read=1 / max_memory_usage_bytes=1 is rejected (CH code 158 /
    241); the uncapped control returns all rows. Verified RED before the fix.
  - tests/e2e: same enforcement through the full SDK -> WaveHouse -> ClickHouse path
  - docs: access-control.mdx resource-limits section + field table

Closes #316

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 81ea88f6-d910-4043-9c50-176434dcdac5

📥 Commits

Reviewing files that changed from the base of the PR and between 168822e and 997e1b6.

📒 Files selected for processing (3)
  • docs/src/content/docs/access-control.mdx
  • internal/policy/policy.go
  • internal/policy/scalars.go
📜 Recent review details
⏰ Context from checks skipped due to timeout of 300000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (go)
  • GitHub Check: Coverage
  • GitHub Check: E2E tests
🧰 Additional context used
📓 Path-based instructions (3)
docs/src/content/docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (AGENTS.md)

docs/src/content/docs/**/*.{md,mdx}: Author Mermaid diagrams vertically (flowchart TB/TD) to fit page column width (~46–58rem); reserve LR for genuinely short chains (≤3–4 nodes)
Keep Mermaid node labels short; use
for a second line rather than one long line; lean on semantic node classes (wh, win, pain, fail, infra, neutral, store, client)
Never sit two large diagrams side-by-side; wrap comparisons in

to stack them vertically

Files:

  • docs/src/content/docs/access-control.mdx
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Use Go 1.26 with strict formatting enforced by gofumpt
Use structured logging with log/slog (JSON handler)
Use Chi v5 for HTTP routing
Return errors, don't panic. Wrap with fmt.Errorf("context: %w", err)
Use package naming: lowercase, single word (or abbreviated). internal/ enforces module privacy
No global state: Dependencies are passed explicitly (constructor injection)
Comment the why, not the what. Add a comment only when the reason isn't obvious from the code; a line that matches the surrounding pattern needs none. Keep comments to 1–2 lines
DRY — one source of truth. Before adding logic, look for an existing helper, type, or constant to reuse; before duplicating a rule, factor it into one place every caller reads
Leave it neater than you found it — within reason. Fix small, safe things in passing: a stale comment, an obvious typo, a misnamed local, dead code on your path

Files:

  • internal/policy/scalars.go
  • internal/policy/policy.go
internal/policy/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Hasura-style access control: Policy/TablePolicy/RolePermissions types, Evaluate() engine with JWT claim templating, NATS KV store (WAVEHOUSE_POLICY). policy.IsAdmin (role == admin_role, exact case-sensitive, default "admin") is the single admin check

Files:

  • internal/policy/scalars.go
  • internal/policy/policy.go
🪛 LanguageTool
docs/src/content/docs/access-control.mdx

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...et max_memory_usage as a size string ("4GiB", "512MiB" — IEC vs SI is respected,...

(UNIT_SPACE)


[typographical] ~307-~307: In American English, use a period after an abbreviation.
Context: ...size string ("4GiB", "512MiB" — IEC vs SI is respected, so "4GB" is 4×10<sup...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ..."512MiB" — IEC vs SI is respected, so "4GB" is 4×109 and "4GiB" is 4...

(UNIT_SPACE)


[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...ted, so "4GB" is 4×109 and "4GiB" is 4×230) or a bare number...

(UNIT_SPACE)

🔇 Additional comments (3)
internal/policy/scalars.go (1)

89-91: LGTM!

internal/policy/policy.go (1)

41-50: LGTM!

docs/src/content/docs/access-control.mdx (1)

307-307: LGTM!


📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Server-side enforcement of per-role query resource limits: execution time, rows scanned, memory, and a configurable default max result rows (fallback 10,000).
  • Documentation

    • Updated access-control, configuration, API, and SDK docs with new fields, formats, semantics, and enforcement caveats.
  • Tests

    • Added/updated unit, integration, and end-to-end tests validating scalar parsing, settings translation, and ClickHouse-side enforcement.
  • Changelog

    • Added Unreleased → Security entry describing the new caps and default-limit behavior.

Walkthrough

Derive per-query ClickHouse Settings from resolved role permissions (time, result rows, rows-to-read, memory), apply them to structured-query execution, add typed policy scalars and configurable query.default_max_rows, and update config, tests, and docs accordingly.

Changes

Server-side Query Resource Caps

Layer / File(s) Summary
Policy scalars and limit schema
internal/policy/scalars.go, internal/policy/scalars_test.go, internal/policy/policy.go, internal/policy/policy_test.go, clients/ts/src/types.ts
Adds Millis and ByteSize scalars; replaces MaxExecutionTimeMs with typed MaxExecutionTime, adds MaxRowsToRead and MaxMemoryUsage; updates evaluation and validation and tests; updates TS RolePermissions.
Configuration defaults and validation
internal/config/config.go, internal/config/config_test.go, config.yaml
Adds query.default_max_rows with YAML/env wiring and default 10000; validates non-negative values; updates example config and tests.
ClickHouse settings builder
internal/api/ch_settings.go, internal/api/ch_settings_test.go, go.mod
Adds chQueryLimits and chReadSettings to translate policy caps into ClickHouse Settings (max_execution_time, max_result_rows + overflow, max_rows_to_read + overflow, max_memory_usage); promotes go-humanize dependency.
Query builder with configurable default max rows
internal/query/builder.go, internal/query/builder_test.go, tests/integration/identifier_roundtrip_test.go
query.Build gains defaultMaxRows parameter; LIMIT generation uses provided default with fallback behavior; tests updated to pass the parameter.
Structured query handler and execution wiring
internal/api/structured_query.go, internal/api/structured_query_test.go, cmd/wavehouse/main.go
NewStructuredQueryHandler accepts/stores defaultMaxRows; handler builds chQueryLimits from resolved permissions, derives settings via chReadSettings, and applies them with clickhouse.WithSettings before executing SQL; switches to typed MaxExecutionTime.
Client types and documentation
clients/ts/src/types.ts, docs/src/content/docs/access-control.mdx, docs/src/content/docs/configuration.mdx, docs/src/content/docs/api.md, docs/src/content/docs/sdk/queries.md, CHANGELOG.md
TS RolePermissions updated with new resource-limit fields; docs updated to describe formats, enforcement locations, and query.default_max_rows; changelog entry added.
Integration, E2E, and test updates
tests/integration/query_limits_test.go, tests/e2e/sdk/query.test.ts, various builder/identifier tests
Adds integration test verifying server-side enforcement (ClickHouse error codes for row/memory violations); updates E2E to use new max_execution_time format and add regression tests for max_rows_to_read and max_memory_usage; updates many tests for new Build/handler signatures.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Wave-RF/WaveHouse#177: The main PR builds on the retrieved caching/timeout refactor of internal/api/structured_query.go by extending NewStructuredQueryHandler and the structured-query execution path to add defaultMaxRows and per-role ClickHouse Settings.
  • Wave-RF/WaveHouse#164: Overlaps on role-permissions/policy schema changes and ResolvedPermissions wiring.
  • Wave-RF/WaveHouse#172: Related edits to the RBAC/policy engine (Evaluate/validateRolePerms) that touch similar code paths.

Suggested reviewers

  • taitelee
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(query): enforce ClickHouse resource caps server-side' accurately reflects the main objective—enforcing server-side resource limits via per-query ClickHouse settings.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, detailing the implementation of server-side resource cap enforcement, the boundary between WaveHouse and ClickHouse responsibilities, and testing/documentation coverage.
Linked Issues check ✅ Passed The PR fully addresses issue #316 by implementing per-query ClickHouse Settings enforcement for max_execution_time, max_memory_usage, max_rows_to_read, and max_result_rows, with appropriate overflow modes and tests confirming server-side rejection.
Out of Scope Changes check ✅ Passed All changes are directly scoped to #316 requirements: schema updates (RolePermissions, policy scalars), new per-query settings builder (ch_settings.go), structured-query handler wiring, configuration for tunable default limits, comprehensive testing, and documentation updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ch-exec-limits
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch ch-exec-limits

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added documentation Improvements or additions to documentation go Pull requests that update go code area/api HTTP handlers, routing, middleware area/policy Access control policies (Hasura-style) area/sdk TypeScript SDK (clients/ts/) area/docs Documentation, site/, README labels Jun 10, 2026
@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown

📚 Docs preview is livehttps://47c50767-wavehouse-docs.wave-rf.workers.dev

  • Commit997e1b6: Merge remote-tracking branch 'origin/ch-exec-limits' into ch-exec-limits
  • Author@EricAndrechek
  • Committed — 2026-06-12 09:26 (UTC-04:00)
  • Deployed — 2026-06-12 09:33 EDT

@EricAndrechek EricAndrechek moved this from Backlog to In progress in WaveHouse Task Board Jun 10, 2026
@EricAndrechek EricAndrechek moved this from In progress to In review in WaveHouse Task Board Jun 10, 2026
@github-code-quality

github-code-quality Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Overview

Languages: Go

Go

The overall coverage in the branch remains at 89%, unchanged from the branch.

Show a code coverage summary of the most impacted files.
File d216c97 997e1b6 +/-
internal/api/st...ctured_query.go 98% 98% 0%
internal/policy/policy.go 98% 98% 0%
internal/query/builder.go 95% 95% 0%
internal/config/config.go 94% 94% 0%
internal/api/cl...ckhouse_exec.go 82% 83% +1%
internal/policy/scalars.go 0% 87% +87%
internal/api/ch_settings.go 0% 100% +100%

Updated June 12, 2026 13:34 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

EricAndrechek and others added 4 commits June 10, 2026 20:05
… defaults

Refines the #316 resource-cap work (per review/design discussion) before the
schema ships:

Human-readable, config-consistent policy fields (pre-launch rename):
  - max_execution_time_ms (int) -> max_execution_time (Go duration string "5s")
  - max_memory_usage_bytes (int) -> max_memory_usage (size string "4GiB")
  - max_rows_to_read stays an int
Backed by a new internal/units package: Duration + ByteSize, each with a single
TextMarshaler/TextUnmarshaler pair that serves JSON, YAML, and env (encoding/json
and yaml.v3 both route string scalars through the Text methods; cleanenv uses
TextUnmarshaler). ByteSize parses binary RAM semantics via docker/go-units
(already an indirect dep; now direct): "2G" == "2GB" == "2GiB".

Server-wide defaults — new query_limits config block:
  - default_max_rows (10000) makes the formerly hard-coded query.DefaultMaxRows
    a documented, tunable knob (Build now takes it as a parameter)
  - default_max_rows_to_read (0 = off) and default_max_memory_usage (4GiB)
Applied to every NON-ADMIN read on BOTH the structured-query and named-pipe
paths (closing the pipe-path gap), enforced server-side. Precedence is
most-specific-wins: a per-table-per-role policy cap overrides the default;
admin bypasses these DoS backstops entirely.

Tests: internal/units round-trip (JSON/YAML/env); config parse + validation;
configurable builder default; chReadSettings + resolveReadBudget precedence;
integration cases for the global default + admin bypass (structured) and the
pipe path. Docs (access-control.mdx, configuration.mdx), config.yaml, SDK
RolePermissions type, and CHANGELOG updated for the new schema.

Refs #316

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added dependencies Pull requests that update a dependency file area/query Structured query AST, SQL builder area/infra CI, build, deploy, Docker, release labels Jun 11, 2026
Reworks the resource-cap design (per discussion) so the WaveHouse/ClickHouse
boundary is principled: WaveHouse owns the DYNAMIC, per-role/per-claim caps
(sent as per-query settings on its single connection — what ClickHouse RBAC
can't express without a credential per role); the STATIC, server-wide backstop
lives in ClickHouse's own settings profiles and quotas, where it enforces
uniformly across every query (named pipes and raw admin SQL included) and
composes with the per-role caps.

Changes vs the prior iteration:
- Drop the WaveHouse global query_limits DoS defaults (default_max_rows_to_read,
  default_max_memory_usage) and the admin-bypass/resolveReadBudget machinery.
  Config keeps only query.default_max_rows (the result-LIMIT pagination knob).
- Revert the pipe-path per-query settings — pipes are governed by ClickHouse's
  server-wide config + query_timeout, not per-role policy.
- Delete internal/units. The two value types move into the policy package
  (internal/policy/scalars.go: Millis, ByteSize) and switch to "human-in,
  number-out": UnmarshalJSON/YAML accept a string ("5s"/"4GiB") OR a bare
  number (ms/bytes); no Marshaler, so reads return the canonical number and
  SDKs never reimplement humanization. ByteSize parses via dustin/go-humanize,
  so 4GB (10^9) and 4GiB (2^30) differ correctly (go-units, which conflated
  them, is dropped back to indirect).
- Docs: new "Server-side resource limits" section in configuration.mdx with
  ClickHouse settings-profile / quota XML examples and the readonly/constraints
  composition rule; access-control.mdx points global governance at ClickHouse
  and documents the human-in/number-out fields.

Tests: policy scalar round-trip (string|number in, number out, SI/IEC),
config query.default_max_rows, integration per-role enforcement (codes
158/241). SDK type, config.yaml, CHANGELOG updated.

Refs #316

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 179baf7f-b359-4a16-a526-adf4a8c2ffa2

📥 Commits

Reviewing files that changed from the base of the PR and between 26b9866 and b50ee60.

📒 Files selected for processing (22)
  • CHANGELOG.md
  • clients/ts/src/types.ts
  • cmd/wavehouse/main.go
  • config.yaml
  • docs/src/content/docs/access-control.mdx
  • docs/src/content/docs/configuration.mdx
  • go.mod
  • internal/api/ch_settings.go
  • internal/api/ch_settings_test.go
  • internal/api/structured_query.go
  • internal/api/structured_query_test.go
  • internal/config/config.go
  • internal/config/config_test.go
  • internal/policy/policy.go
  • internal/policy/policy_test.go
  • internal/policy/scalars.go
  • internal/policy/scalars_test.go
  • internal/query/builder.go
  • internal/query/builder_test.go
  • tests/e2e/sdk/query.test.ts
  • tests/integration/identifier_roundtrip_test.go
  • tests/integration/query_limits_test.go
📜 Review details
⏰ Context from checks skipped due to timeout of 300000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Coverage
🧰 Additional context used
📓 Path-based instructions (10)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Use Go 1.26 with strict formatting enforced by gofumpt
Use structured logging with log/slog (JSON handler)
Use Chi v5 for HTTP routing
Return errors, don't panic. Wrap with fmt.Errorf("context: %w", err)
Use package naming: lowercase, single word (or abbreviated). internal/ enforces module privacy
No global state: Dependencies are passed explicitly (constructor injection)
Comment the why, not the what. Add a comment only when the reason isn't obvious from the code; a line that matches the surrounding pattern needs none. Keep comments to 1–2 lines
DRY — one source of truth. Before adding logic, look for an existing helper, type, or constant to reuse; before duplicating a rule, factor it into one place every caller reads
Leave it neater than you found it — within reason. Fix small, safe things in passing: a stale comment, an obvious typo, a misnamed local, dead code on your path

Files:

  • internal/api/ch_settings.go
  • internal/policy/scalars.go
  • internal/api/ch_settings_test.go
  • internal/api/structured_query_test.go
  • tests/integration/query_limits_test.go
  • cmd/wavehouse/main.go
  • internal/policy/policy_test.go
  • internal/policy/policy.go
  • internal/config/config_test.go
  • tests/integration/identifier_roundtrip_test.go
  • internal/config/config.go
  • internal/policy/scalars_test.go
  • internal/api/structured_query.go
  • internal/query/builder.go
  • internal/query/builder_test.go
internal/api/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Chi HTTP router, JWT/JWKS middleware (from auth/), ingest/query/structured-query/SSE/schema/DLQ/policy/pipes handlers, Hub

Files:

  • internal/api/ch_settings.go
  • internal/api/ch_settings_test.go
  • internal/api/structured_query_test.go
  • internal/api/structured_query.go
internal/policy/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Hasura-style access control: Policy/TablePolicy/RolePermissions types, Evaluate() engine with JWT claim templating, NATS KV store (WAVEHOUSE_POLICY). policy.IsAdmin (role == admin_role, exact case-sensitive, default "admin") is the single admin check

Files:

  • internal/policy/scalars.go
  • internal/policy/policy_test.go
  • internal/policy/policy.go
  • internal/policy/scalars_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Use table-driven tests with tests := []struct{ name string; ... } and t.Run(tt.name, ...)
Use shared mocks from internal/testutil/ (MockPublisher, MockCache, MockDeduplicator, MockSubscriber) instead of creating ad-hoc mocks
Use testutil.MakeJWT(t, claims) and testutil.MakeExpiredJWT(t, claims) for auth tests
Use testutil.NewTestSchemaRegistry(tables) or discovery.NewSchemaRegistryFromMap(tables) for schema-aware tests
Use policy.NewMemoryStore(p) for in-memory policy testing without NATS
Use pipes.NewMemoryStore(queries...) for in-memory pipes testing without NATS
Use testutil.AssertJSONResponse(t, rec, status, expected) and testutil.AssertJSONContains(t, rec, status, substring) for response assertions

Files:

  • internal/api/ch_settings_test.go
  • internal/api/structured_query_test.go
  • tests/integration/query_limits_test.go
  • internal/policy/policy_test.go
  • internal/config/config_test.go
  • tests/integration/identifier_roundtrip_test.go
  • internal/policy/scalars_test.go
  • internal/query/builder_test.go
docs/src/content/docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (AGENTS.md)

docs/src/content/docs/**/*.{md,mdx}: Author Mermaid diagrams vertically (flowchart TB/TD) to fit page column width (~46–58rem); reserve LR for genuinely short chains (≤3–4 nodes)
Keep Mermaid node labels short; use
for a second line rather than one long line; lean on semantic node classes (wh, win, pain, fail, infra, neutral, store, client)
Never sit two large diagrams side-by-side; wrap comparisons in

to stack them vertically

Files:

  • docs/src/content/docs/configuration.mdx
  • docs/src/content/docs/access-control.mdx
tests/integration/**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

Go integration tests (//go:build integration; ClickHouse testcontainer); run via make test-integration

Files:

  • tests/integration/query_limits_test.go
  • tests/integration/identifier_roundtrip_test.go
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

TypeScript SDK (@wavehouse/sdk): zero-dep client, typed query builder, real-time SSE, live queries (incrementable/decomposable/poll aggregation), codegen CLI

Files:

  • clients/ts/src/types.ts
  • tests/e2e/sdk/query.test.ts
internal/config/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

YAML + env var config loading (cleanenv); add field with yaml, env, and env-default tags

Files:

  • internal/config/config_test.go
  • internal/config/config.go
tests/e2e/sdk/**/*.test.ts

📄 CodeRabbit inference engine (AGENTS.md)

tests/e2e/sdk/**/*.test.ts: E2E integration tests via TypeScript SDK (Vitest); each test file owns its own ClickHouse tables (clicks_/events_/users_); add suite name to SUITES in tables.ts and get names via const T = suiteTables("")
Test files run sequentially (maxWorkers: 1 in vitest.config.ts) due to shared global policy state; policy-mutating tests snapshot the full policy and restore it

Files:

  • tests/e2e/sdk/query.test.ts
internal/query/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Structured query AST types + SQL builder with schema validation, permission injection, timestamp bucketing. Every column reference — projection, aggregation args, filters, group_by, order_by, time_range — is authorized inside query.Build (the single chokepoint)

Files:

  • internal/query/builder.go
  • internal/query/builder_test.go
🧠 Learnings (3)
📚 Learning: 2026-05-20T01:02:00.784Z
Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 164
File: internal/api/router_test.go:289-350
Timestamp: 2026-05-20T01:02:00.784Z
Learning: In WaveHouse’s internal API tests (files matching internal/api/**/*_test.go), follow the existing separation-of-concerns convention for testing the RequireRole middleware: inject `ContextKeyRole` directly into the request `context.Context` instead of using `testutil.MakeJWT`/JWT-driven flows. Do not refactor role-gate tests to use JWT tokens—JWT parsing and token handling are covered separately in `middleware_test.go` (the dedicated JWT parsing tests), and mixing those concerns would expand the failure surface and reduce isolation.

Applied to files:

  • internal/api/ch_settings_test.go
  • internal/api/structured_query_test.go
📚 Learning: 2026-05-23T01:23:59.268Z
Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 174
File: internal/api/ingest_test.go:111-111
Timestamp: 2026-05-23T01:23:59.268Z
Learning: In WaveHouse Go tests in internal/api/**/*_test.go, use internal/testutil.AssertJSONErrorResponse(t, w) for HTTP error-path JSON assertions. Do not use (or reintroduce) package-local assertJSONErrorResponse helpers. AssertJSONErrorResponse verifies the response Content-Type is application/json, includes the X-Content-Type-Options: nosniff header, and that the JSON body contains an "error" field.

Applied to files:

  • internal/api/ch_settings_test.go
  • internal/api/structured_query_test.go
📚 Learning: 2026-06-10T15:01:09.027Z
Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 312
File: docs/src/content/docs/development.md:0-0
Timestamp: 2026-06-10T15:01:09.027Z
Learning: In this repo’s Markdown review (all .md files), do not flag capitalization/style issues for literal paths starting with ".github/" (or any substring that is a path beginning with ".github/"). Treat ".github" as the correct lowercase dotfile directory name, even when it appears inside prose or code spans; automated checks such as LanguageTool’s "(GITHUB)" rule commonly produce false positives for this literal filesystem path.

Applied to files:

  • CHANGELOG.md
🪛 LanguageTool
docs/src/content/docs/access-control.mdx

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...et max_memory_usage as a size string ("4GiB", "512MiB" — IEC vs SI is respected,...

(UNIT_SPACE)


[typographical] ~307-~307: In American English, use a period after an abbreviation.
Context: ...size string ("4GiB", "512MiB" — IEC vs SI is respected, so "4GB" is 4×10⁹ an...

(MISSING_PERIOD_AFTER_ABBREVIATION)


[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ..."512MiB" — IEC vs SI is respected, so "4GB" is 4×10⁹ and "4GiB" is 4×2³⁰) or a ...

(UNIT_SPACE)


[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...I is respected, so "4GB" is 4×10⁹ and "4GiB" is 4×2³⁰) or a bare number of **bytes...

(UNIT_SPACE)


[typographical] ~506-~506: Insert a space between the numerical value and the unit symbol.
Context: ...(ClickHouse max_memory_usage). Set as "4GiB"/"512MiB" or a number of bytes; retu...

(UNIT_SPACE)


[typographical] ~506-~506: Insert a space between the numerical value and the unit symbol.
Context: ...se max_memory_usage). Set as "4GiB"/"512MiB" or a number of bytes; returned as byt...

(UNIT_SPACE)


[style] ~506-~506: Consider shortening this phrase to avoid wordiness.
Context: ..._usage). Set as "4GiB"/"512MiB"or a number of bytes; returned as bytes.0` = no role...

(A_NUMBER_OF)

🔇 Additional comments (23)
clients/ts/src/types.ts (1)

215-231: LGTM!

docs/src/content/docs/access-control.mdx (6)

296-308: LGTM!


309-334: LGTM!


338-341: LGTM!


348-348: LGTM!

Also applies to: 352-352, 355-356


432-432: LGTM!

Also applies to: 461-461


503-506: LGTM!

docs/src/content/docs/configuration.mdx (4)

52-52: LGTM!


54-60: LGTM!


62-100: LGTM!


223-224: LGTM!

Also applies to: 292-292

CHANGELOG.md (1)

26-26: LGTM!

internal/policy/policy.go (1)

41-51: LGTM!

Also applies to: 74-76, 168-170, 436-443

internal/api/ch_settings_test.go (1)

1-117: LGTM!

internal/query/builder.go (1)

15-20: LGTM!

Also applies to: 48-48, 126-137

internal/query/builder_test.go (1)

31-875: LGTM!

tests/integration/identifier_roundtrip_test.go (1)

151-151: LGTM!

Also applies to: 185-185, 210-210, 235-235, 265-265, 271-271

internal/config/config.go (1)

32-46: LGTM!

Also applies to: 204-210

internal/config/config_test.go (1)

70-106: LGTM!

internal/api/structured_query.go (1)

11-12: LGTM!

Also applies to: 30-30, 48-48, 58-58, 117-117, 173-175, 180-197

internal/api/structured_query_test.go (1)

43-43: LGTM!

Also applies to: 295-295

cmd/wavehouse/main.go (1)

384-384: LGTM!

tests/integration/query_limits_test.go (1)

1-118: LGTM!

Comment thread config.yaml
Comment thread internal/policy/scalars_test.go
Comment thread internal/policy/scalars.go
Comment thread internal/policy/scalars.go
EricAndrechek and others added 3 commits June 11, 2026 15:16
…durations

Address CodeRabbit review on #335:

- scalars.go: reject non-scalar YAML nodes (mapping/sequence) for the
  Millis/ByteSize cap fields — their empty Value read as 0 and silently
  disabled the cap instead of failing validation.
- scalars.go: widen the sub-millisecond guard from `d > 0` to `d != 0`.
  Milliseconds() truncates toward zero, so "-500us" collapsed to 0 and
  slipped past negative-cap validation downstream.
- scalars_test.go: convert TestScalars_YAML to table-driven and pin
  regressions for both edge cases above.
- config.yaml: correct the default_max_rows comment — any non-positive
  value (not just 0) falls back to the built-in 10000.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address docs-reviewer findings on #335: this PR turned the result-LIMIT
cap into the tunable `query.default_max_rows` knob, so docs that still
described it as a hard-coded 10,000 `DefaultMaxRows` constant were stale.

- api.md: `limit` is capped at the configured `query.default_max_rows`.
- sdk/queries.md: same — server enforces the configured maximum.
- configuration.mdx: note a negative `default_max_rows` is rejected at startup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs-reviewer [MAY] on #335: the §Resource limits intro said `0`/unset
means only server-wide ClickHouse limits apply, but for `max_rows` the
`query.default_max_rows` result default (10,000) still clamps the LIMIT.
The field-reference row already stated this; make the intro agree.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 11, 2026
@EricAndrechek EricAndrechek marked this pull request as ready for review June 11, 2026 19:32
@EricAndrechek EricAndrechek requested a review from a team June 11, 2026 19:32
@EricAndrechek EricAndrechek requested a review from taitelee June 11, 2026 19:32
EricAndrechek and others added 3 commits June 12, 2026 08:14
…o comments

The IEC/SI byte-size note used Unicode superscript characters (4×10⁹,
4×2³⁰) that render inconsistently. Use proper <sup> tags in the rendered
MDX, and plain ASCII ^ notation in the Go doc comments (which are read as
source text, not HTML).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@taitelee taitelee moved this from In review to In progress in WaveHouse Task Board Jun 12, 2026
@taitelee taitelee self-assigned this Jun 12, 2026
@EricAndrechek EricAndrechek added this pull request to the merge queue Jun 12, 2026
Merged via the queue into main with commit 61cba87 Jun 12, 2026
20 checks passed
@EricAndrechek EricAndrechek deleted the ch-exec-limits branch June 12, 2026 18:10
@github-project-automation github-project-automation Bot moved this from In progress to Done in WaveHouse Task Board Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api HTTP handlers, routing, middleware area/docs Documentation, site/, README area/infra CI, build, deploy, Docker, release area/policy Access control policies (Hasura-style) area/query Structured query AST, SQL builder area/sdk TypeScript SDK (clients/ts/) dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation go Pull requests that update go code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

security(query): ClickHouse resource caps not enforced server-side — send per-query Settings (max_execution_time/max_memory_usage/max_rows_to_read)

2 participants