fix(query): enforce ClickHouse resource caps server-side by EricAndrechek · Pull Request #335 · Wave-RF/WaveHouse

EricAndrechek · 2026-06-10T22:38:45Z

Summary

Closes #316. The native ClickHouse connection opened with no per-query Settings, so a public read's policy resource caps were only ever enforced client-side — a Go context deadline (which cancels only while the client is reading result blocks) plus a SQL LIMIT. Memory and rows scanned were never bounded server-side, so a heavy aggregation could allocate gigabytes of state or scan an entire table within the time budget; and clickhouse-go only derives max_execution_time from a context deadline > 1s, so a sub-second time cap reached ClickHouse with no server-side bound at all.

This attaches per-query clickhouse.Settings derived from the role's caps so a query can't outrun its budget during a server-side scan / merge / aggregation phase.

Where the line sits (WaveHouse vs ClickHouse)

The design draws a deliberate boundary:

WaveHouse owns the dynamic, per-role caps — they're derived from the JWT-resolved policy, which ClickHouse RBAC can't express without a credential per role, so WaveHouse sends them as per-query settings on its single connection.
ClickHouse owns the static, server-wide backstop — the limit that applies to every query regardless of role (named pipes and raw admin SQL included) lives in ClickHouse's own settings profiles and quotas. It enforces uniformly, composes with the per-role caps, and holds even against a WaveHouse bug. The docs explain how to configure it (with users.xml examples) and how the two layers compose (a readonly/<constraints> lock on WaveHouse's CH user will reject its per-query overrides).

So there's no WaveHouse global resource knob — just the per-role policy caps plus one query-shaping default (query.default_max_rows, the formerly hard-coded DefaultMaxRows = 10000, now tunable).

Enforcement

internal/api/ch_settings.go builds the per-query settings; wired into the structured-query handler:

Per-role cap	ClickHouse setting
`max_execution_time`	`max_execution_time` (fractional seconds — emitted explicitly so the sub-second case is enforced)
`max_rows`	`max_result_rows` + `result_overflow_mode=throw` (defense-in-depth behind the SQL `LIMIT`)
`max_rows_to_read`	`max_rows_to_read` + `read_overflow_mode=throw`
`max_memory_usage`	`max_memory_usage`

A query that exceeds its budget is rejected by the server (ClickHouse codes 158 / 241). Admin resolves to no caps → no settings sent (bounded only by ClickHouse's own config).

Policy field ergonomics (human-in / number-out)

The two human-scale fields accept a friendly string or a number on input and always return the canonical number on read-back, so SDKs never reimplement humanization:

max_execution_time — set as "5s"/"500ms" or a bare number of milliseconds; returned as ms.
max_memory_usage — set as "4GiB"/"512MiB" (IEC vs SI respected via dustin/go-humanize, so 4GB = 4×10⁹ ≠ 4GiB = 4×2³⁰) or a bare number of bytes; returned as bytes.

This is backed by two small Millis/ByteSize types in the policy package — no new package.

Tests

Unit: scalar round-trip (string|number in, number out, SI/IEC, sub-ms fail-closed); the cap→setting mapping; config parse + validation; the configurable builder default; policy Evaluate carry-through + negative validation.
Integration (real ClickHouse): a non-admin viewer with max_rows_to_read=1 / max_memory_usage=1 is rejected server-side (CH codes 158 / 241); the same-shape uncapped control returns all rows. Verified RED before the fix.
E2E (full SDK → WaveHouse → ClickHouse): the same enforcement on the public path.

Docs

access-control.mdx (per-role caps, human-in/number-out, "server-wide limits live in ClickHouse") and configuration.mdx (new "Server-side resource limits" section with ClickHouse profile/quota XML + the composition caution); config.yaml, the SDK RolePermissions type, and the CHANGELOG updated.

make ci green locally (unit + integration + e2e + coverage); pre-push code + docs reviewers: ship_it.

🤖 Generated with Claude Code

…-query Settings The native ClickHouse connection sent no per-query Settings, so a public structured read's policy caps only ever bound it client-side: a Go context deadline (which cancels while the client reads result blocks) plus a SQL LIMIT. Memory and rows scanned were never bounded server-side, so a heavy aggregation could allocate GBs of state or scan an entire table well within the time budget — and a sub-second max_execution_time_ms cap reached ClickHouse with no server-side bound at all (clickhouse-go only derives max_execution_time from context deadlines > 1s). Attach per-query Settings derived from the role's resolved permissions on the structured-query path so a query can't outrun its budget during a server-side scan / merge / aggregation phase: - max_execution_time <- max_execution_time_ms (fractional seconds; closes the sub-second gap the driver heuristic misses) - max_result_rows+throw <- max_rows (defense-in-depth behind the SQL LIMIT) - max_rows_to_read+throw<- new max_rows_to_read policy field (rows scanned) - max_memory_usage <- new max_memory_usage_bytes policy field (peak mem) A 0 cap means "no limit" and is omitted. The pipe path carries no per-role ResolvedPermissions and is left unchanged (separately tracked). Tests: - internal/api: unit test of the cap->setting mapping (chReadSettings) - internal/policy: Evaluate carry-through + negative validation - tests/integration: handler-level proof vs real ClickHouse — a viewer with max_rows_to_read=1 / max_memory_usage_bytes=1 is rejected (CH code 158 / 241); the uncapped control returns all rows. Verified RED before the fix. - tests/e2e: same enforcement through the full SDK -> WaveHouse -> ClickHouse path - docs: access-control.mdx resource-limits section + field table Closes #316 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-10T22:38:51Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 81ea88f6-d910-4043-9c50-176434dcdac5

📥 Commits

Reviewing files that changed from the base of the PR and between 168822e and 997e1b6.

📒 Files selected for processing (3)

docs/src/content/docs/access-control.mdx
internal/policy/policy.go
internal/policy/scalars.go

📜 Recent review details

⏰ Context from checks skipped due to timeout of 300000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Analyze (go)
GitHub Check: Coverage
GitHub Check: E2E tests

🧰 Additional context used

📓 Path-based instructions (3)

docs/src/content/docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (AGENTS.md)

docs/src/content/docs/**/*.{md,mdx}: Author Mermaid diagrams vertically (flowchart TB/TD) to fit page column width (~46–58rem); reserve LR for genuinely short chains (≤3–4 nodes)
Keep Mermaid node labels short; use
for a second line rather than one long line; lean on semantic node classes (wh, win, pain, fail, infra, neutral, store, client)
Never sit two large diagrams side-by-side; wrap comparisons in
…
to stack them vertically

Files:

docs/src/content/docs/access-control.mdx

**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Use Go 1.26 with strict formatting enforced by gofumpt
Use structured logging with log/slog (JSON handler)
Use Chi v5 for HTTP routing
Return errors, don't panic. Wrap with fmt.Errorf("context: %w", err)
Use package naming: lowercase, single word (or abbreviated). internal/ enforces module privacy
No global state: Dependencies are passed explicitly (constructor injection)
Comment the why, not the what. Add a comment only when the reason isn't obvious from the code; a line that matches the surrounding pattern needs none. Keep comments to 1–2 lines
DRY — one source of truth. Before adding logic, look for an existing helper, type, or constant to reuse; before duplicating a rule, factor it into one place every caller reads
Leave it neater than you found it — within reason. Fix small, safe things in passing: a stale comment, an obvious typo, a misnamed local, dead code on your path

Files:

internal/policy/scalars.go
internal/policy/policy.go

internal/policy/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Hasura-style access control: Policy/TablePolicy/RolePermissions types, Evaluate() engine with JWT claim templating, NATS KV store (WAVEHOUSE_POLICY). policy.IsAdmin (role == admin_role, exact case-sensitive, default "admin") is the single admin check

Files:

internal/policy/scalars.go
internal/policy/policy.go

🪛 LanguageTool

docs/src/content/docs/access-control.mdx

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...et max_memory_usage as a size string ("4GiB", "512MiB" — IEC vs SI is respected,...

(UNIT_SPACE)

[typographical] ~307-~307: In American English, use a period after an abbreviation.
Context: ...size string ("4GiB", "512MiB" — IEC vs SI is respected, so "4GB" is 4×10<sup...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ..."512MiB" — IEC vs SI is respected, so "4GB" is 4×10⁹ and "4GiB" is 4...

(UNIT_SPACE)

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...ted, so "4GB" is 4×10⁹ and "4GiB" is 4×2³⁰) or a bare number...

(UNIT_SPACE)

🔇 Additional comments (3)

internal/policy/scalars.go (1)

89-91: LGTM!

internal/policy/policy.go (1)

41-50: LGTM!

docs/src/content/docs/access-control.mdx (1)

307-307: LGTM!

📝 Walkthrough

Summary by CodeRabbit

New Features
- Server-side enforcement of per-role query resource limits: execution time, rows scanned, memory, and a configurable default max result rows (fallback 10,000).
Documentation
- Updated access-control, configuration, API, and SDK docs with new fields, formats, semantics, and enforcement caveats.
Tests
- Added/updated unit, integration, and end-to-end tests validating scalar parsing, settings translation, and ClickHouse-side enforcement.
Changelog
- Added Unreleased → Security entry describing the new caps and default-limit behavior.

Walkthrough

Derive per-query ClickHouse Settings from resolved role permissions (time, result rows, rows-to-read, memory), apply them to structured-query execution, add typed policy scalars and configurable query.default_max_rows, and update config, tests, and docs accordingly.

Changes

Server-side Query Resource Caps

Layer / File(s)	Summary
Policy scalars and limit schema `internal/policy/scalars.go`, `internal/policy/scalars_test.go`, `internal/policy/policy.go`, `internal/policy/policy_test.go`, `clients/ts/src/types.ts`	Adds `Millis` and `ByteSize` scalars; replaces `MaxExecutionTimeMs` with typed `MaxExecutionTime`, adds `MaxRowsToRead` and `MaxMemoryUsage`; updates evaluation and validation and tests; updates TS `RolePermissions`.
Configuration defaults and validation `internal/config/config.go`, `internal/config/config_test.go`, `config.yaml`	Adds `query.default_max_rows` with YAML/env wiring and default 10000; validates non-negative values; updates example config and tests.
ClickHouse settings builder `internal/api/ch_settings.go`, `internal/api/ch_settings_test.go`, `go.mod`	Adds `chQueryLimits` and `chReadSettings` to translate policy caps into ClickHouse Settings (`max_execution_time`, `max_result_rows` + overflow, `max_rows_to_read` + overflow, `max_memory_usage`); promotes `go-humanize` dependency.
Query builder with configurable default max rows `internal/query/builder.go`, `internal/query/builder_test.go`, `tests/integration/identifier_roundtrip_test.go`	`query.Build` gains `defaultMaxRows` parameter; LIMIT generation uses provided default with fallback behavior; tests updated to pass the parameter.
Structured query handler and execution wiring `internal/api/structured_query.go`, `internal/api/structured_query_test.go`, `cmd/wavehouse/main.go`	`NewStructuredQueryHandler` accepts/stores `defaultMaxRows`; handler builds `chQueryLimits` from resolved permissions, derives settings via `chReadSettings`, and applies them with `clickhouse.WithSettings` before executing SQL; switches to typed `MaxExecutionTime`.
Client types and documentation `clients/ts/src/types.ts`, `docs/src/content/docs/access-control.mdx`, `docs/src/content/docs/configuration.mdx`, `docs/src/content/docs/api.md`, `docs/src/content/docs/sdk/queries.md`, `CHANGELOG.md`	TS `RolePermissions` updated with new resource-limit fields; docs updated to describe formats, enforcement locations, and `query.default_max_rows`; changelog entry added.
Integration, E2E, and test updates `tests/integration/query_limits_test.go`, `tests/e2e/sdk/query.test.ts`, various builder/identifier tests	Adds integration test verifying server-side enforcement (ClickHouse error codes for row/memory violations); updates E2E to use new `max_execution_time` format and add regression tests for `max_rows_to_read` and `max_memory_usage`; updates many tests for new Build/handler signatures.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Wave-RF/WaveHouse#177: The main PR builds on the retrieved caching/timeout refactor of internal/api/structured_query.go by extending NewStructuredQueryHandler and the structured-query execution path to add defaultMaxRows and per-role ClickHouse Settings.
Wave-RF/WaveHouse#164: Overlaps on role-permissions/policy schema changes and ResolvedPermissions wiring.
Wave-RF/WaveHouse#172: Related edits to the RBAC/policy engine (Evaluate/validateRolePerms) that touch similar code paths.

Suggested reviewers

taitelee

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix(query): enforce ClickHouse resource caps server-side' accurately reflects the main objective—enforcing server-side resource limits via per-query ClickHouse settings.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, detailing the implementation of server-side resource cap enforcement, the boundary between WaveHouse and ClickHouse responsibilities, and testing/documentation coverage.
Linked Issues check	✅ Passed	The PR fully addresses issue `#316` by implementing per-query ClickHouse Settings enforcement for max_execution_time, max_memory_usage, max_rows_to_read, and max_result_rows, with appropriate overflow modes and tests confirming server-side rejection.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to `#316` requirements: schema updates (RolePermissions, policy scalars), new per-query settings builder (ch_settings.go), structured-query handler wiring, configuration for tunable default limits, comprehensive testing, and documentation updates.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch ch-exec-limits

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch ch-exec-limits

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-06-10T22:41:12Z

📚 Docs preview is live → https://47c50767-wavehouse-docs.wave-rf.workers.dev

Commit — 997e1b6: Merge remote-tracking branch 'origin/ch-exec-limits' into ch-exec-limits
Author — @EricAndrechek
Committed — 2026-06-12 09:26 (UTC-04:00)
Deployed — 2026-06-12 09:33 EDT

github-code-quality · 2026-06-10T23:54:31Z

Code Coverage Overview

Languages: Go

Go

The overall coverage in the ch-exec-limits branch remains at 89%, unchanged from the main branch.

Show a code coverage summary of the most impacted files.

File	main `d216c97`	ch-exec-limits `997e1b6`	+/-
`internal/api/st...ctured_query.go`	98%	98%	0%
`internal/policy/policy.go`	98%	98%	0%
`internal/query/builder.go`	95%	95%	0%
`internal/config/config.go`	94%	94%	0%
`internal/api/cl...ckhouse_exec.go`	82%	83%	+1%
`internal/policy/scalars.go`	0%	87%	+87%
`internal/api/ch_settings.go`	0%	100%	+100%

_{Updated June 12, 2026 13:34 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.}

… defaults Refines the #316 resource-cap work (per review/design discussion) before the schema ships: Human-readable, config-consistent policy fields (pre-launch rename): - max_execution_time_ms (int) -> max_execution_time (Go duration string "5s") - max_memory_usage_bytes (int) -> max_memory_usage (size string "4GiB") - max_rows_to_read stays an int Backed by a new internal/units package: Duration + ByteSize, each with a single TextMarshaler/TextUnmarshaler pair that serves JSON, YAML, and env (encoding/json and yaml.v3 both route string scalars through the Text methods; cleanenv uses TextUnmarshaler). ByteSize parses binary RAM semantics via docker/go-units (already an indirect dep; now direct): "2G" == "2GB" == "2GiB". Server-wide defaults — new query_limits config block: - default_max_rows (10000) makes the formerly hard-coded query.DefaultMaxRows a documented, tunable knob (Build now takes it as a parameter) - default_max_rows_to_read (0 = off) and default_max_memory_usage (4GiB) Applied to every NON-ADMIN read on BOTH the structured-query and named-pipe paths (closing the pipe-path gap), enforced server-side. Precedence is most-specific-wins: a per-table-per-role policy cap overrides the default; admin bypasses these DoS backstops entirely. Tests: internal/units round-trip (JSON/YAML/env); config parse + validation; configurable builder default; chReadSettings + resolveReadBudget precedence; integration cases for the global default + admin bypass (structured) and the pipe path. Docs (access-control.mdx, configuration.mdx), config.yaml, SDK RolePermissions type, and CHANGELOG updated for the new schema. Refs #316 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

# Conflicts: # go.mod

Reworks the resource-cap design (per discussion) so the WaveHouse/ClickHouse boundary is principled: WaveHouse owns the DYNAMIC, per-role/per-claim caps (sent as per-query settings on its single connection — what ClickHouse RBAC can't express without a credential per role); the STATIC, server-wide backstop lives in ClickHouse's own settings profiles and quotas, where it enforces uniformly across every query (named pipes and raw admin SQL included) and composes with the per-role caps. Changes vs the prior iteration: - Drop the WaveHouse global query_limits DoS defaults (default_max_rows_to_read, default_max_memory_usage) and the admin-bypass/resolveReadBudget machinery. Config keeps only query.default_max_rows (the result-LIMIT pagination knob). - Revert the pipe-path per-query settings — pipes are governed by ClickHouse's server-wide config + query_timeout, not per-role policy. - Delete internal/units. The two value types move into the policy package (internal/policy/scalars.go: Millis, ByteSize) and switch to "human-in, number-out": UnmarshalJSON/YAML accept a string ("5s"/"4GiB") OR a bare number (ms/bytes); no Marshaler, so reads return the canonical number and SDKs never reimplement humanization. ByteSize parses via dustin/go-humanize, so 4GB (10^9) and 4GiB (2^30) differ correctly (go-units, which conflated them, is dropped back to indirect). - Docs: new "Server-side resource limits" section in configuration.mdx with ClickHouse settings-profile / quota XML examples and the readonly/constraints composition rule; access-control.mdx points global governance at ClickHouse and documents the human-in/number-out fields. Tests: policy scalar round-trip (string|number in, number out, SI/IEC), config query.default_max_rows, integration per-role enforcement (codes 158/241). SDK type, config.yaml, CHANGELOG updated. Refs #316 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 4

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 179baf7f-b359-4a16-a526-adf4a8c2ffa2

📥 Commits

Reviewing files that changed from the base of the PR and between 26b9866 and b50ee60.

📒 Files selected for processing (22)

CHANGELOG.md
clients/ts/src/types.ts
cmd/wavehouse/main.go
config.yaml
docs/src/content/docs/access-control.mdx
docs/src/content/docs/configuration.mdx
go.mod
internal/api/ch_settings.go
internal/api/ch_settings_test.go
internal/api/structured_query.go
internal/api/structured_query_test.go
internal/config/config.go
internal/config/config_test.go
internal/policy/policy.go
internal/policy/policy_test.go
internal/policy/scalars.go
internal/policy/scalars_test.go
internal/query/builder.go
internal/query/builder_test.go
tests/e2e/sdk/query.test.ts
tests/integration/identifier_roundtrip_test.go
tests/integration/query_limits_test.go

📜 Review details

⏰ Context from checks skipped due to timeout of 300000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Coverage

🧰 Additional context used

📓 Path-based instructions (10)

**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Use Go 1.26 with strict formatting enforced by gofumpt
Use structured logging with log/slog (JSON handler)
Use Chi v5 for HTTP routing
Return errors, don't panic. Wrap with fmt.Errorf("context: %w", err)
Use package naming: lowercase, single word (or abbreviated). internal/ enforces module privacy
No global state: Dependencies are passed explicitly (constructor injection)
Comment the why, not the what. Add a comment only when the reason isn't obvious from the code; a line that matches the surrounding pattern needs none. Keep comments to 1–2 lines
DRY — one source of truth. Before adding logic, look for an existing helper, type, or constant to reuse; before duplicating a rule, factor it into one place every caller reads
Leave it neater than you found it — within reason. Fix small, safe things in passing: a stale comment, an obvious typo, a misnamed local, dead code on your path

Files:

internal/api/ch_settings.go
internal/policy/scalars.go
internal/api/ch_settings_test.go
internal/api/structured_query_test.go
tests/integration/query_limits_test.go
cmd/wavehouse/main.go
internal/policy/policy_test.go
internal/policy/policy.go
internal/config/config_test.go
tests/integration/identifier_roundtrip_test.go
internal/config/config.go
internal/policy/scalars_test.go
internal/api/structured_query.go
internal/query/builder.go
internal/query/builder_test.go

internal/api/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Chi HTTP router, JWT/JWKS middleware (from auth/), ingest/query/structured-query/SSE/schema/DLQ/policy/pipes handlers, Hub

Files:

internal/api/ch_settings.go
internal/api/ch_settings_test.go
internal/api/structured_query_test.go
internal/api/structured_query.go

internal/policy/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Hasura-style access control: Policy/TablePolicy/RolePermissions types, Evaluate() engine with JWT claim templating, NATS KV store (WAVEHOUSE_POLICY). policy.IsAdmin (role == admin_role, exact case-sensitive, default "admin") is the single admin check

Files:

internal/policy/scalars.go
internal/policy/policy_test.go
internal/policy/policy.go
internal/policy/scalars_test.go

**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Use table-driven tests with tests := []struct{ name string; ... } and t.Run(tt.name, ...)
Use shared mocks from internal/testutil/ (MockPublisher, MockCache, MockDeduplicator, MockSubscriber) instead of creating ad-hoc mocks
Use testutil.MakeJWT(t, claims) and testutil.MakeExpiredJWT(t, claims) for auth tests
Use testutil.NewTestSchemaRegistry(tables) or discovery.NewSchemaRegistryFromMap(tables) for schema-aware tests
Use policy.NewMemoryStore(p) for in-memory policy testing without NATS
Use pipes.NewMemoryStore(queries...) for in-memory pipes testing without NATS
Use testutil.AssertJSONResponse(t, rec, status, expected) and testutil.AssertJSONContains(t, rec, status, substring) for response assertions

Files:

internal/api/ch_settings_test.go
internal/api/structured_query_test.go
tests/integration/query_limits_test.go
internal/policy/policy_test.go
internal/config/config_test.go
tests/integration/identifier_roundtrip_test.go
internal/policy/scalars_test.go
internal/query/builder_test.go

docs/src/content/docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (AGENTS.md)

docs/src/content/docs/**/*.{md,mdx}: Author Mermaid diagrams vertically (flowchart TB/TD) to fit page column width (~46–58rem); reserve LR for genuinely short chains (≤3–4 nodes)
Keep Mermaid node labels short; use
for a second line rather than one long line; lean on semantic node classes (wh, win, pain, fail, infra, neutral, store, client)
Never sit two large diagrams side-by-side; wrap comparisons in
…
to stack them vertically

Files:

docs/src/content/docs/configuration.mdx
docs/src/content/docs/access-control.mdx

tests/integration/**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

Go integration tests (//go:build integration; ClickHouse testcontainer); run via make test-integration

Files:

tests/integration/query_limits_test.go
tests/integration/identifier_roundtrip_test.go

**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

TypeScript SDK (@wavehouse/sdk): zero-dep client, typed query builder, real-time SSE, live queries (incrementable/decomposable/poll aggregation), codegen CLI

Files:

clients/ts/src/types.ts
tests/e2e/sdk/query.test.ts

internal/config/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

YAML + env var config loading (cleanenv); add field with yaml, env, and env-default tags

Files:

internal/config/config_test.go
internal/config/config.go

tests/e2e/sdk/**/*.test.ts

📄 CodeRabbit inference engine (AGENTS.md)

tests/e2e/sdk/**/*.test.ts: E2E integration tests via TypeScript SDK (Vitest); each test file owns its own ClickHouse tables (clicks_/events_/users_); add suite name to SUITES in tables.ts and get names via const T = suiteTables("")
Test files run sequentially (maxWorkers: 1 in vitest.config.ts) due to shared global policy state; policy-mutating tests snapshot the full policy and restore it

Files:

tests/e2e/sdk/query.test.ts

internal/query/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Structured query AST types + SQL builder with schema validation, permission injection, timestamp bucketing. Every column reference — projection, aggregation args, filters, group_by, order_by, time_range — is authorized inside query.Build (the single chokepoint)

Files:

internal/query/builder.go
internal/query/builder_test.go

🧠 Learnings (3)

📚 Learning: 2026-05-20T01:02:00.784Z

Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 164
File: internal/api/router_test.go:289-350
Timestamp: 2026-05-20T01:02:00.784Z
Learning: In WaveHouse’s internal API tests (files matching internal/api/**/*_test.go), follow the existing separation-of-concerns convention for testing the RequireRole middleware: inject `ContextKeyRole` directly into the request `context.Context` instead of using `testutil.MakeJWT`/JWT-driven flows. Do not refactor role-gate tests to use JWT tokens—JWT parsing and token handling are covered separately in `middleware_test.go` (the dedicated JWT parsing tests), and mixing those concerns would expand the failure surface and reduce isolation.

Applied to files:

internal/api/ch_settings_test.go
internal/api/structured_query_test.go

📚 Learning: 2026-05-23T01:23:59.268Z

Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 174
File: internal/api/ingest_test.go:111-111
Timestamp: 2026-05-23T01:23:59.268Z
Learning: In WaveHouse Go tests in internal/api/**/*_test.go, use internal/testutil.AssertJSONErrorResponse(t, w) for HTTP error-path JSON assertions. Do not use (or reintroduce) package-local assertJSONErrorResponse helpers. AssertJSONErrorResponse verifies the response Content-Type is application/json, includes the X-Content-Type-Options: nosniff header, and that the JSON body contains an "error" field.

Applied to files:

internal/api/ch_settings_test.go
internal/api/structured_query_test.go

📚 Learning: 2026-06-10T15:01:09.027Z

Learnt from: EricAndrechek
Repo: Wave-RF/WaveHouse PR: 312
File: docs/src/content/docs/development.md:0-0
Timestamp: 2026-06-10T15:01:09.027Z
Learning: In this repo’s Markdown review (all .md files), do not flag capitalization/style issues for literal paths starting with ".github/" (or any substring that is a path beginning with ".github/"). Treat ".github" as the correct lowercase dotfile directory name, even when it appears inside prose or code spans; automated checks such as LanguageTool’s "(GITHUB)" rule commonly produce false positives for this literal filesystem path.

Applied to files:

CHANGELOG.md

🪛 LanguageTool

docs/src/content/docs/access-control.mdx

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...et max_memory_usage as a size string ("4GiB", "512MiB" — IEC vs SI is respected,...

(UNIT_SPACE)

[typographical] ~307-~307: In American English, use a period after an abbreviation.
Context: ...size string ("4GiB", "512MiB" — IEC vs SI is respected, so "4GB" is 4×10⁹ an...

(MISSING_PERIOD_AFTER_ABBREVIATION)

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ..."512MiB" — IEC vs SI is respected, so "4GB" is 4×10⁹ and "4GiB" is 4×2³⁰) or a ...

(UNIT_SPACE)

[typographical] ~307-~307: Insert a space between the numerical value and the unit symbol.
Context: ...I is respected, so "4GB" is 4×10⁹ and "4GiB" is 4×2³⁰) or a bare number of **bytes...

(UNIT_SPACE)

[typographical] ~506-~506: Insert a space between the numerical value and the unit symbol.
Context: ...(ClickHouse max_memory_usage). Set as "4GiB"/"512MiB" or a number of bytes; retu...

(UNIT_SPACE)

[typographical] ~506-~506: Insert a space between the numerical value and the unit symbol.
Context: ...se max_memory_usage). Set as "4GiB"/"512MiB" or a number of bytes; returned as byt...

(UNIT_SPACE)

[style] ~506-~506: Consider shortening this phrase to avoid wordiness.
Context: ..._usage). Set as "4GiB"/"512MiB"or a number of bytes; returned as bytes.0` = no role...

(A_NUMBER_OF)

🔇 Additional comments (23)

clients/ts/src/types.ts (1)

215-231: LGTM!

docs/src/content/docs/access-control.mdx (6)

296-308: LGTM!

309-334: LGTM!

338-341: LGTM!

348-348: LGTM!

Also applies to: 352-352, 355-356

432-432: LGTM!

Also applies to: 461-461

503-506: LGTM!

docs/src/content/docs/configuration.mdx (4)

52-52: LGTM!

54-60: LGTM!

62-100: LGTM!

223-224: LGTM!

Also applies to: 292-292

CHANGELOG.md (1)

26-26: LGTM!

internal/policy/policy.go (1)

41-51: LGTM!

Also applies to: 74-76, 168-170, 436-443

internal/api/ch_settings_test.go (1)

1-117: LGTM!

internal/query/builder.go (1)

15-20: LGTM!

Also applies to: 48-48, 126-137

internal/query/builder_test.go (1)

31-875: LGTM!

tests/integration/identifier_roundtrip_test.go (1)

151-151: LGTM!

Also applies to: 185-185, 210-210, 235-235, 265-265, 271-271

internal/config/config.go (1)

32-46: LGTM!

Also applies to: 204-210

internal/config/config_test.go (1)

70-106: LGTM!

internal/api/structured_query.go (1)

11-12: LGTM!

Also applies to: 30-30, 48-48, 58-58, 117-117, 173-175, 180-197

internal/api/structured_query_test.go (1)

43-43: LGTM!

Also applies to: 295-295

cmd/wavehouse/main.go (1)

384-384: LGTM!

tests/integration/query_limits_test.go (1)

1-118: LGTM!

…durations Address CodeRabbit review on #335: - scalars.go: reject non-scalar YAML nodes (mapping/sequence) for the Millis/ByteSize cap fields — their empty Value read as 0 and silently disabled the cap instead of failing validation. - scalars.go: widen the sub-millisecond guard from `d > 0` to `d != 0`. Milliseconds() truncates toward zero, so "-500us" collapsed to 0 and slipped past negative-cap validation downstream. - scalars_test.go: convert TestScalars_YAML to table-driven and pin regressions for both edge cases above. - config.yaml: correct the default_max_rows comment — any non-positive value (not just 0) falls back to the built-in 10000. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Address docs-reviewer findings on #335: this PR turned the result-LIMIT cap into the tunable `query.default_max_rows` knob, so docs that still described it as a hard-coded 10,000 `DefaultMaxRows` constant were stale. - api.md: `limit` is capped at the configured `query.default_max_rows`. - sdk/queries.md: same — server enforces the configured maximum. - configuration.mdx: note a negative `default_max_rows` is rejected at startup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

docs-reviewer [MAY] on #335: the §Resource limits intro said `0`/unset means only server-wide ClickHouse limits apply, but for `max_rows` the `query.default_max_rows` result default (10,000) still clamps the LIMIT. The field-reference row already stated this; make the intro agree. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…o comments The IEC/SI byte-size note used Unicode superscript characters (4×10⁹, 4×2³⁰) that render inconsistently. Use proper <sup> tags in the rendered MDX, and plain ASCII ^ notation in the Go doc comments (which are read as source text, not HTML). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-project-automation Bot added this to WaveHouse Task Board Jun 10, 2026

github-project-automation Bot moved this to Backlog in WaveHouse Task Board Jun 10, 2026

EricAndrechek moved this from Backlog to In progress in WaveHouse Task Board Jun 10, 2026

EricAndrechek moved this from In progress to In review in WaveHouse Task Board Jun 10, 2026

Merge branch 'main' into ch-exec-limits

c2ceffb

EricAndrechek and others added 4 commits June 10, 2026 20:05

Merge branch 'main' into ch-exec-limits

8e2fe8e

Merge branch 'main' into ch-exec-limits

9871844

Merge remote-tracking branch 'origin/ch-exec-limits' into ch-exec-limits

465ca40

# Conflicts: # go.mod

github-actions Bot added dependencies Pull requests that update a dependency file area/query Structured query AST, SQL builder area/infra CI, build, deploy, Docker, release labels Jun 11, 2026

coderabbitai Bot requested changes Jun 11, 2026

View reviewed changes

Comment thread config.yaml

Comment thread internal/policy/scalars_test.go

Comment thread internal/policy/scalars.go

Comment thread internal/policy/scalars.go

EricAndrechek and others added 3 commits June 11, 2026 15:16

coderabbitai Bot previously approved these changes Jun 11, 2026

View reviewed changes

EricAndrechek marked this pull request as ready for review June 11, 2026 19:32

EricAndrechek requested a review from a team June 11, 2026 19:32

EricAndrechek requested a review from taitelee June 11, 2026 19:32

EricAndrechek and others added 3 commits June 12, 2026 08:14

Merge branch 'main' into ch-exec-limits

168822e

Merge remote-tracking branch 'origin/ch-exec-limits' into ch-exec-limits

997e1b6

EricAndrechek dismissed coderabbitai[bot]’s stale review via 997e1b6 June 12, 2026 13:31

coderabbitai Bot approved these changes Jun 12, 2026

View reviewed changes

taitelee moved this from In review to In progress in WaveHouse Task Board Jun 12, 2026

taitelee self-assigned this Jun 12, 2026

taitelee approved these changes Jun 12, 2026

View reviewed changes

EricAndrechek added this pull request to the merge queue Jun 12, 2026

Merged via the queue into main with commit 61cba87 Jun 12, 2026
20 checks passed

EricAndrechek deleted the ch-exec-limits branch June 12, 2026 18:10

github-project-automation Bot moved this from In progress to Done in WaveHouse Task Board Jun 12, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(query): enforce ClickHouse resource caps server-side#335

fix(query): enforce ClickHouse resource caps server-side#335
EricAndrechek merged 13 commits into
mainfrom
ch-exec-limits

EricAndrechek commented Jun 10, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 10, 2026 •

edited

Loading

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

github-code-quality Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

EricAndrechek commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Where the line sits (WaveHouse vs ClickHouse)

Enforcement

Policy field ergonomics (human-in / number-out)

Tests

Docs

Uh oh!

coderabbitai Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-code-quality Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage Overview

Go

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

EricAndrechek commented Jun 10, 2026 •

edited

Loading

coderabbitai Bot commented Jun 10, 2026 •

edited

Loading

github-actions Bot commented Jun 10, 2026 •

edited

Loading

github-code-quality Bot commented Jun 10, 2026 •

edited

Loading