Skip to content

Experimental postgres query command (PR 1/4: scaffold)#5135

Merged
simonfaltum merged 12 commits intomainfrom
simonfaltum/postgres-query-pr1-scaffold
May 5, 2026
Merged

Experimental postgres query command (PR 1/4: scaffold)#5135
simonfaltum merged 12 commits intomainfrom
simonfaltum/postgres-query-pr1-scaffold

Conversation

@simonfaltum
Copy link
Copy Markdown
Member

@simonfaltum simonfaltum commented Apr 30, 2026

PR Stack

  1. PR 1 (this PR)#5135 — scaffold + autoscaling targeting + text output
  2. #5136 — PR 2: provisioned + JSON/CSV streaming + typed values
  3. #5138 — PR 3: multi-input + multi-statement rejection + error formatting
  4. #5143 — PR 4: cancellation + timeout + TUI

Why

Talking to Lakebase Postgres from a script today goes through one of two unpleasant paths:

  1. Shell out to databricks psql -- -c "SQL". Works on macOS / Linux when psql is installed. Fails on Windows 11 by default and on minimal containers / sandboxed CI. No JSON / CSV.
  2. Write Python with psycopg. Forces every consumer to manage OAuth tokens, SSL mode, autocommit, etc.

This series adds a third path: a native CLI command that runs SQL against any Lakebase endpoint, returns results in text/JSON/CSV (later PRs), and works without a system psql.

databricks psql keeps owning the interactive REPL surface; this PR does not touch psql.

Changes

Before: No CLI command runs SQL against Lakebase from Go. Users either shell out to psql (requires the system binary) or write psycopg glue.

Now: databricks experimental postgres query --target projects/foo/branches/main/endpoints/primary "SELECT 1" returns a text-rendered result. Provisioned instances and richer output formats land in follow-up PRs.

The experimental command is fully contained under experimental/postgres/cmd/:

  • experimental/postgres/cmd/cmd.go, query.go, targeting.go, connect.go, execute.go, render.go — command implementation.
  • experimental/postgres/cmd/internal/target/ — Lakebase target resolution helpers (parsing, SDK wrappers, auto-select-when-exactly-one). Internal sub-package so it can't accidentally be imported from outside the experiment. When/if this command graduates from experimental, that's the right time to consider extracting to libs/.

Single positional SQL, autoscaling targeting only (--target, --project, --branch, --endpoint), --max-retries, --connect-timeout, --database. Driver is github.com/jackc/pgx/v5 v5.9.1 (MIT). Connect retry uses a typed predicate (08xxx SQLSTATE family + 57P03 cannot_connect_now + net.OpError with Op == "dial"); auth (28xxx) and permission (42501) errors do not retry. Text rendering is buffered (no streaming yet); rows-producing vs command-only is decided at runtime via FieldDescriptions().

Outside the experimental tree, this PR only:

  • Registers the command in cmd/experimental/experimental.go (2 lines).
  • Adds the pgx direct dependency (go.mod SPDX annotation, NOTICE entry, NEXT_CHANGELOG.md dependency-updates entry).

pgx is already a direct dep of the universe monorepo's Lakebase services; aligning here keeps the SDK surface consistent.

Test plan

  • go test ./experimental/postgres/... (target parsing, validateTargeting, retry classification, render)
  • go test ./internal/build/... (license + NOTICE completeness)
  • go tool ... golangci-lint run ./experimental/postgres/... (0 issues)
  • ./task checks (whitespace, links, deadcode)

Move the Postgres autoscaling and provisioned target-resolution helpers
out of cmd/psql/ into a shared package so a second consumer (the new
experimental postgres query command, in a follow-up commit) can reuse
the same SDK shapes. cmd/psql keeps its interactive UX by wrapping the
shared AutoSelect* helpers with errors.As fallbacks on AmbiguousError.

No behavior change for cmd/psql; existing acceptance tests pass.

Co-authored-by: Isaac
Scaffolds 'databricks experimental postgres query', a scriptable SQL
runner against a Lakebase Postgres autoscaling endpoint that does not
require a system psql binary.

This PR ships the smallest useful slice:

  - Single positional SQL statement.
  - --target (autoscaling resource path), --project, --branch,
    --endpoint targeting forms; provisioned-shaped targets return a
    pointer at 'databricks psql' for now.
  - Connect retry on idle/waking endpoints (08xxx SQLSTATE family,
    dial errors).
  - Text output (static table for rows-producing statements, command
    tag for command-only).

Provisioned support, JSON/CSV streaming output, multi-statement input,
cancellation, and integration tests come in follow-up PRs.

Driver: github.com/jackc/pgx/v5 v5.9.1 (MIT). Already a direct dep of
the universe monorepo's Lakebase services; aligning here keeps the SDK
surface consistent.

Co-authored-by: Isaac
- Replace exported PathSegmentProjects/ExtractID with focused helpers
  (ProjectIDFromName/BranchIDFromName/EndpointIDFromName); keeps SDK
  literals out of call sites.
- Type AmbiguousError.Kind as a typed enum (KindProject/Branch/Endpoint/Instance)
  so producers and the pluralisation switch stay in sync.
- Stop setting Choice.DisplayName when it equals the ID; Error() relies on
  empty-suppression rather than mixed empty/equal-to-ID checks.
- Add 57P03 (cannot_connect_now) to the connect-retry allow-list. Postgres
  emits this during server startup and Lakebase autoscaling can plausibly
  return it during the wake-up handshake. Tests exercise 57P03/57P01/57014
  to lock the boundary.
- Require --branch when --endpoint is set. The auto-select-then-look-up
  flow produces confusing errors when the auto-selected branch does not
  contain the requested endpoint, and this command is non-interactive so
  asking the user to be explicit is friendlier.
- Reject --max-retries < 1 explicitly instead of silently clamping. Help
  text already advertised the constraint; matching it at validation time
  is consistent with the repo's "reject incompatible inputs early" rule.
- Harmonise the "endpoint is not ready" error in cmd/psql to include the
  state, matching the experimental command and giving operators something
  to act on.
- Restore comments removed during the cmd/psql refactor and add a
  breadcrumb at the GetProvisioned call site about the Name patch.
- Add doc comments to AutoSelect* helpers documenting the returned
  string shape (trailing ID for autoscaling vs full name for provisioned).
- Reject trailing components after endpoint in ParseAutoscalingPath; new
  acceptance test in cmd/psql exercises this.
- Drop dead GroupID: "" assignment.

Co-authored-by: Isaac
- Fix selectAmbiguous: fall back to ID when DisplayName is empty.
  Round-1 fix to Choice semantics left producers emitting empty
  DisplayName for branches/endpoints/instances; the psql interactive
  selector passed that straight to cmdio.Tuple.Name and rendered
  blank rows. Add the documented fallback.
- Drop unused BranchIDFromName / EndpointIDFromName exports; only
  ProjectIDFromName has callers in this PR. Re-add when first consumed.
- Convert chained ifs in isRetryableConnectError to a switch.

Co-authored-by: Isaac
…imental

Two reductions for an experimental command, per a maintainer comment:

- Move libs/lakebase/target into experimental/postgres/cmd/internal/target
  so the experiment is self-contained. cmd/psql is no longer touched (no
  refactor, no behavior change). When/if this command graduates from
  experimental, that's the right time to extract the shared package.
- Drop acceptance tests for the new command. Aitools (the other
  experimental command) has none either; locking down user-visible
  wording for an experimental surface is overinvestment. Unit tests
  still cover argument validation, retry classification, and rendering.
  Acceptance tests can be added when the command graduates.

Net diff on cmd/psql is now zero. The experiment lives entirely under
experimental/postgres/cmd/.

Co-authored-by: Isaac
@arsenyinfo
Copy link
Copy Markdown
Contributor

Tab and newline characters in cell values corrupt text-mode output

  • Priority: P2
  • Location: experimental/postgres/cmd/execute.go:52, experimental/postgres/cmd/render.go:51
  • Scenario: Any query returning a text column whose value contains a tab, newline, or carriage return — e.g., SELECT E'hello\tworld'. executeOne stores the raw bytes as a string (row[i] = string(b)) with no sanitization; renderText then joins cells with a literal tab (strings.Join(row, "\t")) and writes into a tabwriter. An embedded tab is treated as an extra column boundary, shifting all subsequent columns; an embedded newline splits one logical row across multiple output lines.
  • Potential solution: Before passing cell contents to the tabwriter, escape control characters — replace literal \t with the two-character sequence \t, \n with \n, \r with \r — matching how psql renders them. Add a render test covering these inputs, as render_test.go:11-67 currently has no such coverage.

🔍 Reviewed by nitpicker

This PR only uses autoscaling targeting; provisioned helpers in
internal/target/provisioned.go have no caller in PR 1's net diff,
which the task deadcode check (run by CI's lint job, not lint-q)
correctly flags.

Provisioned support lands in PR 2; the necessary subset of helpers
(GetProvisioned, ProvisionedCredential) is added there alongside the
first caller.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:01 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:01 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 4, 2026 11:50 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 4, 2026 11:50 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 4, 2026 13:13 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 4, 2026 13:13 — with GitHub Actions Inactive
pgx.ParseConfig with an empty host falls back to a unix-socket path and
sets TLSConfig=nil. Patching Host after the parse leaves TLSConfig nil,
so the connection goes plaintext and Lakebase rejects the pgwire startup
("Invalid protocol version: 196608").

Build the DSN with the real host so pgx derives TLSConfig correctly, and
keep user/password/connect-timeout as field patches.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 09:09 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 09:09 — with GitHub Actions Inactive
The golangci-lint nosprintfhostport check flags fmt.Sprintf with %s:%d
for host:port in URLs. Switch to net.JoinHostPort.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 11:14 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 11:14 — with GitHub Actions Inactive
The previous "Connecting to ..." line went to stderr but stayed in the
terminal forever, even after results arrived. Use cmdio.NewSpinner so the
status disappears once the connection succeeds.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 12:37 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 5, 2026 12:37 — with GitHub Actions Inactive
@simonfaltum simonfaltum merged commit fadaebc into main May 5, 2026
27 of 28 checks passed
@simonfaltum simonfaltum deleted the simonfaltum/postgres-query-pr1-scaffold branch May 5, 2026 14:57
simonfaltum added a commit that referenced this pull request May 5, 2026
…g + types (#5136)

## PR Stack

1. [#5135](#5135) — PR 1: scaffold
+ autoscaling targeting + text output
2. **PR 2 (this PR)** —
[#5136](#5136) — provisioned +
JSON/CSV streaming + typed values + `experimental/libs/sqlcli` for
output handling
3. [#5138](#5138) — PR 3:
multi-input + multi-statement rejection + error formatting
4. [#5143](#5143) — PR 4:
cancellation + timeout + TUI

Stacked on PR 1.

## Why

Two things in this PR. The user-facing one: postgres query learns
JSON/CSV streaming and provisioned-instance support. The architectural
one: aitools query and postgres query had near-identical output-mode
handling (same env var, same flag/env precedence, same threshold).
Promote the duplication to a shared `experimental/libs/sqlcli` package
before the second consumer ossifies the divergence.

## Changes

**Architectural:** `experimental/libs/sqlcli/` is a new package under
`experimental/libs/` (not `libs/`) so it inherits the
experimental-stability guarantee of its consumers. Exposes:

- `sqlcli.EnvOutputFormat`, `sqlcli.StaticTableThreshold` constants.
- `sqlcli.Format` typedef + `sqlcli.OutputText/JSON/CSV` consts +
`sqlcli.AllFormats`.
- `sqlcli.ResolveFormat` — flag > env > default precedence with the
explicit-text-on-pipe-is-honoured rule.

aitools query migrates to use sqlcli (pure refactor, no behavior
change). postgres query was about to add its own copy of all of this;
instead it uses sqlcli from day one.

**User-facing changes for postgres query:**

- `--target my-instance` now resolves a provisioned instance.
- `--output json` streams typed values: numbers stay numeric, jsonb
stays nested, NaN/Inf/bigints-outside-2^53 become strings.
- `--output csv` streams (no buffering).
- `DATABRICKS_OUTPUT_FORMAT` honoured.
- Auto-fallback to JSON when stdout is piped.
- Duplicate column names get deterministic `__N` suffixes with a stderr
warning.

Also adds `cmdio.IsOutputTTY` (a small public wrapper around the
existing private `isTTY`) so commands can ask "is stdout a terminal?"
without folding in `NO_COLOR` / `TERM=dumb` (both of which
`cmdio.SupportsColor` ANDs in for the colour-rendering decision).

## Test plan

- [x] `go test ./experimental/aitools/... ./experimental/postgres/...
./experimental/libs/...` (unit, sinks, value mapping, format selection,
aitools tests still pass after migration)
- [x] `go tool ... golangci-lint run ./experimental/...` (0 issues)
simonfaltum added a commit that referenced this pull request May 5, 2026
…5138)

## PR Stack

1. [#5135](#5135) — PR 1: scaffold
+ autoscaling targeting + text output
2. [#5136](#5136) — PR 2:
provisioned + JSON/CSV streaming + types + `sqlcli.ResolveFormat`
3. **PR 3 (this PR)** —
[#5138](#5138) — multi-input +
multi-statement rejection + error formatting + `sqlcli.Collect`
4. [#5143](#5143) — PR 4:
cancellation + timeout + TUI

Stacked on PR 2.

## Why

PR 2 shipped a single-statement, single-input command. Real workflows
want multi-input (set-then-query, file-then-stdin), multi-statement
rejection with a friendly hint, and rich pg error formatting.

This PR also extends `experimental/libs/sqlcli` with input-collection
logic shared by aitools and postgres. Same architectural principle as PR
2: instead of postgres growing its own duplicate of aitools'
resolveSQLs, both commands now call `sqlcli.Collect`.

## Changes

**Architectural:** `experimental/libs/sqlcli/input.go` adds:

- `sqlcli.SQLFileExtension` const (.sql).
- `sqlcli.Input{SQL, Source}` type — Source is the human-readable origin
label ("--file PATH", "argv[N]", "stdin").
- `sqlcli.CollectOptions{Cleaner func(string) string}` — aitools passes
its `cleanSQL` (strips comments+quotes); postgres passes the default
`TrimSpace` because its multi-statement scanner needs comments
preserved.
- `sqlcli.Collect` — files-first then positionals, stdin only when
neither is present, .sql autodetect on positionals.

aitools' resolveSQLs collapses to a thin wrapper around sqlcli.Collect
(drops the SQL strings, ignores Source). The "SQL statement #N is empty
after removing comments" wording is replaced with sqlcli's `argv[N] is
empty`; aitools tests updated.

**User-facing changes for postgres query:**

- Variadic positionals + repeatable `--file` + stdin fallback.
- Multi-statement strings rejected up front with a hint (the
hand-written conservative scanner ignores `;` inside string literals,
identifiers, line/block comments, and dollar-quoted bodies; tag must be
a valid unquoted identifier so `$1` and `$foo-bar$` are correctly NOT
treated as tags).
- Multi-input output: per-unit blocks for text; canonical-shape JSON
array `{"source","sql","kind","elapsed_ms",...}` for json; csv rejected
pre-flight when N>1.
- Rich pg error formatting (`SEVERITY: message (SQLSTATE XXXXX)` with
DETAIL/HINT lines), applied on both single-input and multi-input paths.

Single-input keeps streaming. `runUnitBuffered` is a thin wrapper around
`executeOne` + a `bufferSink`, so the row-loop and error-wrapping logic
stays in one place.

## Test plan

- [x] `go test ./experimental/...` (multistatement scanner: 28 cases
including dollar-tag punctuation rejection, sqlcli.Collect: 12 cases
including a custom-cleaner test, error formatting, multi-input renderers
including byte-equal canonical-shape and first-unit-fails framing)
- [x] `go tool ... golangci-lint run ./experimental/...` (0 issues)
simonfaltum added a commit that referenced this pull request May 6, 2026
## PR Stack

1. [#5135](#5135) — PR 1: scaffold
+ autoscaling targeting + text output
2. [#5136](#5136) — PR 2:
provisioned + JSON/CSV streaming + types
3. [#5138](#5138) — PR 3:
multi-input + multi-statement rejection + error formatting
4. **PR 4 (this PR)** —
[#5143](#5143) — cancellation +
timeout + TUI for >30 rows

Stacked on PR 3.

## Why

PR 3 finished the input ergonomics. The remaining commitments before
this command earns the "experimental" label:

- A long SELECT shouldn't survive Ctrl+C. Today the pgx default closes
the TCP socket but leaves the server-side query running.
- CI scripts want `--timeout` so a single statement can't pin a runner.
- Interactive users with >30 rows benefit from a scrollable browser
instead of a wall of text.

## Changes

**Before:** Ctrl+C tears down TCP but the query runs to completion
server-side. `--timeout` doesn't exist. >30 rows scroll past in the
terminal.

**Now:** Ctrl+C cancels the in-flight query at the server. `--timeout
30s` enforces a per-statement deadline. >30 rows on a TTY open the
libs/tableview viewer.

Specifically:

- **Cancellation watcher.** `buildPgxConfig` now installs
`CancelRequestContextWatcherHandler` with `CancelRequestDelay=0,
DeadlineDelay=5s`. Zero `DeadlineDelay` would race the cancel-request
and could leave the connection unusable; 5s gives the cancel round-trip
time to land.
- **Signal handler.** Per-invocation goroutine watches SIGINT and
SIGTERM. Calls cancel; defer'd stop drains the channel.
- **--timeout flag.** Per-statement `context.WithTimeout` child of the
signal-scoped ctx. `reportCancellation` distinguishes Ctrl+C ("Query
cancelled."), timeout ("Query timed out after Xs."), and plain query
errors. Returns `(msg, invocationScoped)` so the multi-input loop can
drop the source prefix on user-cancel.
- **TUI for >30 rows.** `textSink` now has an `interactive` mode;
`runQuery` enables it when stdout is a prompt-capable TTY. Static
tabwriter table for small results and pipes; libs/tableview for big
interactive ones. If `tableview.Run` fails (TUI startup, terminal resize
race) the sink falls through to the static tabwriter path so the user
still sees the rows.

Integration tests aren't included: aitools (the other experimental
command) doesn't have them either. Acceptance + unit tests cover
argument validation, targeting resolution (SDK-mocked), and output
shapes; cancellation/timeout are unit-tested via the seam in
`cancel_test.go`. Real-wire integration tests are the right addition
when this command graduates from experimental.

## Test plan

- [x] `go test ./experimental/postgres/...` (cancel/timeout/signal
helpers, race-precedence pinning)
- [x] `go test ./acceptance -run
TestAccept/cmd/(psql|experimental/postgres)` (no regressions)
- [x] `go tool ... golangci-lint run ./experimental/postgres/...` (0
issues)
- [x] Manual: Ctrl+C during `SELECT pg_sleep(60)` cancels the
server-side query within ~5s. (Smoked on
`chatbot-lakebase-dev-simon-faltum` (e2-dogfood): SIGINT to `SELECT
pg_sleep(60)` exited the client in 0.53s with `Error: Query cancelled.`;
subsequent `pg_stat_activity` query returned zero rows.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants