fix(activation): always INSTALL iceberg even when bundled #646

Merged: benben merged 1 commit into main from ben/iceberg-install-bundled, Jun 1, 2026

@benben commented Jun 1, 2026

Problem

After #645 bundled the iceberg extension into the worker image, iceberg-only tenant activation still hangs past the activate-tenant deadline. The bundle is correct — /data/extensions/v1.5.3/linux_arm64/iceberg.duckdb_extension is in place (the bootstrap seeded it from /app/extensions/...) — but db.Exec("LOAD iceberg") blocks for the full ~60s with no progress and no DuckDB-level log, until the CP cancels and the worker is retired.

Why: LoadExtensions skips INSTALL whenever hasBundledExtensionBinary returns true:

if shouldInstallExtension(name) {
    db.Exec("INSTALL " + installCmd) // skipped when the binary is bundled
}
db.Exec("LOAD " + name)

That works for httpfs, ducklake, json, and postgres_scanner: LOAD against the seeded extension_directory file succeeds without a prior INSTALL. It does not work for iceberg: without a prior INSTALL, DuckDB's internal extension metadata leaves LOAD iceberg blocked indefinitely, even with the binary already in the cache. So skipping INSTALL for the bundled binary reproduces the same silent stall that bundling was meant to fix.

Live evidence on mw-dev (ben-ext-ice, build 2f2f38d = #645):

step=count-catalogs       elapsed=172ns
step=load-iceberg-extension elapsed=1.5ms
… (no further activity for ~60s) …
worker shutting down

Fix

Special-case iceberg in shouldInstallExtension so INSTALL runs even when the binary is bundled. With the file already in extension_directory, INSTALL is a cheap no-op — DuckDB sees the cached file and skips the CDN download — so the bundle benefit is preserved (no first-load fetch from the CDN) while LOAD now finds the extension already installed.

The upstream-overwrite risk that motivates skipping INSTALL for the PostHog forks (httpfs, ducklake) doesn't apply: we bundle the same stock iceberg build the DuckDB repository would serve.
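
As a rough illustration of the special case described above, here is a minimal Go sketch. It is not the code from this PR. The `bundled` map is a hypothetical stand-in for the real on-disk check behind hasBundledExtensionBinary. The `installEvenWhenBundled` set and the `planStatements` helper are names invented for this example. It also uses the extension name directly in the INSTALL statement, where the real code uses a separate installCmd.

```go
package main

import "fmt"

// bundled is a hypothetical stand-in for checking whether a
// .duckdb_extension binary was preseeded into the image.
var bundled = map[string]bool{
	"httpfs":           true,
	"ducklake":         true,
	"json":             true,
	"postgres_scanner": true,
	"iceberg":          true,
}

// hasBundledExtensionBinary mimics the real helper with a map lookup (assumption).
func hasBundledExtensionBinary(name string) bool { return bundled[name] }

// installEvenWhenBundled lists extensions that need INSTALL before LOAD
// even though their binary is already bundled (hypothetical name).
var installEvenWhenBundled = map[string]bool{"iceberg": true}

// shouldInstallExtension reports whether INSTALL should run before LOAD.
func shouldInstallExtension(name string) bool {
	if installEvenWhenBundled[name] {
		return true
	}
	return !hasBundledExtensionBinary(name)
}

// planStatements returns the SQL statements that would be executed for one
// extension, in order (illustrative helper, not part of the PR).
func planStatements(name string) []string {
	var stmts []string
	if shouldInstallExtension(name) {
		stmts = append(stmts, "INSTALL "+name)
	}
	return append(stmts, "LOAD "+name)
}

func main() {
	for _, name := range []string{"httpfs", "iceberg"} {
		fmt.Println(name, planStatements(name))
	}
}
```

Under these assumptions, httpfs still gets only a LOAD, while iceberg gets INSTALL followed by LOAD. Keeping the exceptions in a small set means other bundled extensions keep the skip-INSTALL behavior.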

Test plan (mw-dev after deploy)

  • ben-ext-ice activates within the deadline.
  • Iceberg-only SELECT 1 succeeds; CREATE TABLE iceberg.public.<t> / INSERT / SELECT round-trips.
  • The "both" combos (already passing for iceberg + DuckLake) remain green.

🤖 Generated with Claude Code

LoadExtensions skips INSTALL whenever a .duckdb_extension binary is sitting
in /app/extensions (preseeded from the Dockerfile). That works for the
PostHog-fork extensions (httpfs, ducklake) and the stable stock ones (json,
postgres_scanner): LOAD against the seeded extension_directory file just
works.

It does NOT work for iceberg. After #645 bundled iceberg, the bundle-skip
turned db.Exec("LOAD iceberg") into the same ~60s silent hang the bundle
was meant to fix: the bundled binary is in the cache, but without a prior
INSTALL DuckDB's internal extension metadata leaves LOAD blocked indefinitely
(observed live in mw-dev — worker logs the load-iceberg-extension step
start and then no further activity until the activate-tenant deadline kills
the worker).

Special-case iceberg in shouldInstallExtension so INSTALL runs even when
bundled. INSTALL with the binary already in extension_directory is a cheap
no-op — DuckDB sees the cached file and skips the CDN download — so the
bundle benefit is preserved while LOAD now finds the extension already
installed. The upstream-overwrite risk that motivates skipping INSTALL for
the PostHog forks doesn't apply: we bundle the same stock iceberg build the
DuckDB repository would serve.

@benben merged commit e047272 into main Jun 1, 2026
38 of 54 checks passed
@benben deleted the ben/iceberg-install-bundled branch June 1, 2026 11:52