build: bundle iceberg extension in worker images #645

Merged: benben merged 1 commit into main from ben/bundle-iceberg-extension on Jun 1, 2026

benben commented Jun 1, 2026

Problem

Iceberg-only tenant activation silently exceeds the ~60s activate-tenant deadline and fails (DeadlineExceeded), as observed on mw-dev (ben-{cnpg,ext,aur}-ice). Iceberg + DuckLake tenants are unaffected.

Pinpointed via #642's per-step instrumentation on a real worker:

worker pre-warmed (ducklake ext bundled, loaded in ~120ms)
step=count-catalogs   elapsed=189ns         ✓ (~1ms)
step=load-iceberg-extension elapsed=1.7ms   …
(no further step logs)
worker shutting down (60.7s after step start) — no DuckDB error

LoadExtensions(["iceberg"]) blocks for the entire deadline.

Why iceberg-only

The four DuckDB extensions duckgres relies on (httpfs, ducklake, json, postgres_scanner) are pre-seeded into the bundled extension cache by Dockerfile / Dockerfile.worker. iceberg is not — it gets INSTALLed on-demand at first attach, fetching from extensions.duckdb.org.

  • iceberg-only: LoadExtensions("iceberg") is the first network extension install of the worker process → silent cold install/download hangs past the activate deadline.
  • iceberg + DuckLake: LoadExtensions("delta") runs first (via AttachDeltaCatalog), priming DuckDB's extension subsystem; the subsequent LoadExtensions("iceberg") finishes within the deadline (verified live with the existing "both" combos — iceberg writes succeed end-to-end).

Worker→CDN reachability is fine (extensions.duckdb.org:443 OPEN from a born-as-worker pod) and worker→Lakekeeper is fine (#11444). The hang is inside DuckDB's INSTALL path.

Fix

Bundle iceberg at image build time, identical to the other core-repo extensions:

curl -fsSL "${DUCKDB_EXTENSION_REPOSITORY}/v${DUCKDB_EXTENSION_VERSION}/linux_${TARGETARCH}/iceberg.duckdb_extension.gz" \
  | gunzip > "/build/duckdb-extensions/.../iceberg.duckdb_extension"
…
for f in httpfs ducklake json postgres_scanner iceberg; do …  # size check

Applied to both Dockerfile and Dockerfile.worker. Subsequent LoadExtensions("iceberg") becomes a local cache hit — same path as the other bundled extensions — for every iceberg-using tenant (Lakekeeper + S3Tables backends alike).

Test

Once the new image is deployed to mw-dev I'll re-run ben-ext-ice and confirm:

  • step=load-iceberg-extension finishes in ms instead of timing out
  • Iceberg-only SELECT 1 + CREATE TABLE iceberg.public.<t> / INSERT / SELECT round-trip succeeds

🤖 Generated with Claude Code

Iceberg extension was downloaded on-demand at first use, unlike httpfs,
ducklake, json, and postgres_scanner which the Dockerfiles pre-seed into
the bundled extension cache. That on-demand INSTALL silently blocks the
iceberg-only tenant activation past the ~60s activate-tenant deadline
(observed on mw-dev with the per-step logging from #642: count-catalogs
completes in ~1ms, load-iceberg-extension never returns, worker is
retired at 60.7s with no DuckDB-level error). Iceberg+DuckLake tenants
don't hit it because LoadExtensions(delta) runs first and primes
DuckDB's extension subsystem.

Bundle iceberg the same way as the others: curl the .duckdb_extension.gz
from ${DUCKDB_EXTENSION_REPOSITORY} at build time, gunzip into
/build/duckdb-extensions/v${DUCKDB_EXTENSION_VERSION}/linux_${TARGETARCH}/,
and add it to the size-check loop. Applies to both the standalone
Dockerfile and Dockerfile.worker so worker pods get a local cache hit on
LoadExtensions("iceberg") instead of a CDN fetch.

Eliminates the iceberg-only activation timeout and brings the activation
cost of LoadExtensions("iceberg") in line with the other bundled
extensions for every iceberg-using tenant (lakekeeper + s3tables
backends alike).
benben requested a review from a team June 1, 2026 09:19
benben merged commit 2f2f38d into main Jun 1, 2026
21 of 22 checks passed
benben deleted the ben/bundle-iceberg-extension branch June 1, 2026 09:31
benben added a commit that referenced this pull request Jun 1, 2026
LoadExtensions skips INSTALL whenever a .duckdb_extension binary is sitting
in /app/extensions (preseeded from the Dockerfile). That works for the
PostHog-fork extensions (httpfs, ducklake) and the stable stock ones (json,
postgres_scanner): LOAD against the seeded extension_directory file just
works.

It does NOT work for iceberg. After #645 bundled iceberg, the bundle-skip
turned db.Exec("LOAD iceberg") into the same ~60s silent hang the bundle
was meant to fix: the bundled binary is in the cache, but without a prior
INSTALL DuckDB's internal extension metadata leaves LOAD blocked indefinitely
(observed live in mw-dev — worker logs the load-iceberg-extension step
start and then no further activity until the activate-tenant deadline kills
the worker).

Special-case iceberg in shouldInstallExtension so INSTALL runs even when
bundled. INSTALL with the binary already in extension_directory is a cheap
no-op — DuckDB sees the cached file and skips the CDN download — so the
bundle benefit is preserved while LOAD now finds the extension already
installed. The upstream-overwrite risk that motivates skipping INSTALL for
the PostHog forks doesn't apply: we bundle the same stock iceberg build the
DuckDB repository would serve.
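
A minimal sketch of the special case described in this commit message. The real signature of shouldInstallExtension is not shown here, so the `bundled` parameter, the `alwaysInstall` set, and the `loadStatements` helper are assumptions made for illustration:

```go
package main

// alwaysInstall lists extensions that need INSTALL even when a bundled
// binary exists. Per the commit message, only iceberg is special-cased.
var alwaysInstall = map[string]bool{"iceberg": true}

// shouldInstallExtension is a hypothetical reconstruction; the real
// signature in duckgres is not shown in this PR. bundled means a
// .duckdb_extension binary for name is already in the extension directory.
func shouldInstallExtension(name string, bundled bool) bool {
	if alwaysInstall[name] {
		return true // described above as a cheap no-op when the file is cached
	}
	return !bundled // bundled extensions skip INSTALL and go straight to LOAD
}

// loadStatements returns the SQL to run for one extension, in order.
func loadStatements(name string, bundled bool) []string {
	var stmts []string
	if shouldInstallExtension(name, bundled) {
		stmts = append(stmts, "INSTALL "+name)
	}
	return append(stmts, "LOAD "+name)
}
```

Under these assumptions, `loadStatements("iceberg", true)` yields `INSTALL iceberg` followed by `LOAD iceberg`, while `loadStatements("ducklake", true)` yields only `LOAD ducklake`, so the PostHog-fork binaries are never overwritten by an upstream INSTALL.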