From 98385da5b63598f66970470c9fa614220527a3da Mon Sep 17 00:00:00 2001
From: Benjamin Knofe-Vider
Date: Mon, 1 Jun 2026 12:44:08 +0200
Subject: [PATCH] fix(activation): always INSTALL iceberg even when bundled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

LoadExtensions skips INSTALL whenever a .duckdb_extension binary is
sitting in /app/extensions (preseeded from the Dockerfile). That works
for the PostHog-fork extensions (httpfs, ducklake) and the stable stock
ones (json, postgres_scanner): LOAD against the seeded
extension_directory file just works.

It does NOT work for iceberg. After #645 bundled iceberg, the
bundle-skip turned db.Exec("LOAD iceberg") into the same ~60s silent
hang the bundle was meant to fix: the bundled binary is in the cache,
but without a prior INSTALL, DuckDB's internal extension metadata leaves
LOAD blocked indefinitely (observed live in mw-dev: worker logs the
load-iceberg-extension step start and then no further activity until
the activate-tenant deadline kills the worker).

Special-case iceberg in shouldInstallExtension so INSTALL runs even when
bundled. INSTALL with the binary already in extension_directory is a
cheap no-op (DuckDB sees the cached file and skips the CDN download), so
the bundle benefit is preserved while LOAD now finds the extension
already installed. The upstream-overwrite risk that motivates skipping
INSTALL for the PostHog forks doesn't apply: we bundle the same stock
iceberg build the DuckDB repository would serve.
---
 server/server.go | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/server/server.go b/server/server.go
index 3dcbf78e..b610ade5 100644
--- a/server/server.go
+++ b/server/server.go
@@ -1279,6 +1279,30 @@ func LoadExtensions(db *sql.DB, extensions []string) error {
 }
 
 func shouldInstallExtension(name string) bool {
+	// The bundle-skip optimization assumes a bundled .duckdb_extension on disk
+	// is enough: LOAD finds the file via extension_directory and DuckDB treats
+	// it as installed. That holds for the PostHog-fork extensions (httpfs,
+	// ducklake) and the stable ones we ship (json, postgres_scanner): LOAD
+	// against the seeded extension_directory file just works.
+	//
+	// Iceberg behaves differently. With the bundled binary in place but no
+	// prior INSTALL, db.Exec("LOAD iceberg") blocks indefinitely. Observed on
+	// a worker in mw-dev: warm-up → activation hangs at the LOAD, past the
+	// activate-tenant deadline, with no progress and no Go log, while the
+	// bundled binary sits in the cache the whole time. The bundle (#645) was
+	// added specifically to avoid the CDN INSTALL, yet leaving LOAD to figure
+	// things out on its own produces the same silent stall the bundle was
+	// meant to fix.
+	//
+	// Running INSTALL when the binary is already in extension_directory is a
+	// cheap no-op (DuckDB sees the cached file and skips the CDN download), so
+	// special-casing iceberg keeps the bundle benefit (no first-load fetch
+	// from the CDN) and unblocks LOAD. The upstream-overwrite risk that
+	// motivates skipping INSTALL for the PostHog forks doesn't apply here: we
+	// bundle the same stock iceberg build the repository would serve.
+	if name == "iceberg" {
+		return true
+	}
 	return !hasBundledExtensionBinary(name)
 }
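
Reviewer note: the sketch below illustrates the resulting INSTALL/LOAD ordering without needing DuckDB. It is not code from server/server.go. The `bundled` map and `planStatements` are hypothetical stand-ins (the real hasBundledExtensionBinary and the body of LoadExtensions are not shown in this patch), and `spatial` is used only as an example of an extension assumed not to be bundled. Only shouldInstallExtension mirrors the change above.

```go
package main

import "fmt"

// bundled is a hypothetical stand-in for the .duckdb_extension files
// preseeded in /app/extensions. The real check lives in
// hasBundledExtensionBinary, which this patch does not show.
var bundled = map[string]bool{
	"httpfs":           true,
	"ducklake":         true,
	"json":             true,
	"postgres_scanner": true,
	"iceberg":          true,
}

// hasBundledExtensionBinary is stubbed here for illustration only.
func hasBundledExtensionBinary(name string) bool {
	return bundled[name]
}

// shouldInstallExtension mirrors the patched logic: iceberg is always
// installed, everything else only when no bundled binary exists.
func shouldInstallExtension(name string) bool {
	if name == "iceberg" {
		return true
	}
	return !hasBundledExtensionBinary(name)
}

// planStatements is a hypothetical helper that lists the SQL statements
// an activation would issue, in order, for the given extensions.
func planStatements(extensions []string) []string {
	var stmts []string
	for _, name := range extensions {
		if shouldInstallExtension(name) {
			stmts = append(stmts, "INSTALL "+name)
		}
		stmts = append(stmts, "LOAD "+name)
	}
	return stmts
}

func main() {
	for _, s := range planStatements([]string{"httpfs", "iceberg", "spatial"}) {
		fmt.Println(s)
	}
}
```

Under these assumptions, httpfs gets only a LOAD, while iceberg and the unbundled extension each get an INSTALL immediately before their LOAD.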