feat(script): add a bootstrap script for setting up Lakekeeper in the local development environment#4273
feat(script): add a bootstrap script for setting up Lakekeeper in the local development environment#4273mengw15 wants to merge 15 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Thanks for putting this together! Few suggestions:
1. Remove parse-storage-config.py
The Python script + pyhocon dependency adds significant friction (not everyone has pyhocon installed, and it's a niche package). This can be eliminated entirely because storage.conf already supports env var overrides via ${?STORAGE_*} syntax. The bootstrap script can read the same env vars with the same defaults:
LAKEKEEPER_BASE_URI="${STORAGE_ICEBERG_CATALOG_REST_URI:-http://localhost:8181/catalog}"
WAREHOUSE_NAME="${STORAGE_ICEBERG_CATALOG_REST_WAREHOUSE_NAME:-texera}"
S3_REGION="${STORAGE_ICEBERG_CATALOG_REST_REGION:-us-west-2}"
S3_BUCKET="${STORAGE_ICEBERG_CATALOG_REST_S3_BUCKET:-texera-iceberg}"
S3_ENDPOINT="${STORAGE_S3_ENDPOINT:-http://localhost:9000}"
S3_USERNAME="${STORAGE_S3_AUTH_USERNAME:-texera_minio}"
S3_PASSWORD="${STORAGE_S3_AUTH_PASSWORD:-password}"2. Remove awscli dependency for MinIO bucket operations
Requiring awscli just to create a bucket is heavy. Please use curl with S3 API directly (a simple PUT to http://endpoint/bucket-name)
3. Hardcoded binary path requires editing the script
LAKEKEEPER_BINARY_PATH="" at line 37 forces users to edit the script source. Consider:
- Accept it as a CLI argument or env var
- Default to
lakekeeper(assuming it's on$PATH) - Mention
brew install lakekeeperin the setup instructions (works on macOS)
4. Cross-platform considerations
Lakekeeper provides pre-built binaries for macOS ARM, Linux x86_64, and Linux ARM. Seems windows doesn't support it. Please mention it in the PR description
| echo "" | ||
|
|
||
| # Step 3: Check and create MinIO bucket | ||
| echo "Step 3: Checking MinIO bucket..." |
What changes were proposed in this PR?
Lakekeeper is a open-sourced, apache-licensed IcebergRESTCatalog implementation. This PR adds the Lakekeeper bootstrap script for developers to setup the Lakekeeper conveniently.
Lakekeeper provides pre-built binaries for macOS ARM64, Linux x86_64, and Linux ARM64
(releases). Windows is not supported by Lakekeeper upstream, so this script targets macOS and Linux developers only.
Any related issues, documentation, discussions?
Issue: close #4353 .
Documents: wiki page.
How was this PR tested?
Manually tested by launching Lakekeeper with this script and set it as the rest catalog for Texera.
Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude Code