
Local BigQuery

A local BigQuery implementation written in Python.

Uses SQLGlot to translate BigQuery SQL into DuckDB's dialect, and DuckDB to execute it.

Usage

Grab the container, run it, and hit it with a BigQuery client.

Docker

Start the container

docker run --init -d --rm -p 9050:9050 -v /tmp/local-bigquery/:/data --name bigquery ghcr.io/novucs/local-bigquery:0.2.3

Enter the REPL

docker exec -it bigquery repl

Delete all projects, datasets, and tables

docker exec -it bigquery reset

Stop the container

docker stop bigquery

Docker Compose

volumes:
  bigquery_data: {}
services:
  bigquery:
    image: ghcr.io/novucs/local-bigquery:0.2.3
    ports:
      - "9050:9050"
    environment:
      # Optional configuration, defaults are shown
      BIGQUERY_PORT: 9050
      BIGQUERY_HOST: 0.0.0.0
      DATA_DIR: /data
      DEFAULT_PROJECT_ID: local
      DEFAULT_DATASET_ID: local
      INTERNAL_PROJECT_ID: internal
      INTERNAL_DATASET_ID: internal
      # Optional support for EXTERNAL_QUERY against Postgres; requires a reachable Postgres instance.
      # SELECT * FROM EXTERNAL_QUERY('us.default', 'SELECT 1');
      POSTGRES_CONNECTION_ID: us.default
      POSTGRES_URI: postgresql://postgres:example@db:5432/postgres
    volumes:
      - bigquery_data:/data

BQ CLI

bq --api http://localhost:9050 query "SELECT 1"

Python

pip install google-cloud-bigquery

from google.cloud import bigquery
client = bigquery.Client(client_options={"api_endpoint": "http://localhost:9050"})
# ... your code here ...

SQLAlchemy

pip install sqlalchemy-bigquery

from google.cloud import bigquery
from sqlalchemy import create_engine
client = bigquery.Client(client_options={"api_endpoint": "http://localhost:9050"})
engine = create_engine("bigquery://project/dataset", connect_args={"client": client})
# ... your code here ...

Testcontainers

from google.cloud import bigquery
from testcontainers.core.container import DockerContainer

bigquery_image = "ghcr.io/novucs/local-bigquery:0.2.3"
with DockerContainer(bigquery_image).with_exposed_ports(9050) as container:
    host = container.get_container_host_ip()
    port = container.get_exposed_port(9050)
    client = bigquery.Client(client_options={"api_endpoint": f"http://{host}:{port}"})

Go

go get cloud.google.com/go/bigquery

package main

import (
    "context"

    "cloud.google.com/go/bigquery"
    "google.golang.org/api/option"
)

func main() {
    ctx := context.Background()
    client, err := bigquery.NewClient(ctx, "project", option.WithEndpoint("http://localhost:9050/bigquery/v2/"))
    if err != nil {
        panic(err)
    }
    defer client.Close()
    // ... your code here ...
}
