feat(connectors): add SurrealDB sink connector#3453
Conversation
SurrealDB is a document database target for Iggy connector users, so the sink writes batches with deterministic record ids and bulk INSERT IGNORE to keep runtime redelivery idempotent without per-message round trips.

Constraint: User explicitly requested the latest SurrealDB Rust SDK and chose to keep it despite BUSL-1.1 license-validation warnings for SurrealDB crates.
Constraint: Local Docker daemon was unavailable, so real-container integration execution could not run here.
Rejected: Per-message SDK writes | too many round trips and weaker batching throughput.
Rejected: Using the testcontainers SurrealDB module | module source hardcodes an older SurrealDB image.
Confidence: medium
Scope-risk: moderate
Directive: Keep record ids deterministic across releases; changing build_record_id breaks replay idempotency.
Tested: cargo fmt --all; cargo sort --no-format --workspace; cargo clippy --all-features --all-targets -- -D warnings; cargo check --all --all-features; cargo test -p iggy_connector_surrealdb_sink; cargo test -p integration --no-run connectors::surrealdb; cargo test --locked --doc; cargo doc --no-deps --all-features --quiet; taplo/license/shellcheck/version/diff/binary checks; prek install
Not-tested: Docker-backed SurrealDB integration execution, because Docker daemon was not running locally.
7b8305a to 48c3a9d
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #3453 +/- ##
=============================================
- Coverage 74.26% 50.02% -24.24%
Complexity 937 937
=============================================
Files 1259 1258 -1
Lines 125969 110755 -15214
Branches 101643 86474 -15169
=============================================
- Hits 93551 55410 -38141
- Misses 29403 52546 +23143
+ Partials 3015 2799 -216
|
|
/author
|
Please check the pre-checks failure
|
Sure
HawkEye maps Rust files to the double-slash license style, so the block comments in the new SurrealDB connector files were treated as missing headers by CI.

Constraint: CI runs the updated HawkEye-based license check with strict header matching.
Confidence: high
Scope-risk: narrow
Tested: ./scripts/ci/license-headers.sh --check (run with an explicit local PATH override); cargo fmt --all --check; git diff --check
The Rust pre-merge machete job reported that the SurrealDB sink crate declared toml without using it. Removing the dev-dependency is simpler than adding an ignore entry.

Constraint: CI runs cargo machete --with-metadata and fails on unused dependencies.
Rejected: Add cargo-machete metadata ignore | the dependency is genuinely unused.
Confidence: high
Scope-risk: narrow
Tested: cargo sort --no-format --workspace; cargo test -p iggy_connector_surrealdb_sink; ./scripts/ci/license-headers.sh --check (run with an explicit local PATH override); cargo fmt --all --check; git diff --check; cargo metadata confirms toml is absent from iggy_connector_surrealdb_sink
|
/ready
|
/author - could you please check the pre-checks failures
|
/ready
iggy_connector_sdk = { workspace = true }
secrecy = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
surrealdb = { workspace = true } pulls in surrealdb-core as a transitive dependency. surrealdb-core is licensed under the Business Source License 1.1 (BUSL-1.1), which is not OSI-approved and is incompatible with Apache 2.0 redistribution. The PR author explicitly flags this: ./scripts/ci/third-party-licenses.sh --validate --manifest core/connectors/sinks/surrealdb_sink/Cargo.toml reports BUSL-1.1 failures. The workspace-level CI passes only because that step runs against core/server/Cargo.toml and core/cli/Cargo.toml, which do not include this crate. The crates.io page for surrealdb v3.1.4 lists the license as "non-standard", and the README confirms surrealdb-core is BUSL-1.1. This cannot be distributed in an Apache Software Foundation project.
|
|
||
Ok(())
}
|
|
process_messages — every insertion failure is logged and swallowed, function always returns Ok(()):
if let Some(error) = outcome.error {
self.insertion_errors.fetch_add(batch.len() as u64, Ordering::Relaxed);
error!("Failed to insert SurrealDB batch ...: {error}");
}
// no early return, no Err propagation
Ok(())
consume() at L258 propagates this directly. The runtime receives Ok(()) and advances the consumer offset, permanently losing all messages from failed batches with no possibility of redelivery. Other connectors in the repo (Meilisearch, S3, Elasticsearch) return Err from consume() to trigger runtime retry.
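The propagation suggested above could look roughly like the following. This is a minimal sketch, not the connector's actual code: `SinkError`, `BatchOutcome`, `Sink`, and `finish_batch` are hypothetical stand-ins for types that are not shown in this thread, and the real fix would return whatever error type the connector SDK expects from consume().

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical stand-ins; the real connector types are not shown in this thread.
#[derive(Debug, PartialEq)]
struct SinkError(String);

struct BatchOutcome {
    error: Option<String>,
}

struct Sink {
    insertion_errors: AtomicU64,
}

impl Sink {
    /// Records the failure metric, then returns Err instead of swallowing it,
    /// so the caller can surface the failure to the runtime.
    fn finish_batch(&self, batch_len: usize, outcome: BatchOutcome) -> Result<(), SinkError> {
        if let Some(error) = outcome.error {
            self.insertion_errors
                .fetch_add(batch_len as u64, Ordering::Relaxed);
            return Err(SinkError(format!(
                "failed to insert SurrealDB batch: {error}"
            )));
        }
        Ok(())
    }
}
```

With this shape the metric is still incremented, but a failed batch reaches the caller as an Err rather than as Ok(()).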
Self { stream, topic }
}
}
|
|
build_record_id — the deterministic ID uses message_id (u128) not the message offset:
id.push_str("_m");
let _ = write!(&mut id, "{message_id:032x}");
INSERT IGNORE INTO {table} $records silently deduplicates on matching IDs. If a producer sends messages with id = 0 (the default when IggyMessage::builder().build() is used without setting .id()), every such message in the same stream/topic/partition maps to the same record ID. Only the first insert succeeds; all subsequent messages with id = 0 are silently dropped by SurrealDB with no error returned to the connector. The offset is written as a field but is not part of the record ID. Replace or augment with message.offset, which is unique per partition.
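One way to fold the offset into the ID is sketched below. The function name `record_id_with_offset`, its parameters, and the separator layout are illustrative assumptions; only the `_m` suffix and the `{message_id:032x}` formatting come from the snippet quoted above.

```rust
use std::fmt::Write;

/// Hypothetical variant of build_record_id that keys on partition and offset.
/// The field order and separators are assumptions made for illustration.
fn record_id_with_offset(
    stream: &str,
    topic: &str,
    partition_id: u32,
    offset: u64,
    message_id: u128,
) -> String {
    let mut id = String::new();
    // Writing into a String cannot fail, so the results are ignored.
    let _ = write!(&mut id, "{stream}_{topic}_p{partition_id}_o{offset:016x}");
    id.push_str("_m");
    let _ = write!(&mut id, "{message_id:032x}");
    id
}
```

Two messages that both carry id = 0 then map to different record IDs as long as their offsets differ, while a redelivered message (same partition and offset) still maps to the same ID, so INSERT IGNORE keeps replay idempotent.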
}
.with_config_defaults()
}
|
|
with_config_defaults — max_retries clamped to .max(1):
self.max_retries = self.config.max_retries.unwrap_or(DEFAULT_MAX_RETRIES).max(1);
Setting max_retries = 0 is silently raised to 1. Document the minimum-1 behavior or reject 0 with an error. Severity:
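If rejecting 0 is the preferred option, the check could be sketched as follows. The helper name `resolve_max_retries`, the `u32` type, the `String` error, and the default value of 3 are all assumptions; the real constant and config types are not shown in this thread.

```rust
// Assumed value for illustration; the real default is not shown in this thread.
const DEFAULT_MAX_RETRIES: u32 = 3;

/// Resolves the effective retry count, rejecting 0 instead of silently raising it to 1.
fn resolve_max_retries(configured: Option<u32>) -> Result<u32, String> {
    match configured {
        Some(0) => Err("max_retries must be at least 1".to_string()),
        Some(value) => Ok(value),
        None => Ok(DEFAULT_MAX_RETRIES),
    }
}
```

This surfaces the misconfiguration at startup rather than changing the value behind the user's back.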
|
|
||
Ok(client)
}
|
|
wait_until_ready — polls by attempting a full create_client() (WebSocket connect + sign in + use_ns/use_db + health check) on every attempt:
for _ in 0..SURREALDB_BOOT_ATTEMPTS {
if let Ok(client) = self.create_client().await
&& client.health().await.is_ok()
{
return Ok(());
}
sleep(Duration::from_millis(SURREALDB_BOOT_INTERVAL_MS)).await;
}
With SURREALDB_BOOT_ATTEMPTS = 120 and SURREALDB_BOOT_INTERVAL_MS = 250, the max wait is 30s. Connection errors during the boot window are swallowed via if let Ok. This is acceptable in test code, but swallowing all errors including non-transient auth failures means a misconfigured test fixture fails with "SurrealDB did not become ready" rather than the actual error.
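A polling loop that keeps the last error could look roughly like this. It is a synchronous simplification with a generic `probe` closure standing in for create_client() plus the health check; the fixture in the PR is async, and the signature here is an assumption.

```rust
use std::fmt::Display;
use std::thread::sleep;
use std::time::Duration;

/// Polls `probe` up to `attempts` times and includes the last error in the
/// timeout message. Synchronous sketch; the fixture in the PR is async.
fn wait_until_ready<E: Display>(
    attempts: u32,
    interval: Duration,
    mut probe: impl FnMut() -> Result<(), E>,
) -> Result<(), String> {
    let mut last_error: Option<E> = None;
    for _ in 0..attempts {
        match probe() {
            Ok(()) => return Ok(()),
            // Remember the failure instead of discarding it.
            Err(error) => last_error = Some(error),
        }
        sleep(interval);
    }
    Err(match last_error {
        Some(error) => format!("SurrealDB did not become ready: {error}"),
        None => "SurrealDB did not become ready: no attempts were made".to_string(),
    })
}
```

A fixture with wrong credentials would then time out with a message that ends in the authentication error rather than only the generic readiness text.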
|
@countradooku - Please check the Cargo.toml comment on licensing and Apache redistribution, and compare against other packages that are distributed under the Apache license. If this stays unresolved, it could be a showstopper. Otherwise this looks good, with mostly nits.
|
/author
Summary
Adds a SurrealDB sink connector for writing Iggy messages into SurrealDB using the latest SurrealDB Rust SDK version available during implementation, 3.1.4.

The connector supports deterministic record IDs, bulk INSERT IGNORE writes for idempotent replay, configurable batch sizing, root/namespace/database/no-auth modes, optional table and offset-index definition, payload modes (auto, json, text, base64), metadata/header/checksum/origin timestamp fields, retry/backoff handling, and runtime metrics logging.

This also wires the connector into workspace membership, connector docs, example runtime config, binary artifact builds, edge-release output, version bump scripts, and Docker-backed integration test scaffolding.
Tests
cargo fmt --all
cargo sort --no-format --workspace
cargo clippy --all-features --all-targets -- -D warnings
cargo check --all --all-features
cargo test -p iggy_connector_surrealdb_sink
cargo test -p integration --no-run connectors::surrealdb
cargo test --locked --doc
cargo doc --no-deps --all-features --quiet
./scripts/ci/taplo.sh
./scripts/ci/license-headers.sh
./scripts/ci/shellcheck.sh
./scripts/ci/binary-artifacts.sh --check
./scripts/extract-version.sh --check
git diff --check
prek install
Known Review Items
./scripts/ci/third-party-licenses.sh --validate --manifest core/connectors/sinks/surrealdb_sink/Cargo.toml reports BUSL-1.1 license failures from SurrealDB SDK crates. This PR intentionally keeps the SDK dependency because the connector targets the latest SurrealDB SDK.