Kafka Connect: Split route field path once instead of per record#16655

Open
wombatu-kun wants to merge 1 commit into apache:main from wombatu-kun:kafka-connect-presplit-route-field

@wombatu-kun
Contributor

When routing by a field (static routing with iceberg.tables.route-field, or dynamic routing), SinkWriter extracted the route value for every record via RecordUtils.extractFromRecordValue(value, routeField), which re-parsed the dotted field path with Splitter.on('.').splitToList(routeField) on each call. The route field is fixed for the connector's lifetime, so this re-parse is pure per-record overhead.

This splits the path once in the SinkWriter constructor and adds a RecordUtils.extractFromRecordValue(Object, List<String>) overload that takes the already-split path; the existing String overload now delegates to it, so other callers are unchanged. Behavior is identical.
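The shape of that change can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RouteFieldSketch`, `splitPath`, and `extract` are hypothetical names, the path is split with plain `String` operations rather than Guava's `Splitter`, and only nested `Map` values are handled (no Kafka Connect `Struct`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the extraction helper; handles nested Maps only.
final class RouteFieldSketch {
  private RouteFieldSketch() {}

  // Splits a dotted path such as "data.id.key" into its segments.
  static List<String> splitPath(String path) {
    List<String> segments = new ArrayList<>();
    int start = 0;
    int dot;
    while ((dot = path.indexOf('.', start)) >= 0) {
      segments.add(path.substring(start, dot));
      start = dot + 1;
    }
    segments.add(path.substring(start));
    return List.copyOf(segments);
  }

  // String overload: parses the path on every call, then delegates.
  static Object extract(Object value, String path) {
    return extract(value, splitPath(path));
  }

  // List overload: walks an already-split path, so no parsing per call.
  static Object extract(Object value, List<String> segments) {
    Object current = value;
    for (String segment : segments) {
      if (!(current instanceof Map)) {
        return null; // path goes deeper than the value does
      }
      current = ((Map<?, ?>) current).get(segment);
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, Object> record = Map.of("data", Map.of("id", Map.of("key", "orders")));
    // Split once, as a constructor would, then reuse for every record.
    List<String> routePath = splitPath("data.id.key");
    System.out.println(extract(record, routePath)); // prints "orders"
  }
}
```

Keeping the `String` overload as a thin delegate means callers that only have the raw path keep working, while a hot path can hold on to the split list.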

A throwaway A/B microbench over the whole extractFromRecordValue method (2M iterations x 9 trials, median; baseline = current String overload that splits per call, optimized = List overload with the path split once) showed:

| record value | route field | before | after | faster |
| --- | --- | --- | --- | --- |
| struct | key | 52.8 ns | 5.7 ns | 89% |
| struct | data.id.key | 162.2 ns | 32.0 ns | 80% |
| map | key | 54.3 ns | 6.1 ns | 89% |
| map | data.id.key | 144.3 ns | 21.6 ns | 85% |

That is roughly 47-48 ns saved per record for a single-segment route field and roughly 120-130 ns for a three-segment path. The routing path previously paid this cost once per record. The numbers are indicative wall-clock timings from a microbenchmark, not JMH results.
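For readers who want to reproduce numbers of this kind, a median-of-trials timing loop might look like the sketch below. The actual harness is not included in this PR, so everything here (`MedianBenchSketch`, `medianNanosPerOp`, the `sink` field) is an assumed shape. It has no warm-up phase or JIT controls and is no substitute for JMH.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical median-of-trials timer; not the harness used for the table above.
final class MedianBenchSketch {
  private MedianBenchSketch() {}

  // Results are written here so the JIT is less likely to drop the call entirely.
  static volatile Object sink;

  // Runs `trials` timed loops of `iterations` calls and returns the median ns per call.
  static double medianNanosPerOp(Supplier<Object> op, int iterations, int trials) {
    double[] perOp = new double[trials];
    for (int t = 0; t < trials; t++) {
      long start = System.nanoTime();
      for (int i = 0; i < iterations; i++) {
        sink = op.get();
      }
      perOp[t] = (System.nanoTime() - start) / (double) iterations;
    }
    return median(perOp);
  }

  // Median of a non-empty array; the input array is left unmodified.
  static double median(double[] values) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }
}
```

Calling `medianNanosPerOp(op, 2_000_000, 9)` once with the per-call split and once with the pre-split path would correspond to the "2M iterations x 9 trials, median" setup described above.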

Existing TestSinkWriter and TestRecordUtils cover both routing modes and the extraction overloads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>