
[FLINK-39834] Fix Oracle UNISTR parsing with embedded concat operator #4426

Open: paulo-t wants to merge 1 commit into apache:master from paulo-t:fix/oracle-unistr-embedded-concat-20260603


paulo-t commented Jun 3, 2026

What is the purpose of the change

This change fixes Oracle CDC UNISTR decoding when the quoted UNISTR payload itself contains the character sequence ||.

Oracle LogMiner can emit NVARCHAR2 values as UNISTR(...). The current Debezium 1.9.8.Final UnistrHelper splits directly on ||, so a value like:

UNISTR('\\592A...4000||\\518D...||C440100VEH26071668')

can be split into invalid fragments. The fallback path then appends the original expression repeatedly, causing downstream values to contain duplicated UNISTR(...) text and potentially exceed sink column length limits.
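To make the failure mode concrete, here is a minimal sketch of what a plain split on || does to such an expression. It is not the Debezium source; the class name NaiveUnistrSplit and the sample value are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a naive split; not the actual UnistrHelper code.
final class NaiveUnistrSplit {

    // Splits on every "||", regardless of whether it sits inside a quoted literal.
    static List<String> naiveSplit(String expression) {
        return Arrays.asList(expression.split("\\|\\|"));
    }
}
```

For the single expression UNISTR('\0041||\0042'), this returns two fragments, UNISTR('\0041 and \0042'), neither of which is a complete UNISTR(...) call that a decoder could handle.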

Brief change log

  • Add a patched io.debezium.connector.oracle.logminer.UnistrHelper in the Oracle CDC connector to tokenize UNISTR expressions and split only on SQL concatenation operators outside quoted UNISTR data.
  • Add regression tests for normal UNISTR decoding, external UNISTR concatenation, and embedded || inside a UNISTR payload.
  • Exclude Debezium's original UnistrHelper.class from the shaded Oracle pipeline connector so the patched helper is packaged.
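The quote-aware splitting described in the first item can be sketched roughly as follows. This is an assumption-laden illustration, not the patched UnistrHelper from this PR: the class name UnistrSplitSketch and the method splitTopLevel are hypothetical, and the sketch only tracks single-quoted literals (treating a doubled '' as an escaped quote) without decoding the escape sequences.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of quote-aware splitting; not the code added by this PR.
final class UnistrSplitSketch {

    // Splits on "||" only when it appears outside a single-quoted literal.
    static List<String> splitTopLevel(String expression) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < expression.length(); i++) {
            char c = expression.charAt(i);
            boolean hasNext = i + 1 < expression.length();
            if (c == '\'') {
                if (inQuote && hasNext && expression.charAt(i + 1) == '\'') {
                    // A doubled quote inside a literal is an escaped quote; stay in the literal.
                    current.append("''");
                    i++;
                    continue;
                }
                inQuote = !inQuote;
                current.append(c);
            } else if (!inQuote && c == '|' && hasNext && expression.charAt(i + 1) == '|') {
                // Concatenation operator outside any literal: close the current operand.
                parts.add(current.toString().trim());
                current.setLength(0);
                i++;
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString().trim());
        return parts;
    }
}
```

With this approach, UNISTR('\0041||\0042') || UNISTR('\0043') yields two operands, and the || inside the first literal stays part of its payload.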

Verifying this change

This change adds tests and can be verified as follows:

mvn -pl flink-cdc-connect/flink-cdc-source-connectors/flink-connector-oracle-cdc \
  -DskipITs -DskipE2eTests -Dcheckstyle.skip -Dspotless.check.skip=true \
  -Dtest=io.debezium.connector.oracle.logminer.UnistrHelperTest test

Tests run: 4, Failures: 0, Errors: 0, Skipped: 0.

Apache Jira: https://issues.apache.org/jira/browse/FLINK-39834
