feat(utilities): Migrate schema providers to HoodieSchema #18893

Draft
Pavan-249 wants to merge 1 commit into apache:master from Pavan-249:feat/migrate-schema-providers-hoodieSchema

@Pavan-249

Describe the issue this Pull Request addresses

Part of #14281 (RFC-99 #14263)

Summary and Changelog

Migrate schema providers in hudi-utilities from Avro Schema to HoodieSchema.

  • FilebasedSchemaProvider: replace field/return types, rename overrides to getSourceHoodieSchema()/getTargetHoodieSchema()
  • HiveSchemaProvider: replace field/return types, rename overrides to getSourceHoodieSchema()/getTargetHoodieSchema()
  • SchemaRegistryProvider: replace Schema.Parser() with HoodieSchema.Parser()
  • SimpleSchemaProvider: replace field/return types, rename overrides to getSourceHoodieSchema()/getTargetHoodieSchema()

Note: this covers only the schema providers in hudi-utilities. Remaining hudi-sync files will follow in a separate PR.
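
The rename pattern listed above can be illustrated with a small, self-contained sketch. It does not use the real Hudi classes: `StandInHoodieSchema`, `StandInSchemaProvider`, and `StandInFileBasedProvider` are hypothetical stand-ins invented for this example, and only the method names getSourceHoodieSchema()/getTargetHoodieSchema() come from this PR. The fallback of the target schema to the source schema is an assumption about provider behavior, not something this PR states.

```java
import java.util.Objects;

public class SchemaProviderMigrationSketch {

    /** Hypothetical stand-in for HoodieSchema: it only wraps the schema JSON. */
    static final class StandInHoodieSchema {
        private final String json;

        private StandInHoodieSchema(String json) {
            this.json = json;
        }

        /** Stand-in for parsing: rejects null and performs no schema validation. */
        static StandInHoodieSchema parse(String json) {
            return new StandInHoodieSchema(
                Objects.requireNonNull(json, "schema JSON must not be null"));
        }

        String toJson() {
            return json;
        }
    }

    /** Hypothetical base class exposing the renamed accessors. */
    abstract static class StandInSchemaProvider {
        abstract StandInHoodieSchema getSourceHoodieSchema();

        // Assumption: the target schema defaults to the source schema.
        StandInHoodieSchema getTargetHoodieSchema() {
            return getSourceHoodieSchema();
        }
    }

    /** Hypothetical provider whose fields and return types use the new schema type. */
    static final class StandInFileBasedProvider extends StandInSchemaProvider {
        private final StandInHoodieSchema sourceSchema;
        private final StandInHoodieSchema targetSchema; // null when no target is configured

        StandInFileBasedProvider(String sourceJson, String targetJson) {
            this.sourceSchema = StandInHoodieSchema.parse(sourceJson);
            this.targetSchema =
                targetJson == null ? null : StandInHoodieSchema.parse(targetJson);
        }

        @Override
        StandInHoodieSchema getSourceHoodieSchema() {
            return sourceSchema;
        }

        @Override
        StandInHoodieSchema getTargetHoodieSchema() {
            // Use the explicit target if present, otherwise the base-class default.
            return targetSchema != null ? targetSchema : super.getTargetHoodieSchema();
        }
    }

    public static void main(String[] args) {
        StandInSchemaProvider provider =
            new StandInFileBasedProvider("{\"type\":\"string\"}", null);
        System.out.println("source: " + provider.getSourceHoodieSchema().toJson());
        System.out.println("target: " + provider.getTargetHoodieSchema().toJson());
    }
}
```

Callers in this sketch depend only on the two accessors, which is why swapping the field and return types can stay local to each provider.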

Impact

None. RFC-99 guarantees binary compatibility with Avro in this migration phase, so no on-disk or serialization formats change.

Risk Level

Medium. No tests have been added yet. The migration follows the established RFC-99 pattern, but schema provider behavior has not been verified locally. Opened as a draft for directional review before proceeding further.

Documentation Update

None

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

Migrate FilebasedSchemaProvider, HiveSchemaProvider, SchemaRegistryProvider,
and SimpleSchemaProvider from Avro Schema to HoodieSchema.

Part of apache#14281 (RFC-99 apache#14263)
@hudi-bot
Collaborator

hudi-bot commented Jun 1, 2026

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@codecov-commenter

Codecov Report

❌ Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.92%. Comparing base (a687786) to head (ba9fea7).
⚠️ Report is 37 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ache/hudi/utilities/schema/HiveSchemaProvider.java | 0.00% | 3 Missing ⚠️ |
| ...hudi/utilities/schema/FilebasedSchemaProvider.java | 66.66% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (a687786) and HEAD (ba9fea7).

HEAD has 31 uploads less than BASE

| Flag | BASE (a687786) | HEAD (ba9fea7) |
|---|---|---|
| spark-scala-tests | 12 | 0 |
| spark-java-tests | 18 | 0 |
| utilities | 1 | 0 |

Additional details and impacted files
@@              Coverage Diff              @@
##             master   #18893       +/-   ##
=============================================
- Coverage     68.25%   52.92%   -15.34%     
+ Complexity    29336    21534     -7802     
=============================================
  Files          2527     2451       -76     
  Lines        141858   132396     -9462     
  Branches      17627    15474     -2153     
=============================================
- Hits          96831    70074    -26757     
- Misses        37062    56851    +19789     
+ Partials       7965     5471     -2494     

| Flag | Coverage Δ |
|---|---|
| common-and-other-modules | 44.34% <50.00%> (-0.08%) ⬇️ |
| hadoop-mr-java-client | 44.93% <ø> (+0.01%) ⬆️ |
| spark-client-hadoop-common | 48.22% <ø> (-0.02%) ⬇️ |
| spark-java-tests | ? |
| spark-scala-tests | ? |
| utilities | ? |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| .../hudi/utilities/schema/SchemaRegistryProvider.java | 48.48% <100.00%> (ø) |
| ...he/hudi/utilities/schema/SimpleSchemaProvider.java | 100.00% <100.00%> (ø) |
| ...hudi/utilities/schema/FilebasedSchemaProvider.java | 70.73% <66.66%> (-4.88%) ⬇️ |
| ...ache/hudi/utilities/schema/HiveSchemaProvider.java | 0.00% <0.00%> (ø) |

... and 959 files with indirect coverage changes

