GH-3401: Remove redundant MAP_KEY_VALUE annotation at middle level of logical type map#3402
GH-3401: Remove redundant MAP_KEY_VALUE annotation at middle level of logical type map#3402HuaHuaY wants to merge 1 commit intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR removes the redundant MAP_KEY_VALUE annotation from newly generated Parquet schemas to comply with the Parquet format specification. According to the spec, MAP_KEY_VALUE is a legacy annotation for backward compatibility with ConvertedType and should not be used in new schemas. The change affects schema generation while preserving backward compatibility for reading existing schemas.
Changes:
- Removed
MAP_KEY_VALUEannotation fromConversionPatterns.mapType()method - Updated documentation in Types.java and package-info.java to reflect the removal
- Updated test expectations across Thrift and Avro modules to match new schema format
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| parquet-column/src/main/java/org/apache/parquet/schema/ConversionPatterns.java | Core implementation: Removed MAP_KEY_VALUE annotation from mapType() method when creating the repeated key_value group |
| parquet-column/src/main/java/org/apache/parquet/schema/Types.java | Updated documentation comments to remove MAP_KEY_VALUE annotation from example schemas |
| parquet-avro/src/main/java/org/apache/parquet/avro/package-info.java | Updated package documentation to remove MAP_KEY_VALUE annotation from map type descriptions in conversion tables |
| parquet-thrift/src/test/java/org/apache/parquet/thrift/TestThriftSchemaConverter.java | Updated test expectations to verify schemas without MAP_KEY_VALUE annotation |
| parquet-thrift/src/test/java/org/apache/parquet/thrift/TestThriftSchemaConverterProjectUnion.java | Updated test expectations for union projection tests to reflect removal of MAP_KEY_VALUE |
| parquet-avro/src/test/java/org/apache/parquet/avro/TestAvroSchemaConverter.java | Updated Avro schema conversion test expectations to match new format |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
66e6fe1 to
5971e79
Compare
|
I think this change makes sense because the Parquet spec does not require the middle layer to use |
I don't have any concerns, @wgtmac. LGTM. |
|
Thanks @gszadovszky! I'll leave it open for a few days just in case. |
Fokko
left a comment
There was a problem hiding this comment.
I think removing this reduces visual noise, and I like it.
Regarding backward compatibility, could folks use it in their tests as we do here, as shown in this PR?
|
FYI, this PR only removed the MAP_KEY_VALUE annotation during writing file. And it didn't modify any code about reading file. So we can still be able to read the map structure which has the annotation at middle level, and there are some reserved corresponding parsing tests. |
Rationale for this change
Remove redundant
MAP_KEY_VALUEannotation which is not required in parquet's spec.What changes are included in this PR?
ConversionPatternswill not passLogicalTypeAnnotation.MapKeyValueTypeAnnotationtoGroupTypefunction.Are these changes tested?
Yes.
Are there any user-facing changes?
Yes. The
MAP_KEY_VALUEannotation at middle level of thrift structuremap<..., ...>will be removed.Closes #3401