Skip to content

chore: remove dead useDecimal128 plumbing#4382

Open
andygrove wants to merge 2 commits into
apache:mainfrom
andygrove:decimal128-audit
Open

chore: remove dead useDecimal128 plumbing#4382
andygrove wants to merge 2 commits into
apache:mainfrom
andygrove:decimal128-audit

Conversation

@andygrove
Copy link
Copy Markdown
Member

Which issue does this PR close?

Closes #.

Rationale for this change

The internal spark.comet.use.decimal128 conf is dead plumbing:

  • The conf is write-only: three callers (CometExecRule, CometShuffleExchangeExec, CometNativeExec.doExecuteColumnar) force it to "true" right before native execution, but nothing reads it anywhere in the repo.
  • The useDecimal128 boolean threaded through the entire CometVector hierarchy is always true in production. Both NativeUtil.importVector and NativeUtil.rootAsBatch hardcode true, so the !useDecimal128 branch in CometVector.getDecimal() is unreachable.
  • The Rust SparkParquetOptions.use_decimal_128 field is declared, defaulted to false in both ::new constructors, and never set or read elsewhere.

This looks like legacy from the era when the JVM Parquet reader produced Decimal32/Decimal64 vectors. With the native scan path being the only path, every decimal column already lands in 128-bit form before reaching the JVM.

What changes are included in this PR?

  • Remove CometConf.COMET_USE_DECIMAL_128 and its three setConfString writers.
  • Remove the useDecimal128 field and constructor parameter from CometVector, CometDecodedVector, CometPlainVector, CometDictionaryVector, CometListVector, CometMapVector, CometStructVector, CometDelegateVector, CometSelectionVector.
  • Collapse CometVector.getDecimal() to the always-128-bit path.
  • Drop the true arg from both NativeUtil callers of CometVector.getVector and from the codegen template in CometBatchKernelCodegenInput.
  • Remove use_decimal_128 from SparkParquetOptions and clean up three stale comments in columnar_to_row.rs that referenced the conf.
  • Update TestColumnReader callers to the simplified constructor.

How are these changes tested?

Covered by existing tests:

  • ParquetReadV1Suite "decimals": passes (Standard + Legacy modes plus sibling decimal tests).
  • CometShuffleSuite "fix: StreamReader should read shuffled decimal columns as Decimal128": passes (renamed from the old useDecimal128 regression test, still pins NativeUtil.rootAsBatch against decimal-128 expectation).
  • The Seq(true, false).foreach { useDecimal128 => ... } sweep in ParquetReadSuite "decimals" was flattened since the conf no longer exists; the remaining batch-size and precision matrix still exercises the decimal read path.
  • make and cargo clippy --all-targets --workspace -- -D warnings both green.

andygrove added 2 commits May 21, 2026 09:28
The internal `spark.comet.use.decimal128` conf had three write-only
callers and zero readers. The `useDecimal128` boolean threaded through
the entire `CometVector` hierarchy was always `true` in production
(both `NativeUtil` call sites hardcoded `true`), so the `false` branch
of `CometVector.getDecimal()` was unreachable. The Rust
`SparkParquetOptions.use_decimal_128` field was declared, defaulted to
`false`, and never read.

Removed:
- `CometConf.COMET_USE_DECIMAL_128` and its three `setConfString` writers
- `useDecimal128` field/parameter across `CometVector` and all subclasses
- The dead branch in `CometVector.getDecimal()`
- `use_decimal_128` from `SparkParquetOptions`
- The `Seq(true, false).foreach` sweep in `ParquetReadSuite "decimals"`
# Conflicts:
#	spark/src/main/scala/org/apache/spark/sql/comet/operators.scala
Copy link
Copy Markdown
Contributor

@mbutrovich mbutrovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding the dead code, @andygrove!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants