Upgrade DataFusion to 54 #8044
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Merging this PR will improve performance by 23.7%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 304.4 ns | 246.1 ns | +23.7% |
Comparing adamg/df-54 (dfd1f68) with develop (19a1fb3)
Polar Signals Profiling Results
Latest Run
Powered by Polar Signals Cloud
Benchmarks: PolarSignals Profiling
Vortex (geomean): 0.922x ➖
datafusion / vortex-file-compressed (0.922x ➖, 5↑ 0↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.963x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.961x ➖, 1↑ 0↓)
datafusion / parquet (0.992x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.952x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.920x ➖, 2↑ 0↓)
duckdb / parquet (0.964x ➖, 0↑ 0↓)
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.983x ➖, 2↑ 1↓)
datafusion / vortex-compact (0.995x ➖, 1↑ 2↓)
datafusion / parquet (0.973x ➖, 4↑ 2↓)
datafusion / arrow (0.937x ➖, 6↑ 1↓)
duckdb / vortex-file-compressed (0.976x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.988x ➖, 0↑ 0↓)
duckdb / parquet (0.971x ➖, 4↑ 1↓)
duckdb / duckdb (0.987x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.998x ➖, 7↑ 6↓)
datafusion / vortex-compact (0.943x ➖, 21↑ 2↓)
datafusion / parquet (0.990x ➖, 10↑ 10↓)
duckdb / vortex-file-compressed (0.962x ➖, 21↑ 0↓)
duckdb / vortex-compact (0.999x ➖, 4↑ 0↓)
duckdb / parquet (1.007x ➖, 1↑ 1↓)
duckdb / duckdb (1.009x ➖, 1↑ 2↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.967x ➖, 0↑ 0↓)
duckdb / parquet (1.002x ➖, 0↑ 0↓)
File Sizes: Statistical and Population Genetics
No file size changes detected.
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.908x ➖, 1↑ 2↓)
datafusion / vortex-compact (0.980x ➖, 0↑ 0↓)
datafusion / parquet (0.991x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.034x ➖, 1↑ 1↓)
duckdb / vortex-compact (1.032x ➖, 0↑ 0↓)
duckdb / parquet (0.978x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.892x ✅, 17↑ 2↓)
datafusion / vortex-compact (0.949x ➖, 9↑ 3↓)
datafusion / parquet (0.971x ➖, 6↑ 2↓)
datafusion / arrow (0.904x ➖, 10↑ 2↓)
duckdb / vortex-file-compressed (0.959x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.992x ➖, 0↑ 3↓)
duckdb / parquet (0.999x ➖, 0↑ 0↓)
duckdb / duckdb (0.969x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.881x ✅, 18↑ 5↓)
datafusion / parquet (1.164x ❌, 6↑ 12↓)
duckdb / vortex-file-compressed (1.020x ➖, 0↑ 2↓)
duckdb / parquet (0.967x ➖, 2↑ 0↓)
duckdb / duckdb (1.014x ➖, 1↑ 2↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
🚨🚨🚨❌❌❌ SQL BENCHMARK FAILED ❌❌❌🚨🚨🚨
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.907x ➖, 1↑ 0↓)
datafusion / vortex-compact (1.036x ➖, 0↑ 2↓)
datafusion / parquet (0.965x ➖, 2↑ 3↓)
duckdb / vortex-file-compressed (0.947x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.975x ➖, 0↑ 0↓)
duckdb / parquet (0.952x ➖, 0↑ 0↓)
Summary
This PR upgrades our DataFusion dependency/integration to the upcoming 54 release. It aims to make the minimal set of changes; implementing the new Morselizer API will be part of a future PR (I have an old PR that was based on an earlier PoC, and I'll try to pull from it when the time comes).
54.0.0 (Apr 2026 / May 2026) apache/datafusion#21080