Skip to content

fix: dfbench respects DATAFUSION_RUNTIME_MEMORY_LIMIT env var#20631

Merged
adriangb merged 1 commit intoapache:mainfrom
pydantic:fix-dfbench-memory-limit-env-var
Mar 15, 2026
Merged

fix: dfbench respects DATAFUSION_RUNTIME_MEMORY_LIMIT env var#20631
adriangb merged 1 commit intoapache:mainfrom
pydantic:fix-dfbench-memory-limit-env-var

Conversation

@adriangb
Copy link
Contributor

@adriangb adriangb commented Mar 1, 2026

Which issue does this PR close?

N/A — discovered while running benchmarks with bench.sh.

Rationale for this change

When running benchmarks via bench.sh / dfbench, setting DATAFUSION_RUNTIME_MEMORY_LIMIT=2G is ignored for memory pool enforcement. Most DATAFUSION_* env vars work because SessionConfig::from_env() picks them up, but the memory limit is a special case — it requires constructing a MemoryPool in the RuntimeEnv, which dfbench only did when --memory-limit was passed as a CLI flag.

What changes are included in this PR?

In runtime_env_builder(), when self.memory_limit (CLI flag) is None, fall back to reading the DATAFUSION_RUNTIME_MEMORY_LIMIT env var using the existing parse_memory_limit() function. The CLI flag still takes precedence when provided.

Are these changes tested?

Yes — added test_runtime_env_builder_reads_env_var which sets the env var, constructs a CommonOpt with no CLI memory limit, and verifies the resulting RuntimeEnv has a Finite(2GB) memory pool.

Are there any user-facing changes?

dfbench now honors the DATAFUSION_RUNTIME_MEMORY_LIMIT environment variable as a fallback when --memory-limit is not passed on the command line. No breaking changes — existing CLI flag behavior is unchanged.

🤖 Generated with Claude Code

@adriangb adriangb force-pushed the fix-dfbench-memory-limit-env-var branch from 0e734ab to 1c8ec9b Compare March 1, 2026 15:20
@adriangb
Copy link
Contributor Author

adriangb commented Mar 1, 2026

@alamb a small fix

let mut rt_builder = RuntimeEnvBuilder::new();
const NUM_TRACKED_CONSUMERS: usize = 5;
if let Some(memory_limit) = self.memory_limit {
// Use CLI --memory-limit if provided, otherwise fall back to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could use https://docs.rs/datafusion/latest/datafusion/prelude/struct.SessionConfig.html#method.from_env and pick up any environment variables, rather than just the memory limit?

Copy link
Contributor Author

@adriangb adriangb Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That only loads the SessionConfig options, the memory limit is part of the RuntimeEnv which comes into play via SessionContext (SessionConfig + RuntimeEnv).

So we'd have to do some refactoring to create SessionContext::from_env(), or add the memory limit to SessionConfig and add RuntimeEnv::new_from_config(&config) or something.

@alamb
Copy link
Contributor

alamb commented Mar 11, 2026

Looks like there is a small conflict with this PR

@adriangb
Copy link
Contributor Author

Looks like there is a small conflict with this PR

I'll fix 😄

When no `--memory-limit` CLI flag is provided, `runtime_env_builder()`
now falls back to the `DATAFUSION_RUNTIME_MEMORY_LIMIT` environment
variable. This makes the memory pool configuration consistent with how
other `DATAFUSION_*` env vars are handled via `SessionConfig::from_env()`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adriangb adriangb force-pushed the fix-dfbench-memory-limit-env-var branch from 1c8ec9b to 341bf3e Compare March 15, 2026 17:14
@adriangb
Copy link
Contributor Author

@alamb should we have a SessionContext::from_env that handles loading of both SessionConfig and and RuntimeEnv::from_env (new method we'd add) in one go?

@adriangb adriangb added this pull request to the merge queue Mar 15, 2026
Merged via the queue into apache:main with commit 1f59d32 Mar 15, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants