fix: reuse filtered instruments across field groups by he-yufeng · Pull Request #2243 · microsoft/qlib

he-yufeng · 2026-05-31T08:26:27Z

QlibDataLoader resolves filter_pipe inside load_group_df(). With grouped field configs, load() calls load_group_df() once per fields group, so the same market string runs through D.instruments(..., filter_pipe=...) repeatedly.

This change resolves string instruments once at the grouped-loader entry point and reuses the resolved instrument config for each group. Direct load_group_df() calls keep the old behavior, and explicit non-string instrument lists still warn that filter_pipe is ignored.
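The resolve-once idea can be sketched roughly as below. This is an illustration only, not the code in this PR: GroupedLoaderSketch, fake_resolve, and the _resolved flag are hypothetical names, fake_resolve stands in for D.instruments(..., filter_pipe=...), and the filter dict is just sample data.

```python
import warnings


class GroupedLoaderSketch:
    """Hypothetical stand-in for a grouped loader; not Qlib's QlibDataLoader."""

    def __init__(self, fields, resolve, filter_pipe=None):
        self.fields = fields            # {group name: [field expressions]}
        self.resolve = resolve          # stand-in for D.instruments(...)
        self.filter_pipe = filter_pipe

    def _resolve(self, instruments):
        # Only market-name strings go through the (possibly expensive) resolver.
        if isinstance(instruments, str):
            return self.resolve(instruments, filter_pipe=self.filter_pipe)
        if self.filter_pipe is not None:
            warnings.warn("instruments is not a market name; filter_pipe is ignored")
        return instruments

    def load_group_df(self, instruments, group, _resolved=False):
        # Direct callers keep the old behavior: resolve on every call.
        if not _resolved:
            instruments = self._resolve(instruments)
        return {"instruments": instruments, "fields": self.fields[group]}

    def load(self, instruments):
        # Grouped entry point: resolve once, then reuse for every group.
        resolved = self._resolve(instruments)
        return {g: self.load_group_df(resolved, g, _resolved=True) for g in self.fields}


calls = []


def fake_resolve(market, filter_pipe=None):
    calls.append(market)  # record each resolver invocation
    return {"market": market, "filter_pipe": list(filter_pipe or [])}


loader = GroupedLoaderSketch(
    fields={"feature": ["$close"], "label": ["Ref($close, -2) / Ref($close, -1) - 1"]},
    resolve=fake_resolve,
    filter_pipe=[{"filter_type": "ExpressionDFilter", "rule_expression": "$close > 1"}],
)
grouped = loader.load("csi300")
```

With two field groups, the resolver runs once in load(), while a direct load_group_df("csi300", "feature") call would still resolve on its own.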

To verify

python -m pytest tests\data_mid_layer_tests\test_dataloader.py -q -k grouped_loader
python -m py_compile qlib\data\dataset\loader.py tests\data_mid_layer_tests\test_dataloader.py
git diff --check

he-yufeng · 2026-06-04T21:03:29Z

Updated this with an Alpha158 handler-level regression test as well. The new test constructs Alpha158(..., filter_pipe=..., infer_processors=[], learn_processors=[]) and checks that D.instruments(..., filter_pipe=...) is still called once while feature and label groups are loaded separately.
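The shape of such a call-count assertion can be sketched with unittest.mock. This is not the test added in this PR: FakeProvider, D, and load_feature_and_label are hypothetical stand-ins so the snippet runs without Qlib or market data, and the filter dict is sample data.

```python
from unittest import mock


class FakeProvider:
    """Hypothetical stand-in for the provider object that exposes instruments()."""

    def instruments(self, market, filter_pipe=None):
        return {"market": market, "filter_pipe": list(filter_pipe or [])}


D = FakeProvider()


def load_feature_and_label(market, filter_pipe):
    # Resolve the instrument config once, then hand it to both groups.
    config = D.instruments(market, filter_pipe=filter_pipe)
    return {"feature": config, "label": config}


def check_filter_pipe_applied_once():
    pipe = [{"filter_type": "NameDFilter", "name_rule_re": "SH.*"}]
    # wraps= keeps the real behavior while recording every call.
    with mock.patch.object(D, "instruments", wraps=D.instruments) as spy:
        loaded = load_feature_and_label("csi300", pipe)
    # Fails if the resolver ran zero times or more than once.
    spy.assert_called_once_with("csi300", filter_pipe=pipe)
    return loaded


loaded = check_filter_pipe_applied_once()
```

Wrapping rather than replacing the resolver lets the same test also check that both groups were still loaded.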

Local checks:

python -m pytest tests/data_mid_layer_tests/test_dataloader.py::TestDataLoader::test_grouped_loader_applies_filter_pipe_once tests/data_mid_layer_tests/test_dataloader.py::TestDataLoader::test_alpha158_handler_applies_filter_pipe_once -q -> 2 passed
python -m py_compile qlib\data\dataset\loader.py tests\data_mid_layer_tests\test_dataloader.py
git diff --check

I also ran the whole tests/data_mid_layer_tests/test_dataloader.py file after installing the missing mlflow dependency. The two filter-pipe tests pass, but the pre-existing test_nested_data_loader still needs local Qlib market data under ~/.qlib/qlib_data/cn_data and fails before reaching this change.

he-yufeng · 2026-06-05T08:13:49Z

Thanks for the Alpha158 repro detail. The first version only moved D.instruments(...) out of the per-group loop, but D.features(...) still resolves stockpool configs through list_instruments(...) for each group, so feature and label could still apply the dynamic filter twice.

I pushed a follow-up commit that resolves a stockpool config with filter_pipe to the filtered instrument dict once per grouped loader/frequency, then reuses that dict for each group. The regression tests now assert that D.list_instruments(...) is called once and both feature/label groups receive the resolved instruments.
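A minimal sketch of the prefiltering step described above, under stated assumptions: load_groups_prefiltered, fake_list_instruments, and fake_features are hypothetical names, the two fakes stand in for D.list_instruments(...) and D.features(...), and their signatures are simplified for illustration rather than copied from Qlib.

```python
def load_groups_prefiltered(config, groups, start, end, freq, list_instruments, features):
    """Resolve a filtered stockpool config once, then reuse it for every group."""
    instruments = config
    if isinstance(config, dict) and config.get("filter_pipe"):
        # One filter pass per grouped load and frequency.
        instruments = list_instruments(config, start, end, freq)
    # Each group receives the already-resolved instrument dict.
    return {
        name: features(instruments, fields, start, end, freq)
        for name, fields in groups.items()
    }


list_calls = []
received = []


def fake_list_instruments(config, start, end, freq):
    list_calls.append(freq)
    # Resolved form: instrument code -> list of (start, end) spans.
    return {"SH600000": [(start, end)]}


def fake_features(instruments, fields, start, end, freq):
    received.append(instruments)
    return {"instruments": sorted(instruments), "fields": fields}


stockpool = {"market": "csi300", "filter_pipe": [{"filter_type": "ExpressionDFilter"}]}
result = load_groups_prefiltered(
    stockpool,
    {"feature": ["$close"], "label": ["Ref($close, -1)"]},
    "2020-01-01",
    "2020-12-31",
    "day",
    fake_list_instruments,
    fake_features,
)
```

Because the feature and label groups both receive a plain instrument dict, the per-group call has no stockpool config left to re-filter.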

Local checks:

python -m pytest tests/data_mid_layer_tests/test_dataloader.py::TestDataLoader::test_grouped_loader_applies_filter_pipe_once tests/data_mid_layer_tests/test_dataloader.py::TestDataLoader::test_alpha158_handler_applies_filter_pipe_once -q -> 2 passed
python -m py_compile qlib\data\dataset\loader.py tests\data_mid_layer_tests\test_dataloader.py
git diff --check

I also reran the full tests/data_mid_layer_tests/test_dataloader.py; the two relevant tests pass, while the pre-existing test_nested_data_loader still fails locally because this Windows machine does not have usable day-frequency Qlib market data under C:\Users\He\.qlib\qlib_data\cn_data.

he-yufeng mentioned this pull request Jun 1, 2026

fix: cache filtered instruments during grouped loads #2237

Closed

5 tasks

fix: reuse filtered instruments across field groups

6fc21ee

he-yufeng force-pushed the fix/filter-pipe-once branch from 367138e to 6fc21ee on June 4, 2026 21:03

fix: prefilter grouped instruments once

04f72c0

he-yufeng mentioned this pull request Jun 5, 2026

[SeriesDFilter] Redundant execution of filter_pipe for each fields_group causes slow data loading #2236

Open

he-yufeng wants to merge 2 commits into microsoft:main from he-yufeng:fix/filter-pipe-once