fix: handle missing minute-level data in get_higher_eq_freq_feature#2247
Open
Whning0513 wants to merge 1 commit into
Open
fix: handle missing minute-level data in get_higher_eq_freq_feature#2247Whning0513 wants to merge 1 commit into
Whning0513 wants to merge 1 commit into
Conversation
Fixes microsoft#1854 When running backtests with time_per_step='30min', the benchmark calculation in PortfolioMetrics calls get_higher_eq_freq_feature() with freq='30min'. The benchmark (CSI300) only has daily data, so D.features() raises ValueError/KeyError. The minute-level fallback branch hardcoded a fallback to '1min' without a try/except wrapper — if 1min data also doesn't exist (e.g., when only 30min daily data files are available), the ValueError propagates up with the misleading message 'can't find a freq from [Freq(30min)] that can resample to 1min!'. Fix: wrap the 1min fallback in the same try/except pattern used by the day/week/month branch, adding a 'day' fallback when 1min is unavailable. Also fix type inconsistency in Freq.get_recent_freq(): the first candidate was stored as str(_freq) while subsequent candidates stored the raw _freq (str|Freq), making the return type unpredictable.
|
@Whning0513 please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
Contributor License AgreementContribution License AgreementThis Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This was the most involved fix of the three I worked on. #1854 causes a backtest crash when
time_per_step='30min'is used — the error message saysValueError: can't find a freq from [Freq(30min)] that can resample to 1min!, which is confusing because the user never asked for 1min data.I traced the crash from the backtest engine down through the call chain.
PortfolioMetrics._cal_benchmark()callsD.features()with the 30min frequency. Since the CSI300 benchmark only has daily data, a ValueError is raised. The code then enters the minute-level fallback branch inget_higher_eq_freq_feature()(qlib/utils/resam.py), which hardcodes a retry withfreq='1min'— but this fallback is not wrapped in try/except, unlike the day-level fallback earlier in the same function. When 1min data also doesn't exist, the raw exception propagates up as the misleading error above.While reading the code, I also found a type inconsistency in
Freq.get_recent_freq()(qlib/utils/time.py). The method builds a list of candidate frequencies — the first entry was stored asstr(_freq)while subsequent entries kept the raw_freq(which could be eitherstrorFreq). This made the return type unpredictable for callers likeFileCalendarStorage._freq_file.The fix spans two files:
qlib/utils/resam.py— Added atry/except (ValueError, KeyError)around the1minfallback insideget_higher_eq_freq_feature(), matching the pattern already used by the day/week/month branch a few lines above. If 1min also fails, it falls through todayas the ultimate fallback.qlib/utils/time.py— Wrapped all candidates inFreq.get_recent_freq()consistently withFreq(_freq), so the return type is alwaysFreq | Noneregardless of which candidate wins.Would appreciate a review, thanks!