Harden QuickBook format#125
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughFixes a backslash-escape EOF boundary bug in the QuickBook parser. Adds Hypothesis property-based fuzz testing to validate parser safety, segment invariants, and translation round-trips. Introduces pytest-benchmark performance testing with synthetic document generation, peak memory profiling, and CI automation including baseline artifact tracking and local comparison support. Configures both testing frameworks as dev dependencies with environment-driven profile selection and pytest markers. ChangesQuickBook parser fuzz testing and benchmarking
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/utils/test_quickbook.py (1)
486-490: ⚡ Quick winAdd a deterministic regression test for the EOF trailing-backslash bracket case.
Line 486 fuzz corpus is broad, but it does not explicitly pin the exact
_find_bracket_endboundary that this PR hardens. A direct regression test here will keep this fix protected even if strategy weighting changes.Proposed diff
`@pytest.mark.fuzz` def test_fuzz_corpus_unclosed_brackets() -> None: data = "[" * 5000 _parse_qbk(data) assert _apply_translations(data, lambda s: s) == data +@pytest.mark.fuzz +def test_fuzz_corpus_trailing_escape_before_eof() -> None: + data = "[h2 trailing escape \\" + assert _find_bracket_end(data, 0) == -1 + assert _parse_qbk(data) == [] + assert _apply_translations(data, lambda s: s) == data + + `@pytest.mark.fuzz` def test_fuzz_corpus_section_depth_beyond_cap() -> None: data = "[section " + "[nested\n" * 15 + "body\n" * 15 _parse_qbk(data)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/utils/test_quickbook.py` around lines 486 - 490, The current test_fuzz_corpus_unclosed_brackets is a broad fuzz test that does not explicitly cover the specific boundary condition for EOF trailing-backslash bracket cases that the PR hardens. Add a new deterministic regression test function that directly tests the exact scenario where _find_bracket_end encounters a backslash followed by an unclosed bracket at the end of file. This targeted test will ensure the specific fix for this boundary condition in _find_bracket_end remains protected even if the fuzz test strategy changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/utils/test_quickbook.py`:
- Around line 486-490: The current test_fuzz_corpus_unclosed_brackets is a broad
fuzz test that does not explicitly cover the specific boundary condition for EOF
trailing-backslash bracket cases that the PR hardens. Add a new deterministic
regression test function that directly tests the exact scenario where
_find_bracket_end encounters a backslash followed by an unclosed bracket at the
end of file. This targeted test will ensure the specific fix for this boundary
condition in _find_bracket_end remains protected even if the fuzz test strategy
changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c2ce18da-4ca9-4fe7-9442-703165ae3f0a
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
.github/workflows/ci-test.ymlpyproject.tomlsrc/boost_weblate/utils/quickbook.pytests/conftest.pytests/utils/test_quickbook.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.github/workflows/ci-benchmark.yml (1)
51-83: ⚡ Quick winPublish the benchmark table to the GitHub job summary, not only logs.
The script builds markdown output but currently only prints to stdout. Writing it to
$GITHUB_STEP_SUMMARYmakes results immediately visible in the Actions UI.♻️ Suggested patch
- name: Benchmark summary if: always() run: | - uv run --group dev python - <<'PY' + uv run --group dev python - <<'PY' import json + import os from pathlib import Path path = Path("benchmark-results.json") if not path.is_file(): print("benchmark-results.json not found; skipping summary") raise SystemExit(0) data = json.loads(path.read_text(encoding="utf-8")) - print("## QuickBook parser benchmark summary") - print() - print("| Benchmark | Mean (s) | Min (s) | Max (s) |") - print("|-----------|----------|---------|---------|") + lines = [] + lines.append("## QuickBook parser benchmark summary") + lines.append("") + lines.append("| Benchmark | Mean (s) | Min (s) | Max (s) |") + lines.append("|-----------|----------|---------|---------|") for bench in data.get("benchmarks", []): name = bench.get("name", "?") stats = bench.get("stats", {}) mean_s = stats.get("mean", 0.0) min_s = stats.get("min", 0.0) max_s = stats.get("max", 0.0) extra = bench.get("extra_info", {}) if extra.get("peak_mib") is not None: - print( - f"| {name} (peak memory) | {extra['peak_mib']} MiB | — | — |" - ) + lines.append( + f"| {name} (peak memory) | {extra['peak_mib']} MiB | — | — |" + ) continue - print( - f"| {name} | {mean_s:.4f} | {min_s:.4f} | {max_s:.4f} |" - ) + lines.append(f"| {name} | {mean_s:.4f} | {min_s:.4f} | {max_s:.4f} |") + + summary = "\n".join(lines) + print(summary) + step_summary = os.environ.get("GITHUB_STEP_SUMMARY") + if step_summary: + Path(step_summary).write_text(summary + "\n", encoding="utf-8") PY🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci-benchmark.yml around lines 51 - 83, The benchmark summary script prints markdown output only to stdout, making it visible only in the workflow logs rather than the GitHub Actions job summary. Modify the Python script within the PY heredoc block to write the output to the file specified by the $GITHUB_STEP_SUMMARY environment variable. Instead of using print statements throughout the script, capture or redirect all the markdown output (the header, table headers, and table rows) to write to the file path obtained from the GITHUB_STEP_SUMMARY environment variable. This will make the benchmark results immediately visible in the Actions UI.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.github/workflows/ci-benchmark.yml:
- Around line 51-83: The benchmark summary script prints markdown output only to
stdout, making it visible only in the workflow logs rather than the GitHub
Actions job summary. Modify the Python script within the PY heredoc block to
write the output to the file specified by the $GITHUB_STEP_SUMMARY environment
variable. Instead of using print statements throughout the script, capture or
redirect all the markdown output (the header, table headers, and table rows) to
write to the file path obtained from the GITHUB_STEP_SUMMARY environment
variable. This will make the benchmark results immediately visible in the
Actions UI.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a2b59da-7a6f-43c1-97f7-fef9581e49ab
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
.benchmarks/Linux-CPython-3.14-64bit/0001_baseline.json.github/WORKFLOWS.md.github/workflows/ci-benchmark.yml.github/workflows/ci.yml.gitignoreREADME.mdpyproject.tomltests/utils/test_quickbook.py
✅ Files skipped from review due to trivial changes (2)
- .github/workflows/ci.yml
- .github/WORKFLOWS.md
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- tests/utils/test_quickbook.py
|
@coderabbitai, full review. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== quickbook.py ==="
cat src/boost_weblate/utils/quickbook.py
echo ""
echo "=== tests/utils/test_quickbook.py ==="
cat tests/utils/test_quickbook.py
echo ""
echo "=== tests/conftest.py ==="
cat tests/conftest.py
echo ""
echo "=== pyproject.toml ==="
cat pyproject.toml
echo ""
echo "=== .github/workflows/ci-test.yml ==="
cat .github/workflows/ci-test.yml
echo ""
echo "=== .github/workflows/ci-benchmark.yml ==="
cat .github/workflows/ci-benchmark.yml
echo ""
echo "=== .github/workflows/ci.yml ==="
cat .github/workflows/ci.yml
echo ""
echo "=== .gitignore ==="
cat .gitignore
echo ""
echo "=== .github/WORKFLOWS.md ==="
cat .github/WORKFLOWS.mdLength of output: 69090
Review of PR
|
| Severity | File | Finding |
|---|---|---|
| ✅ Correct | quickbook.py |
EOF fix in _find_bracket_end is sound |
test_quickbook.py |
Hardcoded _PEAK_MEMORY_LIMIT_BYTES |
|
ci-benchmark.yml |
Default 20% comparison threshold on free runners | |
| Minor | ci.yml |
benchmark should needs: [test] |
| Minor | test_quickbook.py |
Redundant len(result) <= len(data) assertion |
| Minor | pyproject.toml |
fuzz not excluded from default run |
| Nit | test_quickbook.py |
sys.path duplication should be commented |
The core bug fix and overall test structure are solid. The Hypothesis strategy composition is well-done and the corpus tests cover real-world edge cases. The main actionable items are the benchmark comparison threshold and the needs: guard in ci.yml.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
386-416:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake the benchmark threshold wording consistent.
The prose still says
default 20% mean, but the command and CI note usemean:30%. That mismatch will confuse anyone trying to reproduce or refresh the baseline.♻️ Suggested fix
-Compare against the committed baseline and fail on regression (default 20% mean): +Compare against the committed baseline and fail on regression (default 30% mean):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 386 - 416, The prose text in the benchmark section contains a mismatch with the actual command configuration. The opening line states "default 20% mean" in parentheses, but the bash command immediately below uses `--benchmark-compare-fail=mean:30%`, and the CI configuration note also references `mean:30%`. Update the parenthetical text that says "default 20% mean" to "default 30% mean" to match the actual threshold values used in the commands and CI configuration throughout this section.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@README.md`:
- Around line 386-416: The prose text in the benchmark section contains a
mismatch with the actual command configuration. The opening line states "default
20% mean" in parentheses, but the bash command immediately below uses
`--benchmark-compare-fail=mean:30%`, and the CI configuration note also
references `mean:30%`. Update the parenthetical text that says "default 20%
mean" to "default 30% mean" to match the actual threshold values used in the
commands and CI configuration throughout this section.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4a519e53-f5bc-4349-9332-3880ff62f029
📒 Files selected for processing (7)
.github/WORKFLOWS.md.github/workflows/ci-benchmark.yml.github/workflows/ci-test.yml.github/workflows/ci.ymlREADME.mdpyproject.tomltests/utils/test_quickbook.py
✅ Files skipped from review due to trivial changes (1)
- .github/WORKFLOWS.md
🚧 Files skipped from review as they are similar to previous changes (5)
- .github/workflows/ci-test.yml
- .github/workflows/ci.yml
- pyproject.toml
- .github/workflows/ci-benchmark.yml
- tests/utils/test_quickbook.py
|
@coderabbitai, resume. |
|
✅ Action performedReviews resumed. |
Close #115, close #116.
Summary by CodeRabbit
Release Notes
Bug Fixes
New Features
Tests
Documentation