fix(prompts): update example JSON in quality-evaluator.md to match new VQ-01 philosophy (#7393)
Merged
…w VQ-01 philosophy

The example output and example weaknesses in quality-evaluator.md still treated "relying on default font sizes instead of explicit settings" as a VQ-01 deduction reason and a weakness. After the wording changes in #7391/#7392 that is no longer correct: default and AI-tuned sizes score equally, and what matters is the visual result. Both the JSON example (vq01 = 6/8 with a "title squeezes against the right edge" reason) and the weaknesses list now use a proportional-sizing example, so the example matches the rubric the reviewer is actually supposed to apply.
Pull request overview
Updates the example JSON output in prompts/quality-evaluator.md so the VQ-01 “Text Legibility” example reflects the newer rubric philosophy (judge the visual result, not whether defaults were used).
Changes:
- Updated the VQ-01 example note to focus on an oversized/squeezed title and a concrete adjustment suggestion.
- Updated the first example weakness to match the new VQ-01 reasoning (title fit/overflow rather than “defaults vs explicit settings”).
Comment on lines 98 to 101
  "visual_quality": {
    "total": 23,
-   "vq01_text_legibility": {"score": 5, "max": 8, "note": "Readable but relying on defaults (font sizes not explicitly set)"},
+   "vq01_text_legibility": {"score": 6, "max": 8, "note": "Title slightly oversized for content — fontsize=18pt squeezes against the right edge; reduce to ~14pt"},
    "vq02_no_overlap": {"score": 6, "max": 6, "note": "No overlap"},
MarkusNeusinger added a commit that referenced this pull request on May 19, 2026
…7394)

## Summary

Late-arriving Copilot review comments on #7389 / #7391 / #7393: all substantive, all applied here.

## Fixes

**High-impact** (would have caused generation failures or wrong sizing):

- `plot-generator.md`: Output Files snippet still had `dpi=300` (matplotlib) and `width=1600 scale=3` (plotly); Claude would have used those in new plots
- `library/plotnine.md`: `element_text` used in snippet but missing from imports → `NameError` in generated code
- `library/highcharts.md`: "X-axis labels cut off" pitfall recommended `14px`, contradicting the new 12px default

**Consistency**:

- `default-style-guide.md`: Native-pixel column mixed pt/px/unitless under one cell → split per library with explanatory note
- `library/matplotlib.md` + `seaborn.md`: `ax.legend(...)` was unconditional → wrapped in `if len(...) > 1` to avoid the "No artists with labels" warning on single-series plots
- `library/bokeh.md`: added commented legend fontsize example
- `quality-evaluator.md`: example JSON arithmetic was broken (sub-scores sum to 24 but total said 23 after #7393's score bump) → fixed total + top-level score

## Process note

All these comments arrived AFTER the parent PRs auto-merged via `--auto`. Per CLAUDE.md PR Follow-Through, I should poll Copilot review explicitly before `--auto`-merging prompt/cosmetic PRs in the future. Adding to the process-gaps memo.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
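The arithmetic slip in the `quality-evaluator.md` example (sub-scores summing to 24 against a declared total of 23) is the kind of thing a small check could catch before merge. The following is a minimal sketch, not part of the repository: the helper names `subscore_sum` and `total_matches` are hypothetical, and because the excerpt only shows `vq01` and `vq02`, the `vq_rest` entry is an invented placeholder standing in for the remaining sub-criteria.

```python
def subscore_sum(category: dict) -> int:
    """Sum the "score" field of every sub-criterion entry in a category."""
    return sum(
        entry["score"]
        for entry in category.values()
        if isinstance(entry, dict) and "score" in entry
    )


def total_matches(category: dict) -> bool:
    """Return True when the declared "total" equals the sum of the sub-scores."""
    return category["total"] == subscore_sum(category)


# vq01 and vq02 mirror the example in the reviewed hunk. "vq_rest" is a
# hypothetical placeholder for the sub-criteria the excerpt does not show.
visual_quality = {
    "total": 23,
    "vq01_text_legibility": {"score": 6, "max": 8, "note": "..."},
    "vq02_no_overlap": {"score": 6, "max": 6, "note": "No overlap"},
    "vq_rest": {"score": 12, "max": 16, "note": "hypothetical placeholder"},
}

print(subscore_sum(visual_quality), total_matches(visual_quality))  # 24 False
```

With the total corrected to 24, `total_matches` returns `True`; the same idea would extend to the top-level score if it is defined as the sum of the category totals, which is an assumption here.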
Summary
After #7391/#7392 changed the VQ-01 rubric (source-of-values irrelevant; defaults vs AI-tuned score equally), the example JSON output and example weaknesses in `quality-evaluator.md` were still showing the OLD philosophy:
These examples are the most concrete guidance for the reviewer; if they show old reasoning, the reviewer follows old reasoning regardless of what the rubric tables say.
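Since stale examples can quietly override the rubric, one option is to scan the prompt file for phrasings tied to the old reasoning. This is only a sketch under stated assumptions: the pattern list and the `find_stale_lines` helper are hypothetical, chosen from the wording of the old example note, and nothing like it is confirmed to exist in the repository.

```python
import re

# Hypothetical phrasings associated with the old VQ-01 reasoning.
STALE_PATTERNS = [
    re.compile(r"relying on default", re.IGNORECASE),
    re.compile(r"not explicitly set", re.IGNORECASE),
]


def find_stale_lines(prompt_text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs matching any stale pattern."""
    hits = []
    for number, line in enumerate(prompt_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STALE_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# The old and new example notes, as quoted in the review hunk (new one shortened).
old_note = '"note": "Readable but relying on defaults (font sizes not explicitly set)"'
new_note = '"note": "Title slightly oversized for content"'
sample = "\n".join([old_note, new_note])

# Only the first line, carrying the old reasoning, is reported.
print(find_stale_lines(sample))
```

A keyword scan like this cannot judge whether an example reflects the rubric; it only flags known-stale wording for a human to review.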
Test plan
🤖 Generated with Claude Code