Skip to content

winui: window sizing rubric, screenshot validation, anti-self-delegation#84

Draft
nmetulev wants to merge 4 commits into
microsoft:stagingfrom
nmetulev:winui/window-sizing-and-screenshot-validation
Draft

winui: window sizing rubric, screenshot validation, anti-self-delegation#84
nmetulev wants to merge 4 commits into
microsoft:stagingfrom
nmetulev:winui/window-sizing-and-screenshot-validation

Conversation

@nmetulev
Copy link
Copy Markdown
Member

Three connected improvements to the WinUI dev skills, motivated by repeated failure modes I hit when building small apps.

1. winui-design — new Step 4: Size the Window to the App

WinUI 3 has no SizeToContent, so a Window that doesn't call AppWindow.Resize defaults to whatever Windows hands it (~1024×768) — which makes utilities and forms feel comically oversized vs. e.g. SwiftUI, which auto-fits.

Adds an 8-step reasoning rubric rather than a fixed lookup table:

  1. Inventory every row of the layout (don't just think about the "main" content)
  2. Estimate width from the widest row, not the average
  3. Sum heights row-by-row, including spacing
  4. Round up to the nearest 20 (rounding down is how content clips)
  5. Sanity-check against scale anchors (utility / form / multi-pane / editor)
  6. Compactness vs. clipping — pick clipping-free every time
  7. Aspect ratio follows the layout (don't default to landscape out of habit)
  8. Validate after running — capture a screenshot and look for clipped labels, cropped hero elements, right-edge truncation, footer overlap, unintended scrollbars; bump 40–80px and rebuild if any appear

Includes a worked example (focus timer → 460×860) and a named anti-pattern (the same app squeezed to 440×720, which clips "Long Break", crops the timer digits, and truncates the toggle label).

DPI-aware AppWindow.Resize snippet stays. Old "Step 4: Design Anti-Patterns" is renumbered to Step 5.

2. winui-ui-testing — new Step 3.5: Look at the Screenshots

Today the skill captures one final screenshot and never tells the agent to look at it. UIA assertions tell you whether an element exists and what its value is — they tell you nothing about how the app actually looks. Clipped labels, overlap, cramped layout, broken theming, off-screen flyouts: UIA happily reports PASS while the app is visually broken.

This change:

  • Adds a mandatory "Look at the Screenshots" step between "Run results" and "Fix and rerun".
  • Tells the agent to capture screenshots at multiple states (initial, post-interaction, per mode), not just at the end.
  • Tells the agent to view each PNG after the run (its view tool returns images as inspectable data).
  • Provides a concrete visual checklist (no truncation, hero elements fully visible, right-edge controls visible, no overlap, theming right, etc.).
  • Cross-references the sizing rubric for the most common fix (window too small → grow AppWindow.Resize).
  • Updates the test-script template to create a screenshots/ directory and capture intermediate states alongside the existing final shot.

3. winui-dev.agent.md — "Do The Work Yourself"

Repeatedly observed: a user explicitly invokes the winui-dev agent, and the agent's first action is to call the task tool to delegate the entire request to a fresh winui-dev sub-agent. This wastes a context window, hides progress from the user, and adds latency.

Adds an explicit prohibition against self-delegation at the top of the agent file, while still permitting narrow helpers (explore for unfamiliar codebases, rubber-duck for plan critique on non-trivial work). Includes a concrete check: "if you catch yourself about to call task with agent_type: winui:winui-dev, stop — you're the one who should be doing it."

Why bundle them

All three came out of the same end-to-end build session (a Pomodoro / focus timer app). Each one fixes a real failure mode I observed:

  • the agent delegated to itself → clutters context, hides progress
  • the resulting app launched at OS default size, then a follow-up landed too cramped → window sizing needs an explicit rubric
  • UIA tests passed while the cramped UI was visibly broken → needs screenshot review

Happy to split into three PRs if preferred.

nmetulev and others added 2 commits May 13, 2026 02:36
Three connected improvements to the WinUI dev skills, motivated by repeated failure modes when building small apps:

1. winui-design SKILL.md - new Step 4 'Size the Window to the App'
   WinUI 3 has no SizeToContent, so apps default to ~1024x768 regardless
   of content - which makes utilities feel oversized. Adds an 8-step
   reasoning rubric (inventory rows, derive widest-row width, sum heights,
   round up, sanity-check ranges, prefer compact-but-not-clipping,
   aspect-ratio follows content, validate after running) and a worked
   example. Old 'Step 4: Design Anti-Patterns' renumbered to Step 5.

2. winui-ui-testing SKILL.md - new Step 3.5 'Look at the Screenshots'
   UIA assertions can't see clipping, overlap, cramped layout, or theming
   bugs. Adds explicit guidance to capture screenshots at multiple states
   (initial, post-interaction, per mode), view them after each run, and
   apply a visual checklist. Test-script template now creates a
   screenshots/ directory and captures intermediate state, not just a
   single final shot.

3. winui-dev.agent.md - 'Do The Work Yourself' section
   Agents kept re-delegating user requests to a fresh winui-dev sub-agent,
   wasting context and hiding progress. Adds an explicit prohibition
   against self-delegation while still permitting narrow helpers (explore
   for unfamiliar codebases, rubber-duck for plan critique).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nmetulev nmetulev marked this pull request as draft May 13, 2026 09:39
@nmetulev
Copy link
Copy Markdown
Member Author

🤖 Automated review — generated by GitHub Copilot CLI (Claude Opus 4.7) using the repo's pr-review skill, with a multi-model cross-check pass on GPT-5.4. Findings are advisory; please apply your own judgment.

Summary

Critical High Medium Low
0 1 4 1

Coverage

Dimension Result
skill-content ⚠ 3 findings (1 deduped into H1)
skill-tool-boundary ⚠ 3 findings
tool-correctness ⚠ 1 finding (H1)
payloads-and-tests ✓ clean (no payloads/tests touched)
docs-and-manifests ⚠ 1 finding
multi-model (gpt-5.4) ✓ 1/1 high cross-checked (confirmed, with a clarification + a 4th remediation option)

Findings

ID Location Domain One-liner
H1 plugins/winui/skills/winui-design/SKILL.md (DPI snippet, ~L100-113) tool-correctness, skill-content DPI-aware C# snippet uses PInvoke.User32.GetDpiForWindow — package not in default WinUI 3 template; verbatim copy-paste won't compile
M1 plugins/winui/skills/winui-design/SKILL.md Step 4 (the rubric outcome) skill-tool-boundary "Every app must call AppWindow.Resize" is Tier 1/2 enforceable — currently only lives as Tier 3 prose
M2 plugins/winui/skills/winui-design/SKILL.md Step 4 #8plugins/winui/skills/winui-ui-testing/SKILL.md Step 3.5 visual checklist skill-content, skill-tool-boundary The same visual-validation checklist (no , hero visible, right-edge controls, no overlap, scrollbars) is duplicated across the two new sections
M3 plugins/winui/skills/winui-ui-testing/SKILL.md:3 (frontmatter) skill-content description: still describes UIA-only testing; doesn't mention the now-mandatory screenshot/visual review
M4 plugins/winui/plugin.json:4, plugins/winui/.claude-plugin/plugin.json:4 docs-and-manifests User-visible default-skill / agent behavior changes shipped without a plugin version bump (still 0.3.0)
L1 plugins/winui/agents/winui-dev.agent.md:7-17 skill-tool-boundary Anti-self-delegation block is a reasonable Tier 3 fallback, but the cleanest fix (Tier 0: harness refuses self-targeted task calls) should be tracked upstream

Details

H1 — DPI snippet won't compile in a default WinUI 3 project

  • Severity: high · Confidence: high · Tier: 3
  • Files: plugins/winui/skills/winui-design/SKILL.md (DPI-aware code block at ~L100-113)
  • Multi-model: confirmed (with a clarification: using PInvoke; is not required because the snippet uses the fully-qualified name PInvoke.User32; the failure mode is the missing package reference, not the missing using directive)
var hwnd  = WinRT.Interop.WindowNative.GetWindowHandle(this);
var dpi   = (uint)PInvoke.User32.GetDpiForWindow(hwnd);   // ← compile error
var scale = dpi / 96.0;
AppWindow.Resize(new SizeInt32((int)(460 * scale), (int)(860 * scale)));

PInvoke.User32 is not part of the dotnet new winui-mvvm scaffolding documented in plugins/winui/skills/winui-dev-workflow/SKILL.md, so an agent following the rubric will hit The name 'PInvoke' does not exist in the current context. The follow-up prose ("Simpler fallback if PInvoke.User32 isn't available") acknowledges the dependency but doesn't tell the agent how to make it available.

Recommendation — pick one:

  1. Promote the fallback (AppWindow.Resize(new SizeInt32(460, 860));) as the primary snippet; keep the DPI-aware variant immediately below as an "if you need DPI scaling" addendum. Cleanest, but slightly weakens the educational goal.
  2. Add an explicit prerequisite line before the snippet: dotnet add package PInvoke.User32. (No using change needed since the snippet uses the fully-qualified name.)
  3. Use [LibraryImport] to declare GetDpiForWindow self-contained — works in default WinUI 3 projects targeting net8.0-windows10.x, no extra NuGet, but requires the containing class to be partial and add using System.Runtime.InteropServices;.
  4. Use CsWin32 — this repo's own plugins/winui/skills/winui-packaging/references/sourcegen-patterns.md:85-93 already prefers source-generated Win32 wrappers via CsWin32 for analogous cases, so this would be most consistent with existing conventions.

There is no built-in WinAppSDK helper that removes the DPI lookup entirely — AppWindow.Resize takes screen pixels, and XamlRoot.RasterizationScale exists but isn't reliably available in the constructor.


M1 — The "every app must size its window" outcome should also live at Tier 1

  • Severity: medium · Confidence: high · Tier: 3 (currently) → recommend Tier 1 + slim Tier 3
  • File: plugins/winui/skills/winui-design/SKILL.md Step 4 (entire new section)
  • Domain: skill-tool-boundary

The 8-step rubric mixes two kinds of guidance:

  • Genuinely Tier 3 (the reasoning): "inventory rows, take widest, sum heights, round up to 20, validate after running, derive your own from the layout." This is judgment work that doesn't generalize to a static rule and belongs in skill prose.
  • Tier 1 enforceable (the outcome): "Every new app must explicitly call AppWindow.Resize in MainWindow's constructor." That is a static, syntactically-detectable property of the source — exactly what the analyzer is for. The repo already reserves WUI2xxx for runtime/XAML pitfalls (src/tools/winui-analyzer/Microsoft.WindowsAppSDK.Analyzers/DiagnosticIds.cs:25-37), and BuildAndRun.ps1:105-145 already injects the analyzer into builds.

A senior maintainer would not bet shipping correctness on every agent reading 80 lines of prose to avoid the OS default 1024×768.

Recommendation:

  • Land the "missing AppWindow.Resize" outcome as a new WUI2xxx analyzer warning (with helpLinkUri pointing at this rubric).
  • Optionally also have BuildAndRun.ps1 emit a one-line advisory when it detects the pattern.
  • Keep the Tier 3 rubric, but slim it once Tier 1 is doing the enforcement.
  • Consider filing upstream against Microsoft.WindowsAppSDK.WinUI.CSharp.Templates so a new MainWindow.xaml.cs ships with a commented-out AppWindow.Resize(new SizeInt32(...)) template.

M2 — Visual-validation checklist is duplicated across both new sections

  • Severity: medium · Confidence: high · Tier: 3
  • Files: plugins/winui/skills/winui-design/SKILL.md (new Step 4 [WIP] Add important missing files for Microsoft projects #8 — "Validate after running") and plugins/winui/skills/winui-ui-testing/SKILL.md (new Step 3.5 visual checklist)
  • Domains: skill-content, skill-tool-boundary

Both sections enumerate substantially the same items: text ending in , hero elements clipped, right-edge controls cut off, overlapping rows, unintended scrollbars. Cross-skill duplication of a checklist costs context tokens twice and creates maintenance drift — when one list is updated, the other drifts.

Recommendation: Pick one home for the visual checklist. The screenshot review step in winui-ui-testing is the natural place (it's where the agent will be looking at pixels). In winui-design Step 4 #8, replace the bullet list with a one-line pointer: "After running, capture and view screenshots per winui-ui-testing Step 3.5; bump 40-80px on the affected axis if any visual symptom appears."


M3 — winui-ui-testing description: lags the new mandatory screenshot review

  • Severity: medium · Confidence: medium · Tier: 3
  • File: plugins/winui/skills/winui-ui-testing/SKILL.md:3 (frontmatter)
  • Domain: skill-content
description: "Automated UI testing for WinUI 3 apps — generate a batch test script, run all tests in one pass, read results."

The PR makes screenshot capture + visual review a mandatory step ("Don't skip this"), but the skill description still pitches the skill as UIA assertions only. Activation is keyword-driven — a user asking the agent to "verify the UI looks right" or "check for clipped labels" may not trigger this skill.

Recommendation: Append "screenshot capture and visual validation" (or similar) to the description: so trigger phrasing covers the new capability.


M4 — Plugin version not bumped

  • Severity: medium · Confidence: high
  • Files: plugins/winui/plugin.json:4, plugins/winui/.claude-plugin/plugin.json:4
  • Domain: docs-and-manifests

PR introduces three user-visible behavior changes to default-loaded skills + the orchestrator agent, but both manifests are still 0.3.0. Bump them in lockstep.


L1 — Anti-self-delegation: track the Tier 0 fix upstream

  • Severity: low · Confidence: medium · Tier: 3
  • File: plugins/winui/agents/winui-dev.agent.md:7-17
  • Domain: skill-tool-boundary

The new "Do The Work Yourself" prompt block is a pragmatic and reasonable Tier 3 fallback for a real failure mode, and given the upstream surface (Copilot CLI / Claude Code / Codex harnesses), it is probably the right pragmatic landing spot. But the cleaner long-term fix is Tier 0: the agent runtime should refuse or warn on task calls where agent_type matches the calling agent's own type. That protects every agent, not just winui-dev.

Recommendation: Keep the prompt guard. Additionally, file an upstream issue tracking self-delegation detection in the agent harness so the Tier 3 prose can eventually be retired.


Coverage notes

  • skill-content — Read all three changed files; verified factual claims (no SizeToContent in WinUI 3 ✓, ~1024×768 default ✓); applied generalization test to the rubric (passes — helps for settings utilities, list/detail dashboards, single-page form tools, not just Pomodoro). Three findings (one folded into H1, M2, M3).
  • skill-tool-boundary — Verified existing Tier 1 surfaces: analyzer WUI2xxx range, analyzer injection via BuildAndRun.ps1:105-145, winui-search, winapp ui screenshot. Three findings (M1, contributory to M2, L1).
  • tool-correctness — Only embedded snippets to review (no shipped code). Verified the C# snippet against default WinUI 3 templates and the PowerShell snippet against existing test-template idioms. One finding (H1).
  • payloads-and-tests — ✓ clean. No analyzer / winui-search / winmd-cli / BuildAndRun.ps1 / build-tools.ps1 / CI workflow changes; no payload refresh applicable; no analyzer rule added so no test required.
  • docs-and-manifests — Read both plugin.json files, top-level README, agent file, both touched skills' frontmatter, and grepped plugins/winui/ for orphaned Step 4 / Step 5 references after the renumbering (none found). One finding (M4).
  • multi-model (gpt-5.4) — Cross-checked H1 (confirmed); clarified that using PInvoke; is not required (the snippet uses the fully-qualified name); identified a 4th remediation option (CsWin32, which is what plugins/winui/skills/winui-packaging/references/sourcegen-patterns.md:85-93 already prefers in this repo). Verified winapp ui screenshot -a $AppPid -o ... is a valid CLI surface. No new critical/high findings.

@nmetulev nmetulev changed the base branch from main to staging May 14, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant