Enhance triage skills with updated scoring, confidence, and P-scale labels #6294

Open
lauren-ciha wants to merge 3 commits into main from user/laurenciha/weekly-sync-scoring

lauren-ciha (Member) commented Mar 12, 2026

This PR builds upon the issue-triage and feature-area-report skills.

The key changes are:

  • Added a Validate-FeatureAreaReport.ps1 that checks the report against the live GitHub label data using gh cli
  • Updated contacts.json to include a list of contacts rather than a primary/secondary contact
  • Updated scoring for issue priority to give further weight to community feedback
  • Updated "P-rating" priority rankings
  • Added confidence scoring to each part of the skill

…abels

Scoring System:
- New weights: reactions=30%, age=30%, comments=30%, severity=10%
- P-scale severity labels: P0=critical, P1=high, P2=medium, P3=low
- Confidence scoring with grep-friendly [confidence:XX] format (0-100)

Label Consolidation:
- Merged Hot + Popular into Popular (>=5 reactions threshold)

Contact Schema:
- Simplified from {primary, secondary} to single {contact} field
- Removed legacy schema backward compatibility

Area Suggestions:
- Get-IssueDetails.ps1 dynamically fetches area labels via Get-RepositoryLabels.ps1
- area-Notifications covers all notification types (toast, badge, push, wns)
- Fixed area-PowerManagement naming

Documentation:
- Added PowerShell examples alongside Bash for confidence filtering
- Updated all SKILL.md files with new configuration details
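
The `[confidence:XX]` tag described above lends itself to simple filtering. The following is a minimal sketch, not the code from this PR: the sample report lines, the issue numbers, and the `Select-ByConfidence` helper are hypothetical, and only the tag format (an integer from 0 to 100) is taken from the description.

```powershell
# Hypothetical report lines; only the [confidence:XX] tag format comes from the PR description.
$reportLines = @(
    'Issue 101: suggest area-Notifications [confidence:92]'
    'Issue 102: suggest area-Windowing [confidence:45]'
    'Issue 103: suggest P1 severity [confidence:70]'
    'Issue 104: no confidence tag on this line'
)

# Hypothetical helper: keep only lines whose confidence tag meets a minimum.
function Select-ByConfidence {
    param(
        [string[]]$Lines,
        [int]$Minimum = 70
    )
    foreach ($line in $Lines) {
        # Capture the integer inside the tag; lines without a tag are skipped.
        if ($line -match '\[confidence:(\d{1,3})\]') {
            if ([int]$Matches[1] -ge $Minimum) { $line }
        }
    }
}

$highConfidence = @(Select-ByConfidence -Lines $reportLines -Minimum 70)
$highConfidence
```

With the sample data, only the lines tagged 92 and 70 pass the default threshold.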

guimafelipe (Contributor) left a comment

As this is a script that we use internally, I don't think any of my comments should be blocking. But they may help make the overall process more maintainable and consistent.

Feel free to take them as suggestions.


**Rationale**: Active discussions indicate ongoing relevance and potential blockers.

**Highlight Label**: `📈 Trending` when comments ≥ 5 AND recent activity (shows the issue is heating up NOW)

Contributor commented:

Why are you removing the recent activity part?


Normalized Score = (Total Score / 100) × 100
Total Score = Reactions + Age + Comments + Severity
= 30 + 30 + 30 + 10 = 100 max

Contributor commented:

I would like to review that later. I feel like severity should have a way bigger impact on the score. If we have a security issue, we want to act on it ASAP regardless of reactions, comments and age.

lauren-ciha (Member, Author) commented:

I'm going to loop in @ssparach on this one. I'm open to updating the criteria.

Contributor commented:

Severity handling needs to be adjusted first. Once severity appropriately tracks actual security and data-loss issues, its weight can be increased to reflect their higher priority.
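
To make the concern in this thread concrete, here is a small worked sketch of how the proposed weights cap severity's contribution at 10 of 100 points. It is an illustration under stated assumptions, not the scoring code from the PR: the `Get-WeightedScore` function, the pre-normalized 0.0 to 1.0 component values, and both sample issues are hypothetical. Only the weights come from the PR description.

```powershell
# Weights as stated in the PR description (points out of a 100-point total).
$weights = [ordered]@{ reactions = 30; age = 30; comments = 30; severity = 10 }

# Hypothetical helper; assumes each component is already normalized to 0.0-1.0.
function Get-WeightedScore {
    param(
        [hashtable]$Components,
        [System.Collections.IDictionary]$Weights
    )
    $total = 0.0
    foreach ($name in $Weights.Keys) {
        # Clamp each component so it cannot exceed its weight.
        $value = [math]::Min(1.0, [math]::Max(0.0, [double]$Components[$name]))
        $total += $value * $Weights[$name]
    }
    [math]::Round($total, 1)
}

# Hypothetical issues: a quiet critical bug versus a popular low-severity request.
$quietCritical = Get-WeightedScore -Weights $weights -Components @{
    reactions = 0; age = 0.1; comments = 0; severity = 1.0
}
$popularLow = Get-WeightedScore -Weights $weights -Components @{
    reactions = 1.0; age = 0.5; comments = 1.0; severity = 0.25
}

"Quiet critical issue: $quietCritical"   # 13
"Popular low-severity: $popularLow"      # 77.5
```

Under these assumptions a new, unnoticed critical issue scores 13 while a popular low-severity request scores 77.5, which is the ordering the reviewers are questioning.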

Comment on lines 23 to +27
  reactions = 30
- age = 25
- comments = 20
- severity = 15
- blockers = 10
+ age = 30
+ comments = 30
+ severity = 10
+ blockers = 0

Contributor commented:

This is defined multiple times (on the reference markdown and here). Can we have it defined only once in a single source of truth that the script can read from?

Comment on lines +43 to +48
severityLabels = @{
    critical = @("regression", "crash", "hang", "data-loss", "security", "P0")
    high = @("bug", "P1")
    medium = @("performance", "feature proposal", "feature-proposal", "P2")
    low = @("documentation", "enhancement", "P3")
}

Contributor commented:

Same as above comment. Having information duplicated multiple times can bloat the files, make it hard for humans to read and also for agents to maintain. If at some point there is a mismatch between the information, we don't know which one the agent will consider the right one.

Comment on lines +24 to +27
"critical": ["regression", "crash", "hang", "data-loss", "security", "P0"],
"high": ["bug", "P1"],
"medium": ["performance", "feature proposal", "feature-proposal", "P2"],
"low": ["documentation", "enhancement", "P3"]

Contributor commented:

I see all of the important data is defined in this json, which is good. Why do we need it hardcoded all over the script again? Can't we just read from the json?
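
One way the suggestion above could look is sketched below. This is an assumption-laden illustration rather than the PR's implementation: the inline JSON stands in for the skill's config file, the `weights` key and overall layout are guesses, and `Get-SeverityTier` is a hypothetical helper. The severity tiers mirror the JSON lines under review.

```powershell
# Stand-in for the skill's JSON config; the "weights" key and layout are assumptions.
$configJson = @'
{
  "weights": { "reactions": 30, "age": 30, "comments": 30, "severity": 10 },
  "severityLabels": {
    "critical": ["regression", "crash", "hang", "data-loss", "security", "P0"],
    "high": ["bug", "P1"],
    "medium": ["performance", "feature proposal", "feature-proposal", "P2"],
    "low": ["documentation", "enhancement", "P3"]
  }
}
'@

# A real script would read the file instead, e.g. Get-Content -Raw on its path.
$config = $configJson | ConvertFrom-Json

# Build a label -> tier lookup once so the script holds no hardcoded copies.
$tierByLabel = @{}
foreach ($tier in $config.severityLabels.PSObject.Properties) {
    foreach ($label in $tier.Value) {
        $tierByLabel[$label.ToLowerInvariant()] = $tier.Name
    }
}

# Hypothetical helper: return the highest tier that any of the labels maps to.
function Get-SeverityTier {
    param([string[]]$Labels)
    foreach ($tierName in 'critical', 'high', 'medium', 'low') {
        foreach ($label in $Labels) {
            if ($tierByLabel[$label.ToLowerInvariant()] -eq $tierName) { return $tierName }
        }
    }
    return 'none'
}

Get-SeverityTier -Labels @('bug', 'P0')       # critical
Get-SeverityTier -Labels @('documentation')   # low
```

With this shape, changing a weight or a label list means editing the JSON only, so the script and the reference markdown cannot drift apart.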

Comment on lines +175 to +186
$defaultKeywords = @{
    'area-Notifications' = @('notification', 'toast', 'badge', 'push', 'appnotification', 'pushnotification', 'wns')
    'area-Packaging' = @('msix', 'package', 'deploy', 'install', 'appx', 'deployment')
    'area-Windowing' = @('window', 'appwindow', 'titlebar', 'backdrop', 'presenter')
    'area-Widgets' = @('widget', 'dashboard')
    'area-AppLifecycle' = @('lifecycle', 'activation', 'restart', 'single instance', 'appinstance')
    'area-PowerManagement' = @('power', 'battery', 'suspend', 'resume', 'powermanager')
    'area-MRTCore' = @('resource', 'mrt', 'localization', 'pri', 'resourcemanager')
    'area-DWriteCore' = @('font', 'dwrite', 'text', 'typography')
    'area-AccessControl' = @('access', 'security', 'token', 'permission')
    'area-Environment' = @('environment', 'variable', 'env')
}

Contributor commented:

Should this go in the json file too?

| None of above | 0 |
| Severity Tier | Labels | Score |
|---------------|--------|-------|
| Critical | `regression`, `crash`, `hang`, `data-loss`, `security`, `P0` | 10 |

Contributor commented:

Do we plan to add labels for crash, hang, and data loss and some others? These labels do not exist today.
Also, regarding the security label, the skill is currently referencing area-security, which is intended to track bugs related to OAuth2Manager APIs. Severity falls outside that scope, so a different area label should be used here. Happy to align on which label makes the most sense. Thanks!

# area-Notifications covers all notification types (toast, badge, push, app notifications)
$defaultKeywords = @{
'area-Notifications' = @('notification', 'toast', 'badge', 'push', 'appnotification', 'pushnotification', 'wns')
'area-Packaging' = @('msix', 'package', 'deploy', 'install', 'appx', 'deployment')

Contributor commented:

Keywords such as msix and appxdeployment would be more appropriate for the area-PackageManagement label.

'area-PowerManagement' = @('power', 'battery', 'suspend', 'resume', 'powermanager')
'area-MRTCore' = @('resource', 'mrt', 'localization', 'pri', 'resourcemanager')
'area-DWriteCore' = @('font', 'dwrite', 'text', 'typography')
'area-AccessControl' = @('access', 'security', 'token', 'permission')

Contributor commented:

It looks like the area-AccessControl and area-Environment labels do not currently exist. We may want to align on the appropriate existing area labels instead for keyword mapping.
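
A check along the following lines could catch keyword-map entries that point at labels the repository does not have. It is only a sketch: the `Get-UnknownAreaLabel` helper and both sample inputs are hypothetical, and in practice the list of existing labels would come from the live repository (for example via the gh CLI) rather than a hardcoded array.

```powershell
# Hypothetical inputs for illustration only.
$existingLabels = @('area-Notifications', 'area-Windowing', 'area-PowerManagement')
$keywordMap = @{
    'area-Notifications' = @('notification', 'toast')
    'area-AccessControl' = @('access', 'token')
    'area-Environment'   = @('environment', 'env')
}

# Hypothetical helper: list keyword-map keys with no matching repository label.
function Get-UnknownAreaLabel {
    param(
        [hashtable]$KeywordMap,
        [string[]]$ExistingLabels
    )
    # -notin compares strings case-insensitively by default.
    $KeywordMap.Keys | Where-Object { $_ -notin $ExistingLabels } | Sort-Object
}

$unknown = @(Get-UnknownAreaLabel -KeywordMap $keywordMap -ExistingLabels $existingLabels)
$unknown   # area-AccessControl, area-Environment
```

Run as part of validation, a check like this would flag the two labels mentioned above before the skill suggests them on an issue.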

3 participants