[Backlog]: Standardized Reporting and Severity Scoring for GenAI Red Teaming #31

@vishaljindal1990

Checklist

  • Backlog entry requires creating new sandboxes.
  • Backlog entry requires creating new exploitation code and/or tutorials.

CVE List

No response

Description

While current exploitation modules demonstrate vulnerabilities, there is no standardized way to represent or communicate the severity and impact of findings.

This creates challenges in:

  • comparing results across different tools (e.g., garak, promptfoo, agent-based attacks)
  • prioritizing remediation
  • integrating findings into real-world workflows

Proposal

Define a standard reporting and scoring model for GenAI red teaming results:

  • Severity classification framework, including:

    • critical (e.g., sensitive data exfiltration, RCE via agent tools)
    • high (e.g., prompt injection leading to policy bypass)
    • medium/low (e.g., hallucination without direct impact)
  • Impact dimensions, such as:

    • confidentiality (PII leakage, secrets exposure)
    • integrity (prompt manipulation, data poisoning)
    • availability (system misuse, denial-of-service patterns)
  • Output format:

    • a standardized report structure
    • compatibility with the outputs of existing evaluation frameworks
  • Optional alignment with:

    • existing risk scoring models (e.g., CVSS-inspired approach for GenAI)
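
To make the proposal above more concrete, here is a minimal Python sketch of what a finding record, a score, and a report could look like. Everything in it is an assumption for discussion rather than an agreed design: the 0-3 rating per impact dimension, the combining formula (loosely inspired by how CVSS combines confidentiality, integrity, and availability), the severity cut-offs, and all field and function names (`Finding`, `Impact`, `score`, `classify`, `build_report`, `schema_version`) are hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum


class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class Impact:
    """Per-dimension rating on an assumed 0-3 scale (0 = none, 3 = severe)."""
    confidentiality: int = 0
    integrity: int = 0
    availability: int = 0

    def __post_init__(self) -> None:
        for name, level in vars(self).items():
            if level not in (0, 1, 2, 3):
                raise ValueError(f"{name} must be 0..3, got {level!r}")


def score(impact: Impact) -> float:
    """Combine the three dimensions into a 0.0-10.0 score.

    Illustrative formula only: 1 - product of (1 - level/3), scaled to 10,
    so a single severe dimension is enough to reach the top of the scale.
    """
    remaining = 1.0
    for level in vars(impact).values():
        remaining *= 1.0 - level / 3
    return round(10 * (1.0 - remaining), 1)


def classify(value: float) -> Severity:
    """Map a score to a severity label using assumed cut-offs."""
    if value >= 9.0:
        return Severity.CRITICAL
    if value >= 7.0:
        return Severity.HIGH
    if value >= 4.0:
        return Severity.MEDIUM
    return Severity.LOW


@dataclass
class Finding:
    finding_id: str
    title: str
    tool: str  # name of the tool that produced the result
    impact: Impact

    def to_dict(self) -> dict:
        value = score(self.impact)
        return {
            "id": self.finding_id,
            "title": self.title,
            "tool": self.tool,
            "impact": asdict(self.impact),
            "score": value,
            "severity": classify(value).value,
        }


def build_report(findings: list[Finding]) -> str:
    """Serialize findings to JSON, highest score first."""
    rows = sorted((f.to_dict() for f in findings),
                  key=lambda row: row["score"], reverse=True)
    return json.dumps({"schema_version": "0.1", "findings": rows}, indent=2)


if __name__ == "__main__":
    print(build_report([
        Finding("F-2", "Prompt injection leading to policy bypass", "promptfoo",
                Impact(confidentiality=1, integrity=2)),
        Finding("F-1", "Sensitive data exfiltration", "garak",
                Impact(confidentiality=3)),
    ]))
```

Keeping the per-dimension ratings in the record, rather than only the final label, would let a different scoring formula be swapped in later without re-triaging findings. The `tool` field is just a free-text label here; mapping real garak or promptfoo output into this shape is not shown and would need to be worked out against each tool's actual report format.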

Value

  • Makes findings actionable and comparable
  • Enables integration with security workflows and dashboards
  • Provides a bridge between technical testing and risk management
