feat: add score command#2648

Open
adamaltman wants to merge 15 commits into main from aa/api-score

@adamaltman
Member

What/Why/How?

What: Adds a new score command to Redocly CLI that analyzes OpenAPI 3.x descriptions and produces two composite scores: Integration Simplicity (0-100, how easy is this API to integrate) and Agent Readiness (0-100, how usable is this API by AI agents/LLM tooling).

Why: API producers currently lack a quick, deterministic way to assess how developer-friendly or AI-agent-friendly their API descriptions are. The existing stats command counts structural elements but doesn't evaluate quality signals like documentation coverage, example presence, schema complexity, or error response structure. This command fills that gap with actionable, explainable scores.

How: The implementation follows the same pattern as the stats command (bundle + analyze), with a clean separation between metric collection and score calculation:

  • Metric collection (collectors/): Walks the bundled document, resolving internal $refs, to gather per-operation raw metrics (parameter counts, schema depth, polymorphism, description/constraint/example coverage, structured error responses, workflow dependency depth via shared schema refs).
  • Scoring (scoring.ts): Pure functions that normalize raw metrics into subscores and compute weighted composite scores. Thresholds and weights are configurable constants. anyOf is penalized more heavily than oneOf/allOf; discriminator presence improves the agent readiness polymorphism clarity subscore.
  • Hotspots (hotspots.ts): Identifies the operations with the most issues, sorted by number of reasons, with human-readable explanations.
  • Output: --format=stylish (default, with color bar charts) and --format=json (machine-readable for CI/dashboards).
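
The scoring step described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in scoring.ts: the subscore keys, the weights, the `ANY_OF_MULTIPLIER` value, and the function names `normalizeCount`, `polymorphismCount`, and `compositeScore` are all hypothetical. It only shows the general shape of pure functions that normalize raw metrics into subscores and combine them into a weighted 0-100 score.

```typescript
// Hypothetical sketch: names, weights, and thresholds are illustrative assumptions,
// not the values used in the PR's scoring.ts.
type Subscores = Record<string, number>; // each subscore is in the range 0..1

// Assumed multiplier: anyOf counts more heavily than oneOf/allOf.
const ANY_OF_MULTIPLIER = 2;

// Weighted polymorphism count, with anyOf penalized more than oneOf/allOf.
function polymorphismCount(c: { oneOf: number; anyOf: number; allOf: number }): number {
  return c.oneOf + c.allOf + c.anyOf * ANY_OF_MULTIPLIER;
}

// Map a raw count onto 0..1: at or below `ideal` scores 1, at or above `worst` scores 0.
function normalizeCount(count: number, ideal: number, worst: number): number {
  if (count <= ideal) return 1;
  if (count >= worst) return 0;
  return (worst - count) / (worst - ideal);
}

// Weighted average of the subscores, scaled to 0..100 and rounded to one decimal.
function compositeScore(subscores: Subscores, weights: Record<string, number>): number {
  let weighted = 0;
  let total = 0;
  for (const [name, weight] of Object.entries(weights)) {
    weighted += (subscores[name] ?? 0) * weight; // a missing subscore counts as 0
    total += weight;
  }
  return total === 0 ? 0 : Math.round((weighted / total) * 1000) / 10;
}

// Usage with made-up inputs: 6 parameters against assumed thresholds of 3 (ideal) and 9 (worst).
const subscores: Subscores = {
  parameterSimplicity: normalizeCount(6, 3, 9), // 0.5
  documentationQuality: 1,
};
const score = compositeScore(subscores, { parameterSimplicity: 1, documentationQuality: 1 }); // 75
```

Because the functions take plain numbers and return plain numbers, the same input always yields the same score, which is what makes the output deterministic and easy to unit test.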

Reference

Related to API governance and developer experience tooling. No existing issue -- this is a new feature.

Testing

  • 43 unit tests across 3 test files covering schema depth calculation, $ref resolution, polymorphism counting (oneOf/anyOf/allOf), constraint detection (including const), example coverage scoring, anyOf penalty multiplier, discriminator impact on agent readiness, deterministic output, and score range validation.
  • Manually tested against three real OpenAPI descriptions (Redocly Cafe: 12 operations, Reunite Main: 299 operations, Rebilly: 606 operations) to verify scores are reasonable and hotspot reasoning is actionable.
  • TypeScript compiles cleanly (tsc --noEmit), all existing tests continue to pass.

Screenshots (optional)

Stylish output for Redocly Cafe:

  Scores

  Integration Simplicity:  85.3/100
  Agent Readiness:         94.4/100

  Integration Simplicity Subscores

  Parameter Simplicity     [█████████████████░░░] 83%
  Schema Simplicity        [████████████░░░░░░░░] 62%
  Documentation Quality    [███████████████████░] 97%
  Constraint Clarity       [███████████████████░] 96%
  Example Coverage         [██████████████████░░] 92%
  Error Clarity            [████████████████████] 100%
  Workflow Clarity         [█████████████████░░░] 83%

  Top 4 Hotspot Operations

  GET /orders (listOrders)
    Integration Simplicity: 69.1  Agent Readiness: 93.9
    - High parameter count (6)
    - Deep schema nesting (depth 5)

  PATCH /orders/{orderId} (updateOrder)
    Integration Simplicity: 77.6  Agent Readiness: 85.3
    - Missing request body examples

Check yourself

  • Code changed? - Tested with Redoc/Realm/Reunite (internal)
  • All new/updated code is covered by tests
  • New package installed? - Tested in different environments (browser/node)
  • Documentation update considered

Security

  • The security impact of the change has been considered
  • Code follows company security practices and guidelines

@adamaltman adamaltman requested review from a team as code owners March 12, 2026 01:21
@changeset-bot

changeset-bot bot commented Mar 12, 2026

🦋 Changeset detected

Latest commit: d6175e2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
| Name | Type |
| --- | --- |
| @redocly/cli | Minor |
| @redocly/openapi-core | Minor |
| @redocly/respect-core | Minor |

@github-actions
Contributor

github-actions bot commented Mar 12, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 79.64% (🎯 79%) | 7060 / 8864 |
| 🔵 | Statements | 78.98% (🎯 78%) | 7315 / 9261 |
| 🔵 | Functions | 83.19% (🎯 82%) | 1406 / 1690 |
| 🔵 | Branches | 70.99% (🎯 71%) | 4742 / 6679 |

File Coverage

Changed Files

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
| --- | --- | --- | --- | --- | --- |
| packages/cli/src/commands/score/collect-metrics.ts | 55.97% | 45.6% | 72.72% | 60.33% | 26, 80-82, 85-87, 90-92, 95, 108-111, 133, 175-192, 209, 234-235, 242-294 |
| packages/cli/src/commands/score/constants.ts | 100% | 100% | 100% | 100% | |
| packages/cli/src/commands/score/hotspots.ts | 100% | 97.67% | 100% | 100% | |
| packages/cli/src/commands/score/index.ts | 96.42% | 72.72% | 100% | 96.42% | 125 |
| packages/cli/src/commands/score/scoring.ts | 99.01% | 92.15% | 100% | 100% | 282 |
| packages/cli/src/commands/score/collectors/dependency-graph.ts | 100% | 90% | 100% | 100% | |
| packages/cli/src/commands/score/collectors/document-metrics.ts | 91.82% | 74.57% | 100% | 95.48% | 121, 127-129, 149-155, 318, 332, 354, 385, 409-419, 450 |
| packages/cli/src/commands/score/formatters/json.ts | 100% | 100% | 100% | 100% | |
| packages/cli/src/commands/score/formatters/stylish.ts | 58.65% | 29.82% | 70% | 60.24% | 10, 78, 79, 107, 146-268 |

Generated in workflow #9207 for commit d6175e2 by the Vitest Coverage Report Action

@github-actions
Contributor

github-actions bot commented Mar 12, 2026

| CLI Version | Mean Time ± Std Dev (s) | Relative Performance (Lower is Faster) |
| --- | --- | --- |
| cli-latest | 3.483s ± 0.019s | ▓ 1.00x |
| cli-next | 3.474s ± 0.020s | ▓ 1.00x (Fastest) |

- Add "AI" before "agent readiness" in changeset and docs for clarity
- Replace <pre> block with fenced code block in score.md
- Add security scheme coverage to metrics documentation
- Remove resolveIfRef helper, replace with resolveNode that falls back
  to the original node when resolution fails
- Refactor to use walkDocument visitor approach (matching stats command
  pattern) instead of manually iterating the document tree
- Use resolveDocument + normalizeVisitors + walkDocument from
  openapi-core for proper $ref resolution and spec-format extensibility
- Update index.test.ts to mock the new walk infrastructure

Made-with: Cursor
Co-authored-by: Jacek Łękawa <164185257+JLekawa@users.noreply.github.com>
Collaborator

@tatomyr tatomyr left a comment

Left a couple of comments. I haven't fully reviewed the scoring and collectors yet, though, as that takes time.

Collaborator

@tatomyr tatomyr left a comment

I found some dead code. Please check if it's needed and remove if not. Also, I'm not sure what tests to review as many appear to only test that dead code, so let's handle that first.

- Inline collect-metrics.ts test helper into document-metrics.test.ts
- Use parseYaml as yaml directly instead of wrapper function
- Remove default parameter from getStylishOutput in formatter tests
- Use getMajorSpecVersion + exitWithError for spec version check
- Add explicit case 'stylish' before default in format switch
- Remove unsupported 'markdown' from score command format choices
- Add comment explaining depth=-1 initialization
- Clarify anyOf penalty and dependency terminology in docs
- Update non-oas3 rejection test for exitWithError (throws)

Made-with: Cursor
- Add type cast for parseYaml (returns unknown) in document-metrics tests
- Inline collectDocumentMetrics helper into example-coverage tests
- Remove collect-metrics.js import from scoring tests, use direct metrics
- Add missing debugLogs property to accumulator mock in index tests

Made-with: Cursor
Extract the metric-collection pipeline from handleScore into a
standalone collect-metrics.ts module with two exports:
- collectMetrics(): low-level function used by handleScore
- collectDocumentMetrics(): high-level convenience used by tests

This eliminates ~300 lines of duplicated walker setup across three
test files (document-metrics, example-coverage, index) and ensures
tests exercise the same code path as the production command.

Add $ref-keyed memoization to walkSchema so repeated references to
the same component schema return cached stats instead of re-walking.
Stripe API: 37.6s → 11.3s (3.3× faster).
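
The $ref-keyed memoization described above might look roughly like the sketch below. It is a simplified assumption, not the PR's walkSchema: the `SchemaStats` fields, the `SchemaNode` shape, the `createSchemaWalker` factory, and the `components` lookup table are all hypothetical stand-ins, and the real code resolves references through openapi-core rather than a plain object.

```typescript
// Hypothetical sketch of $ref-keyed memoization; types and helper names are assumptions.
interface SchemaStats {
  depth: number;
  propertyCount: number;
}

type SchemaNode = { $ref?: string; properties?: Record<string, SchemaNode> };

function createSchemaWalker(components: Record<string, SchemaNode>) {
  const cache = new Map<string, SchemaStats>();
  let walks = 0; // counts real traversals, to make the effect of the cache visible

  const walk = (node: SchemaNode): SchemaStats => {
    if (node.$ref !== undefined) {
      const cached = cache.get(node.$ref);
      if (cached) return cached; // repeated reference: reuse the stored stats
      const target = components[node.$ref];
      if (!target) return { depth: 0, propertyCount: 0 }; // unresolved reference
      // Seed the cache first so a self-referencing schema cannot recurse forever.
      cache.set(node.$ref, { depth: 0, propertyCount: 0 });
      const stats = walk(target);
      cache.set(node.$ref, stats);
      return stats;
    }
    walks++;
    let depth = 0;
    let propertyCount = 0;
    for (const child of Object.values(node.properties ?? {})) {
      const childStats = walk(child);
      depth = Math.max(depth, childStats.depth + 1);
      propertyCount += 1 + childStats.propertyCount;
    }
    return { depth, propertyCount };
  };

  return { walk, getWalkCount: () => walks };
}

// Usage: two properties point at the same component schema, which is walked only once.
const walker = createSchemaWalker({
  '#/components/schemas/Item': { properties: { id: {}, name: {} } },
});
const rootStats = walker.walk({
  properties: {
    first: { $ref: '#/components/schemas/Item' },
    second: { $ref: '#/components/schemas/Item' },
  },
});
// rootStats is { depth: 2, propertyCount: 6 }; getWalkCount() is 4 instead of 7 without the cache.
```

The saving grows with the number of operations that share component schemas, which is consistent with a large description benefiting the most.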

Made-with: Cursor
Add median alongside averages for parameters, schema depth,
polymorphism, and properties in the stylish output. Rename
the misleading "Avg max schema depth" to "Schema depth".
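
As a small illustration of why the median is worth showing next to the average, here is a sketch with hypothetical `average` and `median` helpers (these are not taken from the PR, and returning 0 for an empty list is an assumed convention). A single outlier operation pulls the average up while the median stays close to the typical value.

```typescript
// Hypothetical helpers for illustration; not the PR's implementation.
function average(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b); // copy so the input is not mutated
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Made-up parameter counts for five operations, one of them an outlier.
const parameterCounts = [1, 1, 2, 2, 14];
const avgParams = average(parameterCounts); // 4
const medianParams = median(parameterCounts); // 2
```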

Made-with: Cursor
Rename workflowClarity to dependencyClarity, workflowDepths to
dependencyDepths, and workflow-graph to dependency-graph to align
code naming with the "Dependency Clarity" display label. Also adds
discoverability subscore, recursive composition keyword stripping
for accurate property counting, and updates e2e snapshots.

Made-with: Cursor
}

describe('printScoreStylish', () => {
it('outputs scores, subscores, metrics, and hotspots', () => {
Collaborator

Instead of mocking and presetting, it's better to simply create e2e tests for those cases.

Member Author

Did this, but:

ERROR: Coverage for branches (70.99%) does not meet global threshold (71%)
Error: Process completed with exit code 1.

Collaborator

@tatomyr tatomyr Mar 31, 2026

That's not a problem. You can lower the threshold in vitest.config.ts. It's expected that we cover with unit tests only what's reasonable.

- Use isPlainObject from openapi-core for schema cycle detection
  instead of raw typeof checks that match arrays
- Remove formatter unit tests in favor of e2e snapshot coverage
- Clarify makeScores() purpose in hotspot tests
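
The first item above is worth a short illustration. A raw `typeof value === 'object'` check also accepts arrays, so an array node could be treated as a schema object. The sketch below uses a local stand-in named `isPlainObjectLike`; it is an assumption for illustration only, and the actual isPlainObject helper in openapi-core may be implemented differently.

```typescript
// Stand-in for illustration only; the real helper lives in openapi-core and may differ.
function isPlainObjectLike(value: unknown): value is Record<string, unknown> {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// The kind of raw check the commit replaces: truthy and typeof 'object'.
const rawObjectCheck = (value: unknown): boolean => Boolean(value) && typeof value === 'object';

const arrayNode: unknown = ['a', 'b'];
const objectNode: unknown = { type: 'string' };

const rawAcceptsArray = rawObjectCheck(arrayNode); // true: arrays slip through
const strictAcceptsArray = isPlainObjectLike(arrayNode); // false
const strictAcceptsObject = isPlainObjectLike(objectNode); // true
```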

Made-with: Cursor
@adamaltman adamaltman requested a review from tatomyr March 30, 2026 16:33
# `score`

## Introduction

Collaborator

Suggested change
{% admonition type="warning" name="Important" %}
The `score` command is considered an experimental feature. This means it's still a work in progress and may go through major changes.
The `score` command supports OpenAPI 3.x descriptions only.
{% /admonition %}

I suggest adding the 'experimental' admonition similar to the join command.

Comment on lines +12 to +15
{% admonition type="warning" name="OpenAPI 3.x only" %}
The `score` command supports OpenAPI 3.0, 3.1, and 3.2 descriptions.
OpenAPI 2.0 (Swagger) and AsyncAPI are not currently supported.
{% /admonition %}
Collaborator

Suggested change
{% admonition type="warning" name="OpenAPI 3.x only" %}
The `score` command supports OpenAPI 3.0, 3.1, and 3.2 descriptions.
OpenAPI 2.0 (Swagger) and AsyncAPI are not currently supported.
{% /admonition %}


```bash
redocly score <api>
redocly score <api> [--format=<option>] [--config=<path>]
Collaborator

Suggested change
redocly score <api> [--format=<option>] [--config=<path>]
redocly score <api> [--format=<option>]

I don't think we explicitly use any redocly.yaml configuration in this command, so it doesn't make much sense to list it in the examples.


describe('handleScore', () => {
beforeEach(() => {
mockOutput.mockClear();
Collaborator

This is redundant as we configured restoreMocks: true in vitest config.

Suggested change
mockOutput.mockClear();

metrics: makeDocumentMetrics(new Map([['listItems', makeTestMetrics()]])),
debugLogs: [],
});
process.exitCode = undefined;
Collaborator

Seems redundant. At least, I don't see where we use exitCode in the tests.

}
},
},
Paths: {
Collaborator

What is this for? Could PathItems reside outside the Paths?

Array.from(result.rawMetrics.operations.entries()).map(([key, m]) => [
key,
{
path: m.path,
Collaborator

What is this mapping useful for? Why not simply use m or { ...m }? Are you trying to clean some properties?

.filter((v) => typeof v === 'string' && v.startsWith('#/'))
.map((v) => ({ $ref: v }))
: null;
const hasDiscriminatorBranches = discriminatorRefs != null && discriminatorRefs.length > 0;
Collaborator

Suggested change
const hasDiscriminatorBranches = discriminatorRefs != null && discriminatorRefs.length > 0;
const hasDiscriminatorBranches = Array.isArray(discriminatorRefs) && discriminatorRefs.length > 0;

Comment on lines +268 to +269
for (let i = 1; i < discriminatorRefs!.length; i++) {
const branchStats = walkSchema(discriminatorRefs![i], debug);
Collaborator

Suggested change
for (let i = 1; i < discriminatorRefs!.length; i++) {
const branchStats = walkSchema(discriminatorRefs![i], debug);
for (let i = 1; i < discriminatorRefs.length; i++) {
const branchStats = walkSchema(discriminatorRefs[i], debug);

I don't think you need those asserts.

}

const localPropertyNames: string[] = [];
if (schema.properties && typeof schema.properties === 'object') {
Collaborator

Suggested change
if (schema.properties && typeof schema.properties === 'object') {
if (isPlainObject(schema.properties)) {


function hasExample(mediaType: { example?: unknown; examples?: Record<string, unknown> }): boolean {
if (mediaType.example !== undefined) return true;
if (mediaType.examples && typeof mediaType.examples === 'object') {
Collaborator

Suggested change
if (mediaType.examples && typeof mediaType.examples === 'object') {
if (isNotEmptyObject(mediaType.examples)) {


const walkSchema = (schemaNode: any, debug = false): SchemaStats => {
let resolved = schemaNode;
const ref: string | undefined =
Collaborator

It's better to reuse the isRef utility function here.

4 participants