Skip to content

chore: separate metrics and telemetry#2056

Merged
yoks merged 4 commits into
NVIDIA:mainfrom
yoks:separate-metrics-and-telemetry
Jun 2, 2026
Merged

chore: separate metrics and telemetry#2056
yoks merged 4 commits into
NVIDIA:mainfrom
yoks:separate-metrics-and-telemetry

Conversation

@yoks
Copy link
Copy Markdown
Contributor

@yoks yoks commented Jun 1, 2026

Description

Splits Health service prometheus endpoint between /metrics and /telemetry.

Also removes backward compatability / endpoint for metrics, as it was confusing and abused via health probes.

This issue was raised after discussion on PR #1975

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Telemetry no longer exposed on /metrics
Default / endpoint no longer exposes metrics

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@yoks yoks requested review from a team as code owners June 1, 2026 21:55
@yoks yoks changed the title feat: separate metrics and telemetry chore: separate metrics and telemetry Jun 1, 2026
Comment thread crates/health/src/metrics.rs Outdated
Comment on lines +491 to +495
_ => Ok(Response::builder()
.status(http::StatusCode::OK)
.header(CONTENT_TYPE, "text/plain; charset=utf-8")
.body("not found; use /metrics, /telemetry, or /livez".to_string())
.expect("BUG: Response::builder error")),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the body text, should this return 404 instead of OK/200?

Copy link
Copy Markdown
Contributor Author

@yoks yoks Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, i explicitly made it Ok, so liveness probes which defaults to / would still work. This seems like a safer choice for existing envs which uses / for liveness/ready probes already.

@yoks yoks merged commit 7e085d7 into NVIDIA:main Jun 2, 2026
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants