feat(observability): uptime, Prometheus /metrics, and OTLP traces across all servers #150
Open
mauripunzueta wants to merge 1 commit into
…oss all servers

Add a shared `helios-observability` crate and wire it into all four server binaries (hfs, hts, sof-server, fhirpath-server):

- `GET /metrics` Prometheus endpoint (maintained `metrics` + `metrics-exporter-prometheus` stack; avoids the unmaintained `opentelemetry-prometheus` and its protobuf RUSTSEC advisory).
- `/health` enriched with `uptime_seconds` + `started_at`.
- Per-request `http_requests_total` / `http_request_duration_seconds` metrics (templated `route` label; `service` global label) and a tracing span.
- Feature-gated (`otel`) OTLP trace export via `tracing-opentelemetry`; OTLP metrics are produced out-of-process by a Collector scraping `/metrics`.

Tenant is recorded as a span attribute only, never a metric label, to bound Prometheus cardinality. Verified end-to-end against fhirpath-server.
Summary
Adds a shared `helios-observability` crate and wires it into all four server binaries (`hfs`, `hts`, `sof-server`, `fhirpath-server`):

- `GET /metrics` Prometheus endpoint: `http_requests_total`, `http_request_duration_seconds`, `uptime_seconds`, with a `service` global label and a templated `route` label.
- `/health` enriched with `uptime_seconds` + `started_at`.
- Per-request tracing span. Tenant is recorded as a span attribute only, never a metric label (cardinality).
- Feature-gated (`otel`) OTLP trace export via `tracing-opentelemetry` (opentelemetry 0.32). OTLP metrics are produced out-of-process by a Collector scraping `/metrics`; we avoid the unmaintained `opentelemetry-prometheus` (protobuf RUSTSEC advisory).
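To illustrate why the `route` label is templated and tenant is kept off metric labels, here is a minimal std-only sketch of a request counter. It is not the code in `helios-observability`: the `RequestLabels` and `RequestCounter` types, the `method` and `status` labels, the `/Patient/{id}` template, and the `demo` helper are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Label set for one `http_requests_total` series (hypothetical type).
/// Tenant is deliberately absent, so it cannot multiply the series count.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct RequestLabels {
    pub service: &'static str,
    pub method: &'static str, // assumed label, not named in the PR
    pub route: &'static str,  // templated route, e.g. "/Patient/{id}"
    pub status: u16,          // assumed label, not named in the PR
}

/// In-memory counter keyed by label set (hypothetical type).
#[derive(Default)]
pub struct RequestCounter {
    series: BTreeMap<RequestLabels, u64>,
}

impl RequestCounter {
    /// Increment the series identified by `labels`.
    pub fn record(&mut self, labels: RequestLabels) {
        *self.series.entry(labels).or_insert(0) += 1;
    }

    /// Number of distinct time series currently held.
    pub fn series_count(&self) -> usize {
        self.series.len()
    }

    /// Render the counter in the Prometheus text exposition format.
    pub fn render(&self) -> String {
        let mut out = String::from("# TYPE http_requests_total counter\n");
        for (l, n) in &self.series {
            out.push_str(&format!(
                "http_requests_total{{service=\"{}\",method=\"{}\",route=\"{}\",status=\"{}\"}} {}\n",
                l.service, l.method, l.route, l.status, n
            ));
        }
        out
    }
}

/// Two tenants hit two concrete paths; both map to one templated route.
pub fn demo() -> RequestCounter {
    let mut counter = RequestCounter::default();
    for (_tenant, _path) in [("tenant-a", "/Patient/1"), ("tenant-b", "/Patient/2")] {
        // Tenant and raw path would go on the tracing span, not on the metric.
        counter.record(RequestLabels {
            service: "hfs",
            method: "GET",
            route: "/Patient/{id}",
            status: 200,
        });
    }
    counter
}
```

In this sketch both requests land in a single series with value 2. Labelling by raw path or tenant would instead create one series per resource id or per tenant.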
Verification
- `cargo check` (default) compiles; observability also compiles with `--features otel`. Clippy clean (CI flags) for touched crates. Unit tests pass.
- Verified end-to-end against `fhirpath-server`: `/health` uptime and `/metrics` counter/gauge/histogram confirmed.
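For readers checking the `/health` fields, the following std-only sketch shows one way `uptime_seconds` and `started_at` could be derived from a start time captured once at boot. The `StartTime` type, the `status` field, and the Unix-seconds encoding of `started_at` are assumptions. The PR does not state the actual payload format.

```rust
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// Start time captured once at process start (hypothetical type).
pub struct StartTime {
    wall: SystemTime, // wall clock, reported as `started_at`
    mono: Instant,    // monotonic clock, so uptime ignores wall-clock jumps
}

impl StartTime {
    /// Capture both clocks at startup.
    pub fn now() -> Self {
        Self { wall: SystemTime::now(), mono: Instant::now() }
    }

    /// Whole seconds elapsed since startup.
    pub fn uptime_seconds(&self) -> u64 {
        self.mono.elapsed().as_secs()
    }

    /// Start time as seconds since the Unix epoch (assumed encoding).
    pub fn started_at_unix(&self) -> u64 {
        self.wall
            .duration_since(UNIX_EPOCH)
            .unwrap_or(Duration::ZERO)
            .as_secs()
    }

    /// Hand-rolled JSON body for a health response (assumed shape).
    pub fn health_json(&self) -> String {
        format!(
            "{{\"status\":\"ok\",\"uptime_seconds\":{},\"started_at\":{}}}",
            self.uptime_seconds(),
            self.started_at_unix()
        )
    }
}
```

Using `Instant` for the uptime figure keeps it monotonic even if the system clock is adjusted while the server runs. `SystemTime` is only used for the human-facing start timestamp.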
Notes / known gaps in local verification
- […] under the Windows toolchain (environment limitation, not a code issue). The stateless `fhirpath-server` exercised the same wiring successfully.
- […] `--all-features` locally (build window). A pre-existing `dead_code` warning on `build_embedded_job_store` appears only under default features; CI's `--all-features` run won't hit it. CI is the authoritative gate.