Pull Request Overview
This PR adds comprehensive Prometheus metrics collection capabilities to the Pyth Observer system. The implementation includes metrics for price feeds, publisher states, API performance, alerts, and system health monitoring.
- Introduces a centralized PythObserverMetrics class with 15+ metric types covering all aspects of the observer system
- Replaces existing simple gauge metrics in dispatch.py with comprehensive metric collection and success rate tracking
- Adds instrumentation throughout the observer lifecycle to capture API request timings, error rates, and system status
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pyth_observer/metrics.py | Defines the complete metrics collection system with gauges, counters, and histograms for all observer operations |
| pyth_observer/dispatch.py | Replaces basic gauge metrics with comprehensive check execution timing and success rate tracking |
| pyth_observer/__init__.py | Adds metrics instrumentation to API calls, price feed processing, and error handling in the main observer loop |
pyth_observer/__init__.py
Outdated
    metrics.loop_errors_total.labels(error_type=type(e).__name__).inc()

    logger.debug("Sleeping...")
    metrics.observer_ready = 0
metrics.observer_ready is set to 0 twice in a row (lines 208 and 211), which is redundant and likely a copy-paste error.
Suggested change:
    metrics.observer_ready = 0
pyth_observer/metrics.py
Outdated
    self.observer_up = 1
    self.observer_ready = 0
Setting instance attributes observer_up and observer_ready on the metrics class will not update the actual Prometheus metrics. These should use the gauge's .set() method instead: self.observer_ready.set(0)
Suggested change:
    - self.observer_up = 1
    - self.observer_ready = 0
    + self.observer_up.set(1)
    + self.observer_ready.set(0)
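The point about assignment versus `.set()` comes down to plain Python attribute rebinding. The sketch below is not the PR's code: `FakeGauge` and `Metrics` are hypothetical stand-ins (a real Prometheus gauge does more than store a float), used only to show that assigning to the attribute replaces the gauge object rather than updating it.

```python
class FakeGauge:
    """Hypothetical stand-in for a Prometheus gauge; it only stores a float."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = float(value)


class Metrics:
    """Hypothetical holder, loosely modelled on the class under review."""

    def __init__(self) -> None:
        self.observer_ready = FakeGauge()


metrics = Metrics()
gauge = metrics.observer_ready  # the object an exporter would still be reading

# Plain assignment rebinds the attribute to an int; the gauge object is untouched.
metrics.observer_ready = 1
rebound_type = type(metrics.observer_ready).__name__
value_after_assignment = gauge.value

# Calling .set() on the gauge object is what changes the stored value.
gauge.set(1)
value_after_set = gauge.value
```

After the assignment, the attribute holds an `int` and the original gauge still reports its old value, which is why the suggestion switches to `.set()`.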
    # global states
    states = []
    while True:
        try:
The states list is initialized once, outside the while loop, and then reused inside it. This will cause the list to accumulate data across iterations, leading to memory growth and incorrect metrics. Move this initialization inside the while loop.
Suggested change:
    - # global states
    - states = []
    - while True:
    -     try:
    + while True:
    +     try:
    +         states = []
no it won't. L107 will clear it
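Whether a list created outside the loop grows depends on whether something empties it on each pass. The sketch below is illustrative only: `run_iterations` and the sample batches are hypothetical, and it assumes the clearing mentioned in the reply is an in-place reset such as `states.clear()`.

```python
def run_iterations(batches, clear_each_time):
    """Return the size of `states` after each loop iteration."""
    states = []  # created once, outside the loop
    sizes = []
    for batch in batches:
        if clear_each_time:
            states.clear()  # empties the same list object in place
        states.extend(batch)
        sizes.append(len(states))
    return sizes


batches = [["a", "b"], ["c"], ["d", "e", "f"]]
sizes_with_clear = run_iterations(batches, clear_each_time=True)
sizes_without_clear = run_iterations(batches, clear_each_time=False)
```

With the clear in place, each iteration sees only its own batch; without it, the sizes keep climbing, which is the growth the review comment was worried about.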
    self.alerts_active.labels(alert_type=alert_type).set(count)

    if sent_alert:
        alert_type = sent_alert.split("-")[0]
The code assumes sent_alert contains a dash character without validation. If sent_alert doesn't contain a dash, split('-')[0] will return the entire string, but this could lead to unexpected behavior. Consider adding validation or using a more robust parsing method.
Suggested change:
    - alert_type = sent_alert.split("-")[0]
    + if "-" in sent_alert:
    +     alert_type = sent_alert.split("-", 1)[0]
    + else:
    +     alert_type = sent_alert  # or use "unknown" if preferred
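For reference, the no-dash behaviour described in the comment can be checked directly. This is a small sketch with made-up identifiers (`"stall-BTC"` and `"stall"` are hypothetical, not real alert names from the observer); `alert_type_of` is likewise a hypothetical helper that uses `str.partition` as one alternative to the explicit `if`/`else`.

```python
def alert_type_of(sent_alert: str) -> str:
    """Return the text before the first dash, or the whole string if there is none."""
    # partition returns (head, separator, tail); head is the full string
    # when the separator is absent.
    head, _sep, _tail = sent_alert.partition("-")
    return head


with_dash = "stall-BTC"  # hypothetical identifier
without_dash = "stall"   # hypothetical identifier

split_with_dash = with_dash.split("-")[0]
split_without_dash = without_dash.split("-")[0]
partition_with_dash = alert_type_of(with_dash)
partition_without_dash = alert_type_of(without_dash)
```

Both approaches return the same prefix in either case, so the suggested `if`/`else` only changes the result if the fallback is switched to something like `"unknown"`.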
    metrics.alerts_sent_total.labels(
        alert_type=info["type"],
        channel=event_type.lower().replace("event", ""),
    ).inc()
The variable event_type is not defined in this scope. This will cause a NameError when this code path is executed.
umm i think it is tho
Add metrics