Docker: Grid endpoint /metrics for exporter or K8s ServiceMonitor #3135
VietND96 wants to merge 1 commit into
Review Summary by Qodo
Implement built-in Prometheus exporter and comprehensive monitoring dashboards for Selenium Grid
Walkthroughs
Description
• **Implemented built-in Prometheus exporter for Selenium Grid** written in Go that collects 20+ metrics covering grid health, node status, sessions, and queue depth
• **Migrated from standalone exporter deployment to built-in model** integrated directly into Hub and Router containers, eliminating separate exporter pods
• **Added comprehensive Grafana dashboards** (5 new dashboards) for monitoring grid overview, sessions, cross-browser testing, node health, and queue/capacity metrics
• **Updated Kubernetes/Helm integration** with ServiceMonitor and PodMonitor for Prometheus scraping, automatic dashboard provisioning via ConfigMaps, and metrics port exposure on Hub/Router services
• **Added Docker Compose monitoring stack** with pre-configured Prometheus and Grafana for local development and testing
• **Updated build system** with new targets for compiling the Go exporter and copying dashboards into Helm charts
• **Fixed Traefik ServersTransport condition** to properly check for Traefik ingress class
• **Updated test suite** to validate ServiceMonitor configuration, Grafana dashboard provisioning, and metrics port exposure
Diagram
flowchart LR
A["Selenium Hub/Router"] -->|exposes metrics| B["Port 9615"]
B -->|scraped by| C["Prometheus"]
C -->|queries| D["Grafana Dashboards"]
E["Go Exporter"] -->|built into| A
F["GraphQL Client"] -->|queries Grid| A
G["Collector"] -->|emits metrics| E
D -->|displays| H["Grid Health<br/>Sessions<br/>Queue<br/>Nodes"]
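To make the flow in the diagram concrete, here is a minimal Python sketch of the last step: turning values read from the Grid into the Prometheus text exposition format that a `/metrics` endpoint serves. It is not the PR's Go exporter. The metric names are hypothetical, and the input keys (`nodeCount`, `sessionCount`, `sessionQueueSize`) are assumed to mirror fields returned by the Grid's GraphQL endpoint.

```python
from typing import Mapping

# (metric name, help text, input field). The metric names are hypothetical;
# the names the PR's exporter actually emits are not shown on this page.
GAUGES = (
    ("selenium_grid_node_count", "Registered nodes.", "nodeCount"),
    ("selenium_grid_session_count", "Active sessions.", "sessionCount"),
    ("selenium_grid_session_queue_size", "Queued session requests.", "sessionQueueSize"),
)


def render_metrics(grid: Mapping[str, int]) -> str:
    """Render the gauges above in the Prometheus text exposition format."""
    lines = []
    for name, help_text, field in GAUGES:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {grid[field]}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {"nodeCount": 3, "sessionCount": 2, "sessionQueueSize": 0}
    print(render_metrics(sample), end="")
```

Prometheus would scrape this text from the metrics port (9615 in the diagram), and the Grafana dashboards would query the resulting series.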
File Changes
1. Router/start-selenium-grid-router.sh
Code Review by Qodo
1. Prometheus/Grafana images use latest
CI Feedback 🧐 (Feedback updated until commit 66a7d33)
A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
prometheus:
  image: prom/prometheus:latest
  container_name: prometheus
  ports:
    - "9090:9090"
  volumes:
    - ./.monitoring/config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
  command:
    - --config.file=/etc/prometheus/prometheus.yml
    - --storage.tsdb.retention.time=7d
  depends_on:
    - selenium-hub

grafana:
  image: grafana/grafana:latest
  container_name: grafana
1. Prometheus/Grafana images use latest 📘 Rule violation ⛨ Security
docker-compose-v3-monitoring.yml sets the Prometheus and Grafana images to the :latest tag, a floating tag that can resolve to a different image over time. This reduces build/deployment reproducibility and increases supply-chain risk.
Agent Prompt
## Issue description
`docker-compose-v3-monitoring.yml` uses `prom/prometheus:latest` and `grafana/grafana:latest`, which violates the requirement to avoid implicit `latest` artifacts.
## Issue Context
The compliance checklist requires pinned versions (and ideally integrity verification). For container images, pinning to a specific version tag and/or digest improves reproducibility and reduces supply-chain risk.
## Fix Focus Areas
- docker-compose-v3-monitoring.yml[39-54]
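The rule behind this finding can be sketched as a small check. The helper below is hypothetical (it is not part of the PR or of Qodo) and treats an image reference as floating when it has no tag or uses `latest`, unless it is pinned by digest.

```python
def is_floating(image: str) -> bool:
    """Return True if an image reference can silently change over time.

    Hypothetical helper for illustration: a reference counts as pinned when
    it carries a digest or an explicit tag other than "latest".
    """
    if "@" in image:
        # Digest-pinned, e.g. name@sha256:<hex>; the content cannot change.
        return False
    # Only look for a tag in the last path segment, so a registry port
    # such as localhost:5000/app is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # no tag means the registry default, i.e. "latest"
    tag = last_segment.split(":", 1)[1]
    return tag == "latest"


if __name__ == "__main__":
    for ref in ("prom/prometheus:latest", "grafana/grafana", "example/app:1.2.3"):
        print(ref, "floating" if is_floating(ref) else "pinned")
```

Both references flagged in the review (`prom/prometheus:latest` and `grafana/grafana:latest`) count as floating under this check; `example/app:1.2.3` is a placeholder, not a recommended version.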
/opt/bin/selenium-grid-exporter &
2. Exporter started twice 🐞 Bug ☼ Reliability
Hub/Router start scripts background-launch /opt/bin/selenium-grid-exporter while supervisord also autostarts the same binary. This will cause a deterministic port 9615 bind conflict and unstable exporter behavior (one instance immediately fails), breaking /metrics availability.
Agent Prompt
## Issue description
The exporter is launched twice (in the start script and as a supervisord program), which will race to bind port 9615 and cause one process to fail.
## Issue Context
The Hub/Router images run `start-selenium-grid-*.sh` under supervisord, and the PR also adds a dedicated supervisord program for the exporter.
## Fix Focus Areas
- Hub/start-selenium-grid-hub.sh[202-206]
- Router/start-selenium-grid-router.sh[195-199]
- Hub/selenium-grid-hub.conf[18-26]
- Router/selenium-grid-router.conf[18-26]
## How to fix
Remove the background `selenium-grid-exporter &` lines from both start scripts and rely on supervisord to manage the exporter (recommended), or alternatively remove the supervisord exporter program blocks and keep the start-script launch (but then handle restarts/logging explicitly).
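The failure mode described here can be reproduced in isolation. The following Python sketch is an illustration under stated assumptions, not the exporter's code: two listeners try to bind the same TCP port, as two exporter processes would on 9615. It uses an ephemeral loopback port so it does not interfere with anything already running.

```python
import socket


def second_bind_fails() -> bool:
    """Bind one listener, then try to bind a second one to the same port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    first.listen()
    port = first.getsockname()[1]

    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Same address and port as the first listener, no reuse options set.
        second.bind(("127.0.0.1", port))
        second.listen()
    except OSError:
        return True  # address already in use: the second instance cannot start
    else:
        return False
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    print("second bind failed:", second_bind_fails())
```

Whichever process binds first keeps the port. The other fails at startup, which is why a single owner for the exporter (supervisord, per the recommendation above) avoids the conflict.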
BUILD_DATE := $(or $(BUILD_DATE),$(BUILD_DATE),20260404)
BASE_RELEASE := $(or $(BASE_RELEASE),$(BASE_RELEASE),selenium-4.43.0)
BASE_VERSION := $(or $(BASE_VERSION),$(BASE_VERSION),4.43.0)
BINDING_VERSION := $(or $(BINDING_VERSION),$(BINDING_VERSION),4.43.0)
BASE_RELEASE_NIGHTLY := $(or $(BASE_RELEASE_NIGHTLY),$(BASE_RELEASE_NIGHTLY),nightly)
BASE_VERSION_NIGHTLY := $(or $(BASE_VERSION_NIGHTLY),$(BASE_VERSION_NIGHTLY),4.45.0-SNAPSHOT)
VERSION := $(or $(VERSION),$(VERSION),4.44.0)
VERSION := $(or $(VERSION),$(VERSION),4.43.0)
3. Pinned build date/version 🐞 Bug ⚙ Maintainability
Makefile replaces a dynamic BUILD_DATE default with a hard-coded value and changes default Selenium versions from 4.44.0 to 4.43.0. This will unexpectedly alter image tags and build outputs for anyone using the default Makefile targets.
Agent Prompt
## Issue description
The Makefile hardcodes BUILD_DATE and downgrades the default Selenium version variables, which changes image tags and build behavior globally.
## Issue Context
This PR is about monitoring/exporter changes; these Makefile changes affect all image builds and releases.
## Fix Focus Areas
- Makefile[1-12]
## How to fix
Revert BUILD_DATE to default to `$(CURRENT_DATE)` (or another computed value), and restore the prior default BASE_RELEASE/BASE_VERSION/BINDING_VERSION/VERSION unless this PR explicitly intends to pin/downgrade. If pinning is intended, add documentation/comments and keep the previous defaults behind overrides.
3fa7dba to 70e97c7
Signed-off-by: Viet Nguyen Duc <nguyenducviet4496@gmail.com>
70e97c7 to 2a7db3b
Thanks for contributing to the Docker-Selenium project!
A well-described PR helps maintainers review and merge it quickly.
Before submitting your PR, please check the contributing guidelines that apply to this repository.
Avoid large PRs; help reviewers by keeping them as simple and short as possible.