Add PostgreSQL observability telemetry exposure via ServiceMonitors #1808
DmytroPI-dev wants to merge 5 commits into feature/database-controllers
); err != nil {
    return ctrl.Result{}, err
}

This is another block for our reconciliation metric. Maybe it's worth emitting an event on success, or on failure?
Also, what about extending our status with information on whether this failed or succeeded, i.e. adding a new condition?
CLA Assistant Lite bot: I have read the CLA Document and I hereby sign the CLA. 1 out of 2 committers have signed the CLA.
}

// PostgresObservabilityOverride overrides observability configuration options for PostgresClusterClass.
type PostgresObservabilityOverride struct {
For PostgresObservabilityOverride, we should follow the same pattern we have for ConnectionPoolerEnabled.
So maybe ConnectionPoolerMetricsEnabled and PostgreSQLMetricsEnabled?
    PostgreSQL *FeatureDisableOverride `json:"postgresql,omitempty"`

    // +optional
    PgBouncer *FeatureDisableOverride `json:"pgbouncer,omitempty"`
With other providers we might not have PgBouncer (AWS, for example), so let's name it generically (connectionPooler). We should probably also have CEL logic that doesn't allow connection pooler metrics to be enabled if the connection pooler itself is disabled.
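One way the CEL suggestion above might be expressed is a kubebuilder `XValidation` marker on the struct that holds both flags. The sketch below is illustrative only: `MetricsSpec` and its field names are hypothetical, and it assumes an unset pooler flag means "disabled". The `validate` function mirrors the rule in plain Go so the logic can be exercised without an API server.

```go
package main

import "fmt"

// MetricsSpec is a hypothetical, provider-neutral shape for the flags
// discussed above; the names are illustrative, not this PR's API.
//
// The marker asks the API server to reject objects that enable connection
// pooler metrics while the connection pooler itself is not enabled.
//
// +kubebuilder:validation:XValidation:rule="!(has(self.connectionPoolerMetricsEnabled) && self.connectionPoolerMetricsEnabled) || (has(self.connectionPoolerEnabled) && self.connectionPoolerEnabled)",message="connection pooler metrics require the connection pooler to be enabled"
type MetricsSpec struct {
	ConnectionPoolerEnabled        *bool `json:"connectionPoolerEnabled,omitempty"`
	ConnectionPoolerMetricsEnabled *bool `json:"connectionPoolerMetricsEnabled,omitempty"`
	PostgreSQLMetricsEnabled       *bool `json:"postgresqlMetricsEnabled,omitempty"`
}

// isTrue treats a nil pointer as false (assumption: unset means disabled).
func isTrue(b *bool) bool { return b != nil && *b }

// validate mirrors the CEL rule above in plain Go.
func validate(s MetricsSpec) error {
	if isTrue(s.ConnectionPoolerMetricsEnabled) && !isTrue(s.ConnectionPoolerEnabled) {
		return fmt.Errorf("connection pooler metrics require the connection pooler to be enabled")
	}
	return nil
}

func main() {
	on, off := true, false
	fmt.Println(validate(MetricsSpec{ConnectionPoolerEnabled: &off, ConnectionPoolerMetricsEnabled: &on}))
	fmt.Println(validate(MetricsSpec{ConnectionPoolerEnabled: &on, ConnectionPoolerMetricsEnabled: &on}))
}
```

If the pooler flag defaults to enabled in the real API, the rule would need to treat an absent field as true instead.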
    // Can be overridden in PostgresCluster CR.
    // +kubebuilder:default={}
    // +optional
    Observability *PostgresObservabilityClassConfig `json:"observability,omitempty"`
Similar to previous comment :-)
}

func isConnectionPoolerMetricsEnabled(cluster *enterprisev4.PostgresCluster, class *enterprisev4.PostgresClusterClass) bool {
    if !isConnectionPoolerEnabled(cluster, class) {
I believe this check shouldn't be part of this function.
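A possible shape for the separation suggested above: each helper resolves only its own flag, and the caller composes them. This is a sketch under assumptions; the helper names and the plain `bool`/`*bool` parameters are hypothetical simplifications of the class and cluster types in the PR, and only the `override == nil || !*override` expression is taken from the diff.

```go
package main

import "fmt"

// enabledUnlessDisabled implements a disable-only override: a nil override
// or an explicit false leaves the feature on; only true turns it off.
func enabledUnlessDisabled(override *bool) bool {
	return override == nil || !*override
}

// poolerEnabled resolves only the connection pooler flag.
func poolerEnabled(classEnabled bool, disableOverride *bool) bool {
	return classEnabled && enabledUnlessDisabled(disableOverride)
}

// poolerMetricsEnabled resolves only the metrics flag and does not look
// at whether the pooler itself is enabled.
func poolerMetricsEnabled(classEnabled bool, disableOverride *bool) bool {
	return classEnabled && enabledUnlessDisabled(disableOverride)
}

// shouldReconcilePoolerMetrics is where the two decisions are combined,
// keeping the dependency visible at the call site.
func shouldReconcilePoolerMetrics(pooler, metrics bool) bool {
	return pooler && metrics
}

func main() {
	disable := true
	pooler := poolerEnabled(true, &disable)      // pooler disabled by the cluster
	metrics := poolerMetricsEnabled(true, nil)   // metrics left at class default
	fmt.Println(pooler, metrics, shouldReconcilePoolerMetrics(pooler, metrics))
}
```

With this split, the metrics helper can be unit-tested without setting up connection pooler configuration.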
    return override == nil || !*override
}

func isConnectionPoolerEnabled(cluster *enterprisev4.PostgresCluster, class *enterprisev4.PostgresClusterClass) bool {
Should this function be part of the connection pooler code rather than monitoring?
    return override == nil || !*override
}

func buildPostgreSQLMetricsService(scheme *runtime.Scheme, cluster *enterprisev4.PostgresCluster) (*corev1.Service, error) {
Out of curiosity, why do we need to create a Kubernetes Service to expose this information? A Service is effectively a load balancer that uses round robin. If we have many Postgres instances, every call to that endpoint can fetch metrics from a different instance, and those metrics can differ depending on how users are connected. Is my understanding correct?
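As background for this question: a ServiceMonitor selects a Service, but Prometheus then discovers the endpoints behind that Service and scrapes each pod address directly rather than going through the Service's virtual IP, so each instance ends up with its own series. The sketch below only illustrates that expansion; the `Endpoint` type, the port 9187, and the `/metrics` path are assumptions for illustration, not values taken from this PR.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Endpoint is a simplified stand-in for one ready pod address behind a
// Service (hypothetical type for illustration).
type Endpoint struct {
	Pod string
	IP  string
}

// scrapeTargets returns one scrape URL per endpoint, which is how
// endpoint-based discovery avoids load-balanced scrapes.
func scrapeTargets(eps []Endpoint, port int, path string) []string {
	targets := make([]string, 0, len(eps))
	for _, ep := range eps {
		targets = append(targets, "http://"+net.JoinHostPort(ep.IP, strconv.Itoa(port))+path)
	}
	return targets
}

func main() {
	eps := []Endpoint{
		{Pod: "pg-0", IP: "10.0.0.11"},
		{Pod: "pg-1", IP: "10.0.0.12"},
	}
	// Port and path are assumed values.
	for _, t := range scrapeTargets(eps, 9187, "/metrics") {
		fmt.Println(t)
	}
}
```

Under that model the Service mainly acts as a selector and port definition for discovery, not as the path the scrape traffic takes.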
    return fmt.Errorf("building PostgreSQL metrics Service: %w", err)
}

live := &corev1.Service{
Why do we need this? Can't we use desired directly?
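One common reason for a separate `live` object, assuming the reconciler uses a create-or-update helper in the style of controller-runtime's `controllerutil.CreateOrUpdate`: the helper reads the current object into the value it is given, which would overwrite any desired fields already set on it, so the desired fields are copied inside the mutate callback. The sketch below imitates that flow with an in-memory map; `Service`, `store`, and `createOrUpdate` are simplified, hypothetical stand-ins.

```go
package main

import "fmt"

// Service is a simplified stand-in for corev1.Service.
type Service struct {
	Name      string
	ClusterIP string // assigned by the "server", never by the operator
	Ports     []int
}

// store is a toy in-memory API server keyed by object name.
type store map[string]Service

// createOrUpdate imitates the create-or-update flow: load the current
// object into live (overwriting it), apply mutate, then write it back.
func createOrUpdate(s store, live *Service, mutate func()) string {
	existing, found := s[live.Name]
	if found {
		*live = existing // server state replaces whatever live held
	}
	mutate() // desired fields are applied on top of server state
	if !found {
		live.ClusterIP = "10.96.0.10" // server-assigned on create (made-up value)
		s[live.Name] = *live
		return "created"
	}
	s[live.Name] = *live
	return "updated"
}

func main() {
	s := store{}
	desired := Service{Name: "pg-metrics", Ports: []int{9187}}

	live := &Service{Name: desired.Name}
	fmt.Println(createOrUpdate(s, live, func() { live.Ports = desired.Ports }))

	// A later pass with a changed port keeps the server-assigned ClusterIP.
	desired.Ports = []int{9188}
	live = &Service{Name: desired.Name}
	fmt.Println(createOrUpdate(s, live, func() { live.Ports = desired.Ports }))
	fmt.Println(s["pg-metrics"].ClusterIP, s["pg-metrics"].Ports)
}
```

Passing `desired` itself would lose its fields on the read step, and writing it back verbatim would drop server-populated fields such as the cluster IP.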
Description
Adds PostgreSQL observability telemetry exposure for PostgresCluster with operator-managed metrics Services and Prometheus ServiceMonitors for PostgreSQL and PgBouncer.
Key Changes
api/v4/postgresclusterclass_types.go
  Added class-level observability configuration for PostgreSQL and PgBouncer metrics.
api/v4/postgrescluster_types.go
  Added cluster-level disable-only observability overrides.
pkg/postgresql/cluster/core/cluster.go
  Wired PostgreSQL and PgBouncer metrics Service and ServiceMonitor reconciliation into the PostgresCluster flow.
  Made ServiceMonitor presence required by failing reconciliation when the CRD is unavailable.
pkg/postgresql/cluster/core/monitoring.go
  Added feature resolution helpers.
  Added builders and reconcilers for PostgreSQL/PgBouncer metrics Services.
  Added builders and reconcilers for PostgreSQL/PgBouncer ServiceMonitors.
internal/controller/postgrescluster_controller.go
  Added RBAC for monitoring.coreos.com/servicemonitors.
cmd/main.go
  Registered Prometheus Operator monitoring/v1 types in the manager scheme.
internal/controller/suite_test.go
  Registered Prometheus Operator monitoring/v1 types in the test scheme.
pkg/postgresql/cluster/core/monitoring_unit_test.go
  Added unit tests for observability flag resolution and monitoring resource builders.
Testing and Verification
Added unit tests in pkg/postgresql/cluster/core/monitoring_unit_test.go for:
  Service builders
  ServiceMonitor builders
Related Issues
CPI-1853 - related JIRA ticket.
PR Checklist