Conversation
Serialize W3C trace context (traceparent, baggage) into a nullable JSONField on tasks at enqueue time, then extract and activate it as a parent span at execution time. This links task processor spans back to the originating HTTP request trace. - Add `trace_context` JSONField to AbstractBaseTask (migration 0014) - Inject via `propagate.inject()` in `TaskHandler.delay()` - Extract via `propagate.extract()` + span in `_run_task()` - Derive default service.name from entrypoint (flagsmith-task-processor) - Add `opentelemetry-api` to task-processor extras - Promote `span_exporter` fixture to root conftest Closes #184 beep boop
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #197 +/- ##
==========================================
+ Coverage 96.95% 97.05% +0.09%
==========================================
Files 101 102 +1
Lines 4008 4139 +131
==========================================
+ Hits 3886 4017 +131
Misses 122 122 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
emyller
left a comment
There was a problem hiding this comment.
Approving because it looks great, my comments are not blocking.
There is one coverage failure I'm suspicious about, but other tests add enough confidence.
| default_service_name = ( | ||
| "flagsmith-task-processor" | ||
| if "task-processor" in sys.argv | ||
| else "flagsmith-api" | ||
| ) |
There was a problem hiding this comment.
thought: Should we just set OTEL_SERVICE_NAME in container settings? I'm tempted to the smaller effort this allows, but it got my eyebrow raised.
| def test_ensure_cli_env__task_processor__expected_otel_service_name( | ||
| monkeypatch: pytest.MonkeyPatch, | ||
| mocker: MockerFixture, | ||
| ) -> None: | ||
| # Given | ||
| monkeypatch.setattr("sys.argv", ["flagsmith", "task-processor"]) | ||
| monkeypatch.setenv("OTEL_EXPORTER_OTLP_ENDPOINT", "http://collector:4318") | ||
|
|
||
| mock_build_log = mocker.patch( | ||
| "common.core.otel.build_otel_log_provider", | ||
| return_value=mocker.MagicMock(spec=LoggerProvider), | ||
| ) | ||
| mock_build_tracer = mocker.patch( | ||
| "common.core.otel.build_tracer_provider", | ||
| return_value=mocker.MagicMock(spec=TracerProvider), | ||
| ) | ||
| mocker.patch("common.core.otel.setup_tracing") | ||
|
|
||
| # When | ||
| with ensure_cli_env(): | ||
| pass | ||
|
|
||
| # Then | ||
| mock_build_log.assert_called_once_with( | ||
| endpoint="http://collector:4318/v1/logs", | ||
| service_name="flagsmith-task-processor", | ||
| ) | ||
| mock_build_tracer.assert_called_once_with( | ||
| endpoint="http://collector:4318/v1/traces", | ||
| service_name="flagsmith-task-processor", | ||
| ) |
There was a problem hiding this comment.
Not sure how the "flagsmith-api" case isn't caught by tests coverage.
Please, either:
- Fix coverage;
- Consider deleting
default_service_name, and this test — see here. IMOtest_ensure_cli_env__env_service_name__expected_otel_service_namelooks sufficient.
| task_spans = [s for s in spans if s.name == dummy_task.task_identifier] | ||
| assert len(task_spans) == 1 | ||
|
|
||
| task_span = task_spans[0] |
There was a problem hiding this comment.
| task_span = task_spans[0] | |
| task_span = task_spans[0] | |
| assert task_span.status.status_code == StatusCode.OK |
| assert len(task_spans) == 1 | ||
|
|
||
| task_span = task_spans[0] | ||
| assert task_span.attributes is not None |
There was a problem hiding this comment.
nit
| assert task_span.attributes is not None |
|
|
||
| task_span = task_spans[0] | ||
| assert task_span.status.status_code == StatusCode.ERROR | ||
| assert task_span.attributes is not None |
There was a problem hiding this comment.
nit
| assert task_span.attributes is not None |
| | --------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------- | | ||
| | `OTEL_EXPORTER_OTLP_ENDPOINT` | Base OTLP endpoint (e.g. `http://collector:4318`). If unset, no OTel setup occurs. | _(disabled)_ | | ||
| | `OTEL_SERVICE_NAME` | The `service.name` resource attribute. | `flagsmith-api` | | ||
| | `OTEL_SERVICE_NAME` | The `service.name` resource attribute. Defaults to `flagsmith-task-processor` when running the task processor. | `flagsmith-api` | |
Closes #184.
trace_contextJSONField on tasks at enqueue time viapropagate.inject()inTaskHandler.delay()_run_task(), linking task execution spans back to the originating HTTP request traceservice.namefrom entrypoint:flagsmith-task-processorwhen running the task processor,flagsmith-apiotherwise (OTEL_SERVICE_NAMEoverrides both)opentelemetry-apitotask-processorextras,TraceContexttype alias, and span attributes matching Prometheus metric labels for cross-signal correlationHow to review
Review complexity: 2/5
Start with the model layer (
models.py,types.py, migration) — a newtrace_contextnullable JSONField. Then follow the write path indecorators.py. Next, the read path inprocessor.py. Finally,main.pyhas one conditional for the defaultservice.namebased onsys.argv.Tests mirror the above order: model, decorator, processor, main. The processor tests use static
traceparentstrings with known hex IDs for parametrisation.span_exporterfixture was promoted from integration test to root conftest, andRunTasksFixtureprotocol was updated with a defaultnum_tasks.Test plan
task_identifier,task_type,result) are presentservice.nameshowsflagsmith-task-processorfor task processor spans andflagsmith-apifor API spanstrace_id/span_id