feat(cloudflare): Instrument async KV API#19404
Conversation
Codecov Results 📊Generated by Codecov Action |
size-limit report 📦
|
04b7bbb to
0eb536e
Compare
| const currentClient = scope.getClient(); | ||
| if (!currentClient) { | ||
| const client = init({ ...wrapperOptions.options, ctx: context }); | ||
| const client = init({ ...wrapperOptions.options, ctx: context as unknown as ExecutionContext | undefined }); |
There was a problem hiding this comment.
Bug: The flush lock mechanism is ineffective for Durable Objects because it relies on waitUntil, which is a no-op in that context, potentially causing premature flushes.
Severity: MEDIUM
Suggested Fix
Detect if the context is a DurableObjectState and either use a different mechanism to await async operations (like blockConcurrencyWhile) or rely solely on the client's internal span tracking for flushing within Durable Objects. The current ineffective implementation should be corrected or documented as a known limitation.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: packages/cloudflare/src/wrapMethodWithSentry.ts#L73
Potential issue: The `makeFlushLock` mechanism is initialized with a context that, in
the case of a Cloudflare Durable Object, has a `waitUntil` method that is a no-op. The
lock relies on `waitUntil` to track the completion of asynchronous tasks. Since
`DurableObjectState.waitUntil` has no effect on the request lifetime, the flush lock
will not correctly wait for all async work to finish. This can lead to the `flush()`
operation resolving prematurely, potentially causing loss of telemetry data if the
Durable Object's process terminates before the SDK has finished sending all events.
Did we get this right? 👍 / 👎 to inform future reviews.
andreiborza
left a comment
There was a problem hiding this comment.
Overall LGTM, just one idea around caching.
| return function (this: unknown, ...args: unknown[]) { | ||
| return startSpan( | ||
| { | ||
| // Use underscore naming to match Cloudflare's native instrumentation (e.g., "durable_object_storage_get") | ||
| name: `durable_object_storage_${methodName}`, | ||
| op: 'db', | ||
| attributes: { | ||
| [SEMANTIC_ATTRIBUTE_SENTRY_ORIGIN]: 'auto.db.cloudflare.durable_object', | ||
| 'db.system.name': 'cloudflare.durable_object.storage', | ||
| 'db.operation.name': methodName, | ||
| }, | ||
| }, | ||
| () => { | ||
| return (original as (...args: unknown[]) => unknown).apply(target, args); | ||
| }, | ||
| ); | ||
| }; |
There was a problem hiding this comment.
What do you think about caching this function and retrieving from cache? The underlying storage functions should never really change, and I think especially in a KV-storage we should consider perf implications.
There was a problem hiding this comment.
I created a task for this: #19451
I want to go deeper a little before making this decision
closes #19384
closes JS-1744
With that we start to instrument DO objects starting with the Async KV API.
Cloudflare is instrumenting these with underlines between:
durable_object_storage_get, without any more information to it.In the future to make them a little more useful we could store the keys as span attributes on it with
db.cloudflare.durable_object.storage.keyordb.cloudflare.durable_object.storage.keys. First we have to add them to our semantic conventions though