doubleKnockInternal() propagates transport-level TypeError from fetch() without wrapping or retry #762

@dahlia

Summary

The fetch() call inside doubleKnockInternal (packages/fedify/src/sig/http.ts:1650) is unguarded, so any transport-level failure—TLS teardown errors, ECONNRESET, DNS hiccups—propagates out as a raw TypeError. It escapes doubleKnock, the authenticated document loader, key-owner fetching, and signature verification without ever being wrapped in FetchError or retried, even though the double-knock fallback exists precisely to handle flaky federation peers.

Reproduction in the wild

Hackers' Pub (Fedify 2.1.5 on Deno 2.7.4) sees this regularly from older Hubzilla/Friendica peers whose TLS stack closes connections without close_notify. Sample event:

TypeError: error sending request from 192.168.107.2:49112 for https://im.allmendenetz.de/channel/chris (178.18.246.19:443):
client error (SendRequest): connection error: peer closed connection without sending TLS close_notify:
https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof

    at fetch (ext:deno_fetch/26_fetch.js:475:11)
    at doubleKnockInternal (packages/fedify/src/sig/http.ts:1650:18)
    at doubleKnock (packages/fedify/src/sig/http.ts:1618:10)
    at load (packages/fedify/src/utils/docloader.ts:84:22)
    at CryptographicKey.#fetchOwner
    at CryptographicKey.getOwner
    at getKeyOwner (packages/fedify/src/sig/owner.ts:181:13)
    at RequestContextImpl.getSignedKeyOwner (packages/fedify/src/federation/middleware.ts:2758:37)

Breadcrumbs make it clear the failure is transient:

13:53:23  GET https://im.allmendenetz.de/channel/chris → 200
13:58:11  GET https://im.allmendenetz.de/channel/chris → TypeError (TLS close_notify)

Same URL, same signer, same actor; Deno's rustls-backed HTTP client is just strict about the unexpected-EOF case that older ActivityPub servers routinely produce.

Why the current code path leaks this

let response = await fetch(signedRequest, {
  // Since Bun has a bug that ignores the `Request.redirect` option,
  // to work around it we specify `redirect: "manual"` here too:
  // https://github.com/oven-sh/bun/issues/10754
  redirect: "manual",
  signal,
});

There is no try/catch around this call. The existing double-knock fallback only triggers on response-level failure (status 400/401/≥401), so transport-level errors bypass it entirely. The two recursive call sites (http.ts lines 1676 and 1817) have the same shape. Downstream callers therefore receive an untyped TypeError instead of FetchError, which means:

  • They can't distinguish “the network blew up” from “the peer answered but rejected our signature.”
  • They can't safely degrade (e.g., treat the request as unverified, or skip the actor) without string-matching error messages.
  • A single TLS hiccup fails signature verification for the whole inbound request, even though a retry a few hundred milliseconds later would almost certainly succeed.
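To illustrate the second point, here is a minimal sketch of how a caller could degrade once transport failures arrive as a typed error. SketchFetchError is a hypothetical stand-in (Fedify's real FetchError may have a different constructor), and resolveKeyOwner and KeyOwnerOutcome are invented names for illustration only, not Fedify APIs.

```typescript
// Hypothetical stand-in for Fedify's FetchError; the real class may differ.
class SketchFetchError extends Error {
  readonly url: URL;
  constructor(url: URL, message: string, options?: { cause?: unknown }) {
    super(message, options);
    this.name = "SketchFetchError";
    this.url = url;
  }
}

// Illustrative result type: either the owner was loaded, or the peer
// could not be reached at the transport level.
type KeyOwnerOutcome =
  | { kind: "verified"; owner: string }
  | { kind: "unreachable"; cause: unknown };

// `load` stands in for whatever fetches the key owner (an assumption here).
async function resolveKeyOwner(
  load: () => Promise<string>,
): Promise<KeyOwnerOutcome> {
  try {
    return { kind: "verified", owner: await load() };
  } catch (error) {
    // Branch on the error class instead of string-matching the message.
    if (error instanceof SketchFetchError) {
      return { kind: "unreachable", cause: error.cause };
    }
    // Anything else is not a transport failure; let it propagate.
    throw error;
  }
}
```

With a raw TypeError the same branch would have to inspect the message text, which differs between Deno, Node.js, and Bun.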

Proposed fix

  1. Wrap each fetch() in doubleKnockInternal so transport-level errors are rethrown as FetchError (preserving the original via Error.cause). This gives getDocumentLoader, getKeyOwner, and the middleware's signed-key-owner path a typed error to react to.

  2. Add a small bounded retry inside doubleKnockInternal for transport-level errors on idempotent requests (GET/HEAD). A single retry with short backoff is enough to absorb the close_notify/ECONNRESET class of flakes that dominate this report. Non-idempotent requests (signed POST to an inbox) should not be retried here; the queue handles those.
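The two steps above could be combined roughly as follows. This is a sketch under stated assumptions, not the actual patch: SketchFetchError is a hypothetical stand-in for Fedify's FetchError, fetchWithTransportGuard and its parameters are invented names, and the 200 ms default backoff is an arbitrary choice. The real call also passes `redirect: "manual"` and `signal`, which are omitted here for brevity.

```typescript
// Hypothetical stand-in for Fedify's FetchError; the real class may differ.
class SketchFetchError extends Error {
  readonly url: URL;
  constructor(url: URL, message: string, options?: { cause?: unknown }) {
    super(message, options);
    this.name = "SketchFetchError";
    this.url = url;
  }
}

type FetchLike = (request: Request) => Promise<Response>;

// Only idempotent methods are retried; signed POSTs are left to the queue.
const IDEMPOTENT_METHODS = new Set(["GET", "HEAD"]);

async function fetchWithTransportGuard(
  request: Request,
  fetchImpl: FetchLike = fetch,
  retries = 1,
  backoffMs = 200,
): Promise<Response> {
  const maxAttempts = IDEMPOTENT_METHODS.has(request.method) ? retries + 1 : 1;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Clone so that a later attempt still has an unconsumed request.
      return await fetchImpl(request.clone());
    } catch (error) {
      // Assumption: a deliberate abort should surface unchanged, unretried.
      if (error instanceof Error && error.name === "AbortError") throw error;
      lastError = error;
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, backoffMs));
      }
    }
  }
  // All attempts failed at the transport level: rethrow as a typed error,
  // keeping the original error reachable through `cause`.
  throw new SketchFetchError(
    new URL(request.url),
    `Transport-level failure while fetching ${request.url}`,
    { cause: lastError },
  );
}
```

Response-level failures (any status code) still return normally from this helper, so the existing double-knock logic would keep handling them as it does today.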

Environment

  • @fedify/fedify 2.1.5
  • Deno 2.7.4 (Linux aarch64)
  • Reproduced against im.allmendenetz.de (Hubzilla)

Labels

component/signatures (OIP or HTTP/LD Signatures related), type/bug (Something isn't working)
