
[fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable#26025

Open
lhotari wants to merge 1 commit into apache:master from lhotari:lh-policyCacheInitMap-timeout-fix

@lhotari (Member) commented Jun 13, 2026

Fixes #25294

Motivation

When a topic is loaded, the broker first waits for the namespace's topic policies cache to be
initialized. Initialization (SystemTopicBasedTopicPoliciesService#initPolicesCache) reads the
namespace's __change_events system topic to the end and completes a shared, per-namespace
future (policyCacheInitMap) that every topic load in the namespace awaits.

That shared future could be left pending forever, leaving every topic in the namespace stuck and
unloadable until the broker was restarted (issue #25294), in two distinct ways:

  1. No timeout on the read loop. If the __change_events reader reconnects but then stops making
    progress (e.g. after __change_events is unloaded/moved and the reconnected reader gets stuck;
    see the compacted-read stuck-reader bug fixed in #25998, "[fix][broker] Fix compacted read could
    be stuck forever or message loss due to cursor mark delete"), the read loop never finishes and the
    future stays pending. The 60s topicLoadTimeoutSeconds only fails the individual topic-load
    future; it does not clear the poisoned policyCacheInitMap entry or close the stuck reader, so the
    namespace stays poisoned and every later load times out the same way.
  2. Cleanup paths removed the future without completing it. Several paths drop the entry from
    policyCacheInitMap but never complete the future, most importantly the namespace-bundle unload
    path (removeOwnedNamespaceBundleAsync). They relied on the reader being closed, which was
    expected to fail the init chain and thereby complete the future indirectly; if that didn't
    happen, awaiting topic loads hung. This matches the "futures accumulate until the broker is
    restarted" symptom in the report.

This is defense-in-depth that is complementary to #25998 (which fixes one concrete stuck-reader
trigger on the broker/cursor side): it guarantees the per-namespace init future is always completed,
so a single stuck/aborted initialization can no longer take a whole namespace's topics down until
restart.
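
The first failure mode can be reproduced in miniature. The sketch below is not the broker code: `PoisonedInitFutureDemo`, `initFuture`, `loadTopic` and `loadFails` are hypothetical names, and the map only stands in for policyCacheInitMap. It assumes Java 9+ for `CompletableFuture#orTimeout`. It shows that a per-load timeout fails only the derived topic-load future, while the shared per-namespace future stays pending and stays in the map.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class PoisonedInitFutureDemo {
    // Stand-in for policyCacheInitMap: one shared init future per namespace.
    static final ConcurrentHashMap<String, CompletableFuture<Void>> initMap = new ConcurrentHashMap<>();

    // Every topic load in the namespace awaits the same future.
    static CompletableFuture<Void> initFuture(String ns) {
        return initMap.computeIfAbsent(ns, k -> new CompletableFuture<>());
    }

    // Hypothetical topic load: waits for init, bounded by its own per-load timeout.
    static CompletableFuture<Void> loadTopic(String ns, long timeoutMillis) {
        return initFuture(ns)
                .thenRun(() -> { }) // derived future, not the shared one
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Returns true if the load failed (here: by timing out).
    static boolean loadFails(String ns, long timeoutMillis) {
        try {
            loadTopic(ns, timeoutMillis).get();
            return false;
        } catch (Exception e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String ns = "tenant/ns";
        // The init future is created but never completed, as with a stuck reader.
        CompletableFuture<Void> shared = initFuture(ns);
        System.out.println("first load fails: " + loadFails(ns, 100));
        System.out.println("shared future done: " + shared.isDone());
        System.out.println("second load fails: " + loadFails(ns, 100));
    }
}
```

Both loads fail while the shared future never completes, which mirrors why every later load in the namespace times out the same way until something completes or replaces the entry.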

Modifications

  • Add topicPoliciesCacheInitTimeoutSeconds (default 60, dynamic). It bounds topic policies cache
    initialization for a namespace. Set it to 0 or a negative value to disable the bound (the
    previous unbounded behavior).
  • prepareInitPoliciesCacheAsync schedules a timeout that fails the init future. On timeout, an
    identity-guarded cleanup (cleanupAfterPolicyCacheInitTimeout) clears the cached state and
    closes the stuck reader only when the timed-out future is still the current one: it captures
    the reader before the gate and uses policyCacheInitMap.remove(ns, future) /
    readerCaches.remove(ns, reader), so a concurrent retry (or an unload that already replaced the
    entry) is never clobbered. A new pulsar.broker.topic.policies.cache.init.timeout.count
    OpenTelemetry counter records these events. The timeout task is cancelled as soon as initialization
    completes, so it adds no overhead on the normal path.
  • cleanPoliciesCacheInitMap and close() now complete any pending init future they drop
    (exceptionally), so awaiting topic loads fail fast and retry with a fresh reader instead of hanging.
    The completion is done outside the ConcurrentHashMap#compute remapping function: completing the
    future can run the awaiting topic-load callbacks synchronously, and doing that while holding the
    bin lock risks a recursive map update or deadlock (the hazard addressed in #24977, "[Bug] [broker]
    Concurrent error in SystemTopicBasedTopicPoliciesService#prepareInitPoliciesCacheAsync").
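
The timeout with identity-guarded cleanup can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `InitTimeoutSketch`, `Reader`, `prepareInit`, `cleanupAfterTimeout` and `timeoutCount` are hypothetical stand-ins, and the real ordering of cleanup versus failing the future may differ. The point it illustrates is that `ConcurrentHashMap#remove(key, value)` removes an entry only if the map still holds that exact object, so a retry that already replaced the entry is left alone.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class InitTimeoutSketch {
    // Hypothetical stand-in for the __change_events reader.
    static final class Reader {
        volatile boolean closed;
        void close() { closed = true; }
    }

    final ConcurrentHashMap<String, CompletableFuture<Void>> policyCacheInitMap = new ConcurrentHashMap<>();
    final ConcurrentHashMap<String, Reader> readerCaches = new ConcurrentHashMap<>();
    final AtomicInteger timeoutCount = new AtomicInteger(); // stand-in for the new counter
    final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "policy-init-timeout");
        t.setDaemon(true); // do not keep the JVM alive in this sketch
        return t;
    });

    CompletableFuture<Void> prepareInit(String ns, Reader reader, long timeoutMillis) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        policyCacheInitMap.put(ns, future);
        readerCaches.put(ns, reader);
        if (timeoutMillis > 0) { // 0 or negative disables the bound
            ScheduledFuture<?> task = scheduler.schedule(() -> {
                if (future.isDone()) {
                    return; // initialization finished just before the timer fired
                }
                timeoutCount.incrementAndGet();
                cleanupAfterTimeout(ns, future, reader);
                future.completeExceptionally(
                        new TimeoutException("topic policies cache init timed out: " + ns));
            }, timeoutMillis, TimeUnit.MILLISECONDS);
            // Cancel the timer as soon as the future completes for any reason.
            future.whenComplete((v, e) -> task.cancel(false));
        }
        return future;
    }

    void cleanupAfterTimeout(String ns, CompletableFuture<Void> timedOut, Reader captured) {
        // remove(key, value) is a no-op unless the map still holds this exact entry.
        if (policyCacheInitMap.remove(ns, timedOut) && readerCaches.remove(ns, captured)) {
            captured.close();
        }
    }

    public static void main(String[] args) {
        InitTimeoutSketch s = new InitTimeoutSketch();
        Reader stuck = new Reader();
        CompletableFuture<Void> init = s.prepareInit("tenant/ns", stuck, 100);
        try {
            init.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            System.out.println("init failed with: " + e.getCause());
        }
        System.out.println("entry cleared: " + !s.policyCacheInitMap.containsKey("tenant/ns"));
        System.out.println("reader closed: " + stuck.closed);
    }
}
```

In this sketch the cleanup runs before the future is failed, so a waiter that wakes up on the failure already sees the cleared map and can start a fresh initialization.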

Verifying this change

This change added tests and can be verified as follows:

  • ...TopicPoliciesServiceTest#testPrepareInitPoliciesCacheAsyncTimesOutWhenReaderStuck: spies the
    __change_events reader so it reports more events but never delivers one (a stuck reader), then
    asserts prepareInitPoliciesCacheAsync fails with a TimeoutException instead of hanging, that the
    poisoned policyCacheInitMap entry is cleared, and that the stuck reader is closed. Verified red
    without the fix and green with it.
  • ...TopicPoliciesServiceTest#testCleanPoliciesCacheInitMapCompletesPendingInitFuture: asserts that
    dropping a pending init future (both the reader-close and non-reader-close branches) completes it
    exceptionally and removes it from the map, and that an already-completed future is left untouched.
  • The full SystemTopicBasedTopicPoliciesServiceTest suite passes, including the existing
    init/cleanup tests (cleanup call counts and behavior unchanged on the normal path).
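
The behavior the second test checks, and the reason the completion happens outside the remapping function, can be illustrated with a small standalone sketch. It is not the PR's implementation: `CompleteOutsideComputeSketch` and `dropAndFail` are hypothetical names, and the exception type is an arbitrary choice. The remapping function only records the dropped future; it is completed after `compute` returns, so a waiter callback that touches the same map key does not run inside the map's lock.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

public class CompleteOutsideComputeSketch {
    final ConcurrentHashMap<String, CompletableFuture<Void>> policyCacheInitMap = new ConcurrentHashMap<>();

    // Drops the namespace entry and fails the dropped future if it was still pending.
    // Returns true only when a pending future was completed exceptionally here.
    boolean dropAndFail(String ns) {
        AtomicReference<CompletableFuture<Void>> dropped = new AtomicReference<>();
        policyCacheInitMap.compute(ns, (k, existing) -> {
            dropped.set(existing); // only record it; do not complete inside compute
            return null;           // returning null removes the mapping
        });
        CompletableFuture<Void> future = dropped.get();
        // completeExceptionally returns false for an already-completed future,
        // which is therefore left untouched.
        return future != null
                && future.completeExceptionally(new IllegalStateException("init aborted: " + ns));
    }

    public static void main(String[] args) {
        CompleteOutsideComputeSketch s = new CompleteOutsideComputeSketch();
        String ns = "tenant/ns";
        CompletableFuture<Void> pending = new CompletableFuture<>();
        s.policyCacheInitMap.put(ns, pending);
        // A waiter that retries by registering a fresh future for the same key.
        // This callback runs synchronously when the pending future is completed.
        pending.whenComplete((v, e) ->
                s.policyCacheInitMap.computeIfAbsent(ns, k -> new CompletableFuture<>()));
        System.out.println("failed pending future: " + s.dropAndFail(ns));
        System.out.println("retry registered a fresh future: "
                + (s.policyCacheInitMap.get(ns) != pending));
    }
}
```

Had the future been completed inside the remapping function, the retry callback would have called `computeIfAbsent` on the same key while `compute` was still in progress, which is the recursive-update hazard described above.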

Does this pull request potentially affect one of the following parts:

  • The default values of configurations (new topicPoliciesCacheInitTimeoutSeconds, default 60s; topic policies cache initialization is now bounded by default instead of unbounded)
  • The metrics (new counter pulsar.broker.topic.policies.cache.init.timeout.count)

@lhotari lhotari force-pushed the lh-policyCacheInitMap-timeout-fix branch 2 times, most recently from c94b39b to 93fa3ba Compare June 13, 2026 17:03
@lhotari lhotari changed the title from "[fix][broker] Time out topic policies cache initialization when the __change_events reader is stuck" to "[fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable" Jun 13, 2026
…make a namespace's topics unloadable

Topic loading waits for the namespace's topic policies cache to be initialized by reading the
__change_events system topic to the end (SystemTopicBasedTopicPoliciesService#initPolicesCache),
which completes a shared per-namespace future in policyCacheInitMap that every topic load awaits.
That future could be left pending forever, leaving every topic in the namespace stuck and
unloadable until the broker was restarted (issue apache#25294), in two ways:

1. The read loop had no timeout: a system-topic reader that reconnected but stopped making progress
   pinned the future indefinitely.
2. Several cleanup paths removed the future from policyCacheInitMap without ever completing it (most
   importantly the namespace-bundle unload path, removeOwnedNamespaceBundleAsync). They relied on the
   reader being closed, which was expected to fail the init chain and thereby complete the future
   indirectly.

Modifications:
- Add topicPoliciesCacheInitTimeoutSeconds (default 60s, dynamic). prepareInitPoliciesCacheAsync now
  schedules a timeout that fails the init future, and via an identity-guarded cleanup
  (cleanupAfterPolicyCacheInitTimeout) clears the cached state and closes the stuck reader only when
  the timed-out future is still the current one, so a concurrent retry/unload is never clobbered. A
  new metric pulsar.broker.topic.policies.cache.init.timeout.count counts these events.
- cleanPoliciesCacheInitMap and close() now complete any pending init future they drop
  (exceptionally), so awaiting topic loads fail fast and retry instead of hanging. Completion happens
  outside the ConcurrentHashMap compute() remapping to avoid a recursive update / deadlock (apache#24977).

Assisted-by: Claude Code (Opus 4.8)
@lhotari lhotari force-pushed the lh-policyCacheInitMap-timeout-fix branch from 93fa3ba to 33f4ddd Compare June 13, 2026 17:23
@lhotari lhotari requested a review from BewareMyPower June 13, 2026 22:12
Development

Successfully merging this pull request may close these issues.

[Bug] topic unavailable because topic policy cache loading reader is stuck

2 participants