Skip to content

htlcswitch: track on-chain held HTLCs#10895

Open
ziggie1984 wants to merge 3 commits into
lightningnetwork:masterfrom
ziggie1984:fix-onchain-interceptor-held-htlc
Open

htlcswitch: track on-chain held HTLCs#10895
ziggie1984 wants to merge 3 commits into
lightningnetwork:masterfrom
ziggie1984:fix-onchain-interceptor-held-htlc

Conversation

@ziggie1984

Copy link
Copy Markdown
Collaborator

Change Description

This fixes the forward interceptor handling for HTLCs that move from the
off-chain forwarding flow to the on-chain contractcourt flow after the
incoming channel force closes.

Previously, the held HTLC set stored every intercepted HTLC as a plain
InterceptedForward. That made off-chain and on-chain forwards share the
same auto-fail, resolve and release behavior, even though they have different
valid actions:

  • off-chain held HTLCs can be resumed, failed, settled and auto-failed;
  • on-chain held HTLCs can only be settled by delivering the preimage to the
    witness beacon.

That caused two problematic cases:

  1. After restart, an on-chain interceptor entry could be treated like a normal
    auto-failable forward and removed before the interceptor settled it.
  2. Without restart, the original off-chain held entry remained in memory. When
    contractcourt re-offered the same circuit on-chain, the held set treated it
    as a duplicate and kept the old off-chain backend. A later SETTLE then
    went to the link mailbox path instead of the witness beacon, so the incoming
    contest resolver never received the preimage.

This PR makes the held set track the source of each held HTLC:

  • offChainHeld preserves the existing off-chain behavior and auto-fails by
    failing back through the link;
  • onChainHeld is settle-only and expires by pruning local interceptor state;
  • an on-chain re-offer for a circuit that is currently held off-chain replaces
    the stored entry so future settlement reaches the witness beacon;
  • resolution now removes an entry only after the selected action succeeds.

The on-chain interceptor packet now also carries RefundTimeout as its
deadline. This keeps the existing public interceptor field populated while the
held set ensures only off-chain entries interpret that value as an auto-fail
height.

Steps to Test

go test ./htlcswitch -run HeldHtlcSet
go test -tags dev ./htlcswitch

New itest coverage was added for:

  • settling an on-chain interceptor HTLC after Bob restarts;
  • settling an HTLC that was first held off-chain and then moves on-chain
    without restarting Bob.

@ziggie1984 ziggie1984 self-assigned this Jun 11, 2026
@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label Jun 11, 2026
@github-actions

Copy link
Copy Markdown

PR Severity: CRITICAL

Highest-severity file determines level | 4 non-test files | 467 lines changed

Critical (2 files):

  • htlcswitch/held_htlc_set.go - HTLC forwarding state machine
  • htlcswitch/interceptable_switch.go - HTLC forwarding/payment routing interceptable switch

Medium (1 file):

  • witness_beacon.go - Root-level Go file

Low (4 files - excluded from severity bump counts):

  • docs/release-notes/release-notes-0.21.1.md
  • htlcswitch/held_htlc_set_test.go
  • itest/list_on_test.go
  • itest/lnd_forward_interceptor_test.go

Analysis

This PR modifies core htlcswitch package files. Changes to held_htlc_set.go and interceptable_switch.go affect the HTLC forwarding and payment routing state machine, warranting expert review.

Bump check: 4 non-test files (threshold >20), 467 lines changed (threshold >500), 1 critical package. No bump applied.

To override, add a severity-override-{critical,high,medium,low} label.
<!-- pr-severity-bot -->

@ziggie1984 ziggie1984 added this to the v0.21.1 milestone Jun 11, 2026
@ziggie1984 ziggie1984 added this to v0.21 Jun 11, 2026
@gemini-code-assist

Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the HTLC interception mechanism to accurately manage HTLCs that transition from off-chain forwarding to on-chain resolution, particularly in scenarios involving channel force-closures. By introducing distinct handling for off-chain and on-chain held HTLCs, the changes prevent incorrect auto-failing or misdirection of settlement attempts, thereby enhancing the robustness and reliability of HTLC management within the system.

Highlights

  • Differentiated HTLC Holding: The heldHtlcSet now distinguishes between offChainHeld and onChainHeld entries, allowing for different behaviors based on whether an HTLC is in the off-chain forwarding flow or the on-chain contractcourt flow.
  • Correct On-Chain HTLC Resolution: On-chain held HTLCs are now settle-only and expire by pruning local interceptor state, preventing issues like premature auto-failing after a restart or incorrect routing of settlement messages to the link mailbox instead of the witness beacon.
  • Dynamic HTLC State Transition: If an HTLC is initially held off-chain and then re-offered on-chain (e.g., due to a channel force-close), the stored entry in the heldHtlcSet is replaced to ensure future settlement attempts correctly reach the witness beacon.
  • Robust Resolution Logic: HTLC resolution now removes an entry from the held set only after the selected action (settle, fail, resume) successfully completes, improving atomicity and reliability.
  • Integration Test Coverage: New integration tests have been added to cover critical scenarios, including settling an on-chain intercepted HTLC after a node restart and settling an HTLC that transitions from off-chain to on-chain without a restart.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the heldHtlcSet to distinguish between off-chain and on-chain held HTLCs using a new heldEntry interface, ensuring proper handling of resolutions and expirations for both flows. It also updates the InterceptableSwitch and witness_beacon to support this, and adds comprehensive unit and integration tests. A critical deadlock risk was identified in witness_beacon.go where SubscribeUpdates holds a lock while making a blocking call to the interceptor, which can conflict with the switch's event loop attempting to acquire the same lock during resolution; releasing the lock early is recommended.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread witness_beacon.go Outdated
Comment on lines +109 to +111
// Keep the on-chain intercept available to the interceptor until
// the HTLC expires on chain.
AutoFailHeight: int32(htlc.RefundTimeout),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Critical Deadlock Risk in SubscribeUpdates

There is a critical deadlock risk between the preimageBeacon locking and the InterceptableSwitch event loop.

Deadlock Scenario:

  1. Thread A (Contractcourt Resolver):

    • Calls preimageBeacon.SubscribeUpdates().
    • Acquires preimageBeacon.Lock() (line 66).
    • Reaches line 117 and calls p.interceptor(fwd) (which is InterceptableSwitch.ForwardPacket(fwd)).
    • ForwardPacket attempts to send to s.onchainIntercepted (line 455 of interceptable_switch.go). Since this channel is unbuffered, it blocks until the event loop receives from it.
  2. Thread B (InterceptableSwitch Event Loop - run() goroutine):

    • Currently processing a SETTLE resolution from s.resolutionChan (line 325 of interceptable_switch.go).
    • Calls s.resolve(res) -> heldHtlcSet.resolve(res) -> onChainHeld.resolve(res) -> fwd.Settle(res.Preimage).
    • fwd.Settle calls preimageBeacon.AddPreimages().
    • AddPreimages attempts to acquire preimageBeacon.Lock() (line 172 of witness_beacon.go).
    • Deadlock: Thread B is blocked waiting for preimageBeacon.Lock() (held by Thread A). Thread A is blocked waiting for Thread B to receive from s.onchainIntercepted.

Impact:

This deadlock will freeze the InterceptableSwitch event loop and the contractcourt resolver, preventing any further HTLC resolutions and potentially leading to a loss of funds if HTLCs expire on-chain without being swept.

Suggested Fix:

Release p.Lock() early in SubscribeUpdates as soon as the subscriber is added to the map, before calling p.interceptor(fwd).

func (p *preimageBeacon) SubscribeUpdates(
	chanID lnwire.ShortChannelID, htlc *channeldb.HTLC,
	payload *hop.Payload,
	nextHopOnionBlob []byte) (*contractcourt.WitnessSubscription, error) {

	p.Lock()
	clientID := p.clientCounter
	client := &preimageSubscriber{
		updateChan: make(chan lntypes.Preimage, 10),
		quit:       make(chan struct{}),
	}
	p.subscribers[p.clientCounter] = client
	p.clientCounter++
	p.Unlock() // Release lock early to avoid deadlocks during blocking interceptor calls

	srvrLog.Debugf("Creating new witness beacon subscriber, id=%v",
		clientID)

	sub := &contractcourt.WitnessSubscription{
		WitnessUpdates: client.updateChan,
		CancelSubscription: func() {
			p.Lock()
			defer p.Unlock()

			delete(p.subscribers, clientID)

			close(client.quit)
		},
	}
    // ... rest of the function ...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed good catch, was even a problem before this PR change

@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch 2 times, most recently from cf2e20c to 02347b0 Compare June 11, 2026 01:35
@ziggie1984

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the heldHtlcSet to track held HTLCs as either off-chain or on-chain entries using a new heldEntry interface. This change ensures that on-chain re-offers can replace old off-chain holds, allowing settlements to reach the witness beacon after an incoming channel force-closes. Additionally, integration tests are added to verify on-chain settlement scenarios, and minor fixes are made to the witness beacon subscription flow. The reviewer's feedback correctly points out a style guide violation where the newHeldHtlcSet function lacks a documentation comment.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

set map[models.CircuitKey]heldEntry
}

func newHeldHtlcSet() *heldHtlcSet {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

According to the repository style guide, every function must be commented with its purpose and assumptions, starting with the function name as a complete sentence. Please add a comment for newHeldHtlcSet.

// newHeldHtlcSet returns a new instance of heldHtlcSet.
func newHeldHtlcSet() *heldHtlcSet {
References
  1. Every function must be commented with its purpose and assumptions. Function comments must begin with the function name and should be complete sentences. (link)

Add coverage for held forwards that move on chain after the
incoming channel force closes.

The restart case exercises the path where Bob loses the in-memory
held set and contractcourt re-offers the HTLC through the witness
beacon. The no-restart case keeps the original off-chain hold and
proves that settlement must still reach the on-chain resolver.
Store held forwards as off-chain or on-chain entries instead of a
raw InterceptedForward map. Off-chain entries keep the existing
resume, fail, settle and auto-fail behavior. On-chain entries are
settle-only and expire by pruning local interceptor state.

When contractcourt re-offers a circuit that is already held
off-chain, replace the stored entry with the on-chain forward so a
later SETTLE reaches the witness beacon instead of the old link
mailbox path.

Also set the on-chain interceptor deadline to the HTLC refund
timeout. This keeps the public interceptor deadline populated while
ensuring only off-chain held entries use that value to fail back.
Add the v0.21.1 release note skeleton and include the forward
interceptor on-chain settlement fix from this PR.
@ziggie1984 ziggie1984 force-pushed the fix-onchain-interceptor-held-htlc branch from 02347b0 to 6495243 Compare June 11, 2026 01:50
@saubyk saubyk removed this from v0.21 Jun 11, 2026
@saubyk saubyk added this to lnd v0.22 Jun 11, 2026
@saubyk saubyk removed this from the v0.21.1 milestone Jun 11, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in lnd v0.22 Jun 11, 2026
@saubyk saubyk added this to the v0.22.0 milestone Jun 11, 2026
@saubyk saubyk moved this from Backlog to In progress in lnd v0.22 Jun 11, 2026
@ziggie1984 ziggie1984 modified the milestones: v0.22.0, v0.21.1 Jun 11, 2026
@saubyk saubyk added this to v0.21 Jun 11, 2026
@saubyk saubyk removed this from lnd v0.22 Jun 11, 2026
@saubyk saubyk moved this to In progress in v0.21 Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix severity-critical Requires expert review - security/consensus critical

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

2 participants