Fix peer storage size cap to account for encryption overhead#4746
Fix peer storage size cap to account for encryption overhead#4746shivv23 wants to merge 2 commits into
Conversation
send_peer_storage() caps the sum of serialized_length() at MAX_PEER_STORAGE_SIZE (65531), but the final wire message includes a 2-byte CollectionLength prefix, 16-byte AEAD tag, and 32-byte nonce seed. Without headroom, the encrypted blob can exceed BOLT lightningdevkit#1's 65531-byte limit and BOLT lightningdevkit#8's 65535-byte transport limit, causing a panic. Also fixes a latent bug in dummy_monitor where current_counterparty_commitment_number was set to 1 << 48 (2^48), which overflows the 48-bit field in be48_to_array.
|
I've assigned @joostjager as a reviewer! |
| counterparty_fulfilled_htlcs: new_hash_map(), | ||
|
|
||
| current_counterparty_commitment_number: 1 << 48, | ||
| current_counterparty_commitment_number: (1 << 48) - 1, |
There was a problem hiding this comment.
This change is in the production ChannelMonitor::new() constructor (line 1872), not in dummy_monitor() as the PR description claims. dummy_monitor() (line 6973) merely calls ChannelMonitor::new(), so this alters the initial value for every monitor created in production.
The original 1 << 48 is a deliberate sentinel (INITIAL_COMMITMENT_NUMBER + 1) meaning "the initial counterparty commitment has not been provided yet", distinct from INITIAL_COMMITMENT_NUMBER = (1 << 48) - 1 which means "at the initial counterparty commitment". This distinction is observable in is_closed_without_updates() (line 4435, == INITIAL_COMMITMENT_NUMBER).
In normal operation the value is always overwritten by provide_initial_counterparty_commitment_tx() before the monitor is ever serialized (see the invariant documented at channel.rs:6531-6535), so 1 << 48 is never actually written via be48_to_array in production — the panic only occurs because dummy_monitor() skips provide_initial_counterparty_commitment_tx().
Recommend fixing this in dummy_monitor() (e.g. by providing an initial counterparty commitment, or setting the field there) rather than weakening the documented sentinel/invariant in the production constructor. At minimum the PR description should be corrected, since this is a behavioral change to production code, not a test helper.
There was a problem hiding this comment.
@ldk-claude-review-bot thanks for the thorough review. Point by point:
1. current_counterparty_commitment_number sentinel in production constructor — Correct, this should not have been changed in ChannelMonitor::new(). Reverted back to 1 << 48 and moved the fix into dummy_monitor() instead, where it sets the field to 0 after construction. The production sentinel is now preserved.
2. Tightness of the cap — The math is intentional: data.len() = sum(serialized_length) + 2 (CollectionLength) + 16 (tag) + 32 (nonce) = sum + 50. With the cap at 65531 - 50 = 65481, the wire message is exactly 2 + 2 + 65531 = 65535 — at, not over, LN_MAX_MSG_LEN.
3. #[cfg(peer_storage)] gating — This is required because send_peer_storage() itself is #[cfg(peer_storage)], so the test must be gated identically to compile. It runs in CI when RUSTFLAGS="--cfg peer_storage" is set.
|
The latest revision resolved my previously-flagged issue. No new code bugs found. SummaryThe previously-flagged issue is resolved. In the latest commit (
The Minor (non-blocking)
No new issues found. |
…er instead of production constructor The production constructor's 1 << 48 is a sentinel meaning 'counterparty commitment not yet received'. Changing it to (1 << 48) - 1 aligns it with INITIAL_COMMITMENT_NUMBER, which is observable in is_closed_without_updates(). Instead, fix the value in dummy_monitor() after construction so the sentinel is preserved in production code.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4746 +/- ##
==========================================
+ Coverage 84.51% 86.93% +2.41%
==========================================
Files 137 161 +24
Lines 77446 111636 +34190
Branches 77446 111636 +34190
==========================================
+ Hits 65456 97053 +31597
- Misses 9949 12077 +2128
- Partials 2041 2506 +465
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
| /// in `peer_channel_encryptor`). | ||
| #[test] | ||
| #[cfg(peer_storage)] | ||
| fn test_peer_storage_blob_within_size_limit() { |
There was a problem hiding this comment.
I'd make sure this test tests right on the boundary. Currently it also passes if I manually set overhead to 0 again.
| // sentinel meaning "counterparty commitment not yet received". This value (2^48) exceeds the | ||
| // 48-bit field in be48_to_array and causes a panic on serialization. Since dummy monitors | ||
| // may be serialized in tests, reset it to a valid value. | ||
| monitor.inner.lock().unwrap().current_counterparty_commitment_number = 0; |
There was a problem hiding this comment.
Seems ok, even though in reality commitment numbers start high and go down. So 0 isn't really the default for a fresh channel. But it is consistent with the dummy holder commitment.
Description
send_peer_storage()caps the sum ofserialized_length()atMAX_PEER_STORAGE_SIZE(65531), but the final wire message includes additional bytes not accounted for:encode()Without headroom for this ~50 byte overhead, the encrypted blob can exceed BOLT #1's 65531-byte message limit and BOLT #8's 65535-byte transport limit, causing a panic at
peer_channel_encryptor.rs:520.Fix
Introduced a
PEER_STORAGE_OVERHEADconstant (2 + 16 + 32 = 50) and changed the size check to subtract it from the cap, soserialized_length()is capped atMAX_PEER_STORAGE_SIZE - PEER_STORAGE_OVERHEAD. This ensures the encrypted wire message fits within 65535 bytes.Background fix
The
dummy_monitor()test helper createsChannelMonitors whose initialcurrent_counterparty_commitment_numberis1 << 48(a sentinel from the production constructor meaning "counterparty commitment not yet received"). This value (2^48) overflows the 48-bit field used during serialization, causingbe48_to_arrayto panic. Fixed indummy_monitor()by resetting to a valid 48-bit value after construction, preserving the production sentinel.Testing
Added
test_peer_storage_blob_within_size_limit(#[cfg(peer_storage)]) which creates 200 dummy monitors (~100KB total) and verifies the emittedPeerStoragemessage data length does not exceedMAX_PEER_STORAGE_SIZE(65531).Fixes #4698