Conversation
Important: Review skipped. Draft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configuration.
Configuration used: Repository UI. Review profile: CHILL. Plan: Pro. Run ID:
You can disable this status message in the CodeRabbit settings. Use the checkbox below for a quick retry.
Walkthrough

Adds per-sender persisted nonce tracking and per-address async locking; threads ID→key resolution and nonce/state caches through the message pool; tightens nonce-gap/RBF semantics; delegates signing and push (with balance checks) to a NonceTracker; wires new components into RPC/daemon/tools/tests; updates wallet signing, CLI output, tests, and CI job dependencies.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client
    participant RPC as RPC_Handler
    participant Locker as MpoolLocker
    participant Tracker as NonceTracker
    participant Mpool as MessagePool
    participant Provider as Provider
    participant Store as SettingsStore
    Client->>RPC: MpoolPushMessage(message)
    RPC->>Locker: take_lock(sender_address)
    Locker-->>RPC: OwnedMutexGuard (locked)
    RPC->>Tracker: sign_and_push(message, key, eth_chain_id)
    Note right of Tracker: acquire global mutex (nonce serialization)
    Tracker->>Mpool: get_sequence(resolve_to_key(sender), cur_ts)
    Mpool->>Provider: resolve_to_key(sender, ts)
    Provider-->>Mpool: canonical key address
    Tracker->>Store: read /mpool/nonces/{addr}
    Store-->>Tracker: persisted_nonce
    Note right of Tracker: next_nonce = max(mpool_nonce, persisted_nonce)
    Tracker->>Tracker: sign message with next_nonce
    Tracker->>Mpool: push(SignedMessage)
    Mpool-->>Tracker: push OK
    Tracker->>Store: write /mpool/nonces/{addr} = next_nonce + 1
    Tracker-->>RPC: SignedMessage
    RPC-->>Client: result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
146-160: ⚠️ Potential issue | 🟠 Major

Don't bind `NonceTracker` to the shared snapshot-generation DB.

`load_db()` gives every snapshot run the same `Arc<ReadOpsTrackingStore<ManyCar<ParityDb>>>`, and this wrapper writes settings through to `inner`, not `tracker`. With `nonce_tracker` pointed at `state_manager.blockstore_owned()`, persisted nonce-cache updates will bleed into later snapshot generations and may still be absent from the exported minimal snapshot unless those keys get read again. This will make stateful fixtures order-dependent once `MpoolPushMessage`/`MpoolGetNonce` snapshots are added.

Based on learnings: "when using ReadOpsTrackingStore to generate minimal snapshots, HEAD_KEY should be written to db.tracker (not db itself) before calling export_forest_car(), because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tool/subcommands/api_cmd/generate_test_snapshot.rs` around lines 146 - 160, NonceTracker is being constructed against state_manager.blockstore_owned() which points at the full DB; instead bind the nonce cache to the ReadOpsTrackingStore tracker used for snapshot generation so persisted nonce writes go into the ephemeral tracker, not the underlying DB. Change the NonceTracker creation to use the tracking store (the db.tracker returned by load_db() / the ReadOpsTrackingStore wrapper) rather than state_manager.blockstore_owned(), and ensure HEAD_KEY is written to db.tracker before calling export_forest_car() so the export reads the tracked (accessed) keys only.
src/message_pool/msgpool/mod.rs (1)
324-342: ⚠️ Potential issue | 🟠 Major

Canonicalize the `rmsgs` key as well.

After this refactor `pending` is keyed by resolved key address, but Lines 333-342 still probe `rmsgs` by raw `from`. On a revert/apply where the same actor shows up as `f0…` on one side and in key-address form on the other, the apply path misses the reverted entry, removes the pending entry instead, and then re-adds the reverted message at the end.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/mod.rs` around lines 324 - 342, The function remove_from_selected_msgs is checking rmsgs by the raw from address while pending is keyed by resolved key addresses; call resolve_to_key(api, key_cache, from, cur_ts) once at the start of the lookup and then use the resolved Address for both rmsgs and pending accesses (i.e., replace rmsgs.get_mut(from) with rmsgs.get_mut(&resolved) and use resolved in the remove(...) calls), ensuring you still propagate the Result error from resolve_to_key and keep the existing logic for removing the sequence from the temp map when present.
🧹 Nitpick comments (3)
scripts/tests/calibnet_wallet_check.sh (1)
76-76: Avoid scraping the human `list` table for `ADDR_ONE`.

This script already had to change once because the `list` layout moved. Pulling the address out of `tail | cut` keeps the test coupled to presentation-only output and will break again on the next formatting tweak. Prefer a command or output mode that returns just the address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/tests/calibnet_wallet_check.sh` at line 76, The current extraction of ADDR_ONE by piping "$FOREST_WALLET_PATH list | tail -1 | cut -d ' ' -f2" scrapes the human-formatted table and is fragile; change the ADDR_ONE assignment to use a machine-readable/listing option or output mode from $FOREST_WALLET_PATH (e.g., a flag that emits just addresses or JSON) and parse that output (or use a --quiet/--format option) to reliably select the desired address, so the script uses the wallet CLI's non-presentational output instead of scraping the printed table.
src/message_pool/mpool_locker.rs (1)
17-22: Document `MpoolLocker::new()`.

The public constructor is the only exposed `MpoolLocker` API here without rustdoc.

As per coding guidelines, "Document public functions and structs with doc comments".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/mpool_locker.rs` around lines 17 - 22, Add a doc comment for the public constructor MpoolLocker::new() describing what a MpoolLocker represents and what the new() instance contains/initializes (e.g., an empty inner Mutex<HashMap> used to track per-message-pool locks), include any thread-safety or ownership notes relevant to callers, and place the triple-slash /// doc immediately above the impl block or the new() function so rustdoc will document it alongside the MpoolLocker API.
src/message_pool/msgpool/test_provider.rs (1)
229-244: Consider adding BLS message support for completeness.

The `messages_for_tipset` implementation only collects signed messages. While this is consistent with `TestApi`'s internal storage (which only stores `SignedMessage`), the production `ChainStore::messages_for_tipset` collects both unsigned BLS and signed SECP messages via `BlockMessages::for_tipset`.

This discrepancy is acceptable for current tests since they primarily use SECP messages, but if future tests require BLS message handling during head changes, this may need enhancement.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/test_provider.rs` around lines 229 - 244, messages_for_tipset currently only returns signed messages (using inner.bmsgs and ChainMessage::Signed), so add support for BLS/unsigned messages by extending TestApi's storage and collecting them in the same loop: add a container for BLS/unsigned messages (e.g., inner.bls_msgs or similar) and in messages_for_tipset iterate blocks' CIDs to append both signed messages (ChainMessage::Signed(Arc::new(...))) and unsigned/BLS messages (ChainMessage::Unsigned or the appropriate ChainMessage variant for BLS) into msgs before returning; update any TestApi methods that populate messages so tests can insert BLS messages into the new storage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 40: The changelog entry currently references the PR number; update the
entry in CHANGELOG.md so it links to the issue `#4899` instead of PR `#6788`
(replace the PR link "[`#6788`](...)" with the issue link "[`#4899`](...)" and keep
the description "Fixed message pool nonce calculation to align with Lotus.") to
follow the repo convention of preferring issue references when both issue and PR
exist.
In `@scripts/tests/calibnet_delegated_wallet_check.sh`:
- Around line 87-100: The loop that polls DELEGATE_ADDR_REMOTE_THREE_BALANCE can
false-pass because it only compares to the prior observed balance and doesn’t
ensure MSG_DELEGATE_FOUR actually landed; change the logic in the block using
MSG_DELEGATE_FOUR, DELEGATE_ADDR_REMOTE_THREE_BALANCE and
DELEGATE_ADDR_THREE_BALANCE to either (A) wait for the message CID in
MSG_DELEGATE_FOUR to be included/confirmed before returning (preferred), or (B)
if keeping balance-polling, fail the script when the loop reaches the 20 retry
timeout instead of continuing silently (exit non-zero and log an error), and
keep updating DELEGATE_ADDR_REMOTE_THREE_BALANCE via $FOREST_WALLET_PATH
--remote-wallet balance each iteration; ensure the chosen approach references
MSG_DELEGATE_FOUR for waiting or uses an explicit exit on timeout.
In `@scripts/tests/calibnet_wallet_check.sh`:
- Around line 157-181: The loops currently only compare balances
(ETH_ADDR_TWO_BALANCE / ETH_ADDR_THREE_BALANCE) and will silently succeed after
retries even if the send commands (MSG_ETH, MSG_ETH_REMOTE) never produced a
message CID; change the logic to wait on the actual returned message CIDs from
MSG_ETH and MSG_ETH_REMOTE (extract the CID from MSG_ETH / MSG_ETH_REMOTE) and
poll the node/wallet for that CID confirmation, or at minimum make the 20-retry
timeout fatal by exiting non-zero if the CID was not observed; update the two
loops that reference ETH_ADDR_TWO_BALANCE and ETH_ADDR_THREE_BALANCE to use the
extracted CID variables and fail with exit 1 on timeout so the script cannot
pass silently.
In `@src/message_pool/mpool_locker.rs`:
- Around line 52-74: Tests use timing sleeps which makes them flaky; replace the
sleep-based handshakes in the two test blocks (the tasks spawned as t1/t2 that
call locker.take_lock and use first_entered/first_released/second_saw_first)
with a deterministic sync primitive such as tokio::sync::Notify or
tokio::sync::Barrier: have the first task signal (notify.notify_one() or
barrier.wait().await) immediately after it acquires the lock (where it currently
sets entered.store) and have the second task await that signal before attempting
its assertion (instead of sleeping), and likewise use a notify or barrier to
signal release instead of the 20ms/100ms sleeps; apply the same change to the
other test block referenced (the one around lines 94-116) so both tests no
longer rely on timing.
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 307-315: The code masks resolve_to_key failures by using
unwrap_or(msg.from()), which can undercount nonces; change the scan to propagate
resolution errors instead: call resolve_to_key(api, key_cache, &msg.from(),
cur_ts)? (or otherwise handle the Result and return Err) instead of unwrap_or,
and adjust the surrounding function's signature/return path to propagate the
error; refer to resolve_to_key, messages_for_tipset, msg.from(), and next_nonce
when making this change so the failure during the current tipset scan is not
silently converted to the original address.
In `@src/message_pool/nonce_tracker.rs`:
- Around line 81-87: The code currently pushes the signed message with
mpool.push(smsg.clone()).await? and then calls self.save_nonce(&message.from,
nonce)? which can return an error and cause MpoolPushMessage to report failure
despite the message already being in the mempool; change this to make
persistence best-effort: call self.save_nonce(&message.from, nonce) but do not
propagate its error — instead catch/log the error (e.g. with error!/warn!) and
still return Ok(smsg). Keep the same call order (sign_message, mpool.push,
save_nonce) but ensure save_nonce failures do not convert the whole operation
into an error.
In `@src/rpc/methods/mpool.rs`:
- Around line 301-306: MpoolGetNonce is reading the nonce directly from mpool
via mpool.get_sequence while WalletSignMessage/MpoolPush use
ctx.nonce_tracker.sign_and_push, causing mismatch after restarts; change
MpoolGetNonce to query the same NonceTracker (e.g., call the nonce-tracking
getter on ctx.nonce_tracker instead of mpool.get_sequence) so both flows use the
same source, and ensure any other code paths that return sequence numbers
(mpool.get_sequence, MpoolPushMessage) are routed through NonceTracker APIs to
keep nonce state consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9c53ddd6-e120-4024-a80f-227d8c503f0b
📒 Files selected for processing (24)
- .github/workflows/forest.yml
- CHANGELOG.md
- docs/docs/developers/guides/nonce_handling.md
- scripts/tests/calibnet_delegated_wallet_check.sh
- scripts/tests/calibnet_wallet_check.sh
- src/daemon/mod.rs
- src/key_management/wallet_helpers.rs
- src/message_pool/errors.rs
- src/message_pool/mod.rs
- src/message_pool/mpool_locker.rs
- src/message_pool/msgpool/mod.rs
- src/message_pool/msgpool/msg_pool.rs
- src/message_pool/msgpool/provider.rs
- src/message_pool/msgpool/selection.rs
- src/message_pool/msgpool/test_provider.rs
- src/message_pool/nonce_tracker.rs
- src/rpc/methods/mpool.rs
- src/rpc/methods/sync.rs
- src/rpc/methods/wallet.rs
- src/rpc/mod.rs
- src/tool/offline_server/server.rs
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs
- src/tool/subcommands/api_cmd/test_snapshot.rs
- src/wallet/subcommands/wallet_cmd.rs
💤 Files with no reviewable changes (1)
- .github/workflows/forest.yml
updated PR description
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/message_pool/mpool_locker.rs (1)
26-32: Consider deriving `Default` alongside `new()`.

`clippy::new_without_default` suggests implementing `Default` when a type has a `new()` constructor taking no arguments.

♻️ Proposed fix
```diff
+impl Default for MpoolLocker {
+    fn default() -> Self {
+        Self::new()
+    }
+}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/mpool_locker.rs` around lines 26 - 32, Implement Default for MpoolLocker to satisfy clippy::new_without_default: add an impl Default for MpoolLocker { fn default() -> Self { Self::new() } } (or derive Default on the struct if you prefer and the inner types support it) so that MpoolLocker::default() is equivalent to MpoolLocker::new(); reference symbols: MpoolLocker and its new() constructor and the Default trait.
src/message_pool/msgpool/mod.rs (2)
267-292: Consider hoisting the tipset read outside the inner loop.

`cur_tipset.read().clone()` is called for every message in the loop. While not incorrect, this acquires and releases the read lock repeatedly. Hoisting the read outside the block iteration would be slightly more efficient.

Proposed improvement
```diff
     for b in ts.block_headers() {
         let Ok((msgs, smsgs)) = api.messages_for_block(b) else {
             tracing::error!("error retrieving messages for block");
             continue;
         };
+        let cur_ts = cur_tipset.read().clone();
         for msg in smsgs {
-            let cur_ts = cur_tipset.read().clone();
             remove_from_selected_msgs(
                 api,
                 key_cache,
                 &cur_ts,
                 // ... rest unchanged
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/mod.rs` around lines 267 - 292, The loops call cur_tipset.read().clone() for every message, causing repeated read-lock acquires; hoist the tipset snapshot outside the inner loops by reading and cloning cur_tipset once (e.g., let cur_ts = cur_tipset.read().clone();) before iterating over smsgs and msgs, then pass that cur_ts into remove_from_selected_msgs; update both loops that call remove_from_selected_msgs (the one iterating smsgs and the one iterating msgs) to use the single precomputed cur_ts and keep existing logic around repub/republished and rmsgs unchanged.
361-368: Consider using the `entry` API to avoid the `unwrap()`.

While the `unwrap()` on line 366 cannot actually fail (since the key was just inserted on line 365), it would be cleaner to use the entry API to avoid the pattern entirely. As per coding guidelines, `unwrap()` should be avoided in production code.

Cleaner entry-based approach
```diff
 pub(in crate::message_pool) fn add_to_selected_msgs(
     m: SignedMessage,
     rmsgs: &mut HashMap<Address, HashMap<u64, SignedMessage>>,
 ) {
-    let s = rmsgs.get_mut(&m.from());
-    if let Some(s) = s {
-        s.insert(m.sequence(), m);
-    } else {
-        rmsgs.insert(m.from(), HashMap::new());
-        rmsgs.get_mut(&m.from()).unwrap().insert(m.sequence(), m);
-    }
+    rmsgs
+        .entry(m.from())
+        .or_insert_with(HashMap::new)
+        .insert(m.sequence(), m);
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/mod.rs` around lines 361 - 368, Replace the manual get_mut/insert/unwrap pattern on rmsgs with the HashMap entry API: use rmsgs.entry(m.from()).or_insert_with(HashMap::new) (or similar) to obtain a mutable inner map and then insert m using m.sequence() as the key; update the block around rmsgs, m.from(), and m.sequence() to avoid any unwrap() calls and make the insertion atomic via entry().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/forest.yml:
- Around line 280-285: The umbrella workflow job integration-tests-status is
missing the newly re-enabled calibnet-delegated-wallet-check in its needs list;
update the integration-tests-status job definition to include
"calibnet-delegated-wallet-check" in its needs array so the umbrella job waits
for and reflects the delegated wallet check result (ensure the job name matches
exactly "calibnet-delegated-wallet-check" when adding it to
integration-tests-status's needs).
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 323-337: The current combined if-let on resolve_to_key and
api.messages_for_tipset silently skips the tipset scan on any error; change this
to explicitly handle errors from resolve_to_key(api, key_cache, addr, cur_ts)
and api.messages_for_tipset(cur_ts) (e.g. use match or the ? operator) so
failures are returned/propagated instead of ignored; then perform the loop over
messages (examining msg.from(), resolve_to_key for each sender, comparing to the
resolved sender, and updating next_nonce from msg.sequence()+1) only when both
calls succeed, ensuring caller sees the error rather than silently undercounting
nonces.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 99615cba-8153-4242-a701-3f306c25e967
📒 Files selected for processing (9)
- .github/workflows/forest.yml
- CHANGELOG.md
- scripts/tests/calibnet_delegated_wallet_check.sh
- src/message_pool/mpool_locker.rs
- src/message_pool/msgpool/mod.rs
- src/message_pool/msgpool/msg_pool.rs
- src/message_pool/msgpool/selection.rs
- src/message_pool/nonce_tracker.rs
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (4)
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs
- scripts/tests/calibnet_delegated_wallet_check.sh
- src/message_pool/msgpool/selection.rs
- src/message_pool/nonce_tracker.rs
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/message_pool/msgpool/mod.rs (2)
260-291: ⚠️ Potential issue | 🟠 Major

Resolve apply-side senders against `ts`, not `cur_ts`.

`remove_from_selected_msgs` now depends on `Provider::resolve_to_key`, and that provider resolves from `ts.parent_state()` (src/message_pool/msgpool/provider.rs:104-110). Passing the previous head here makes apply-side ID resolution one tipset stale. A sender that only becomes resolvable in `cur_ts` will make `head_change` fail when its message lands in `ts`.

Suggested fix
```diff
-    let cur_ts = cur_tipset.read().clone();
     for b in ts.block_headers() {
         let Ok((msgs, smsgs)) = api.messages_for_block(b) else {
             tracing::error!("error retrieving messages for block");
             continue;
         };
         for msg in smsgs {
             remove_from_selected_msgs(
                 api,
                 key_cache,
-                &cur_ts,
+                &ts,
                 &msg.from(),
                 pending,
                 msg.sequence(),
                 rmsgs.borrow_mut(),
             )?;
             if !repub && republished.write().insert(msg.cid()) {
                 repub = true;
             }
         }
         for msg in msgs {
             remove_from_selected_msgs(
                 api,
                 key_cache,
-                &cur_ts,
+                &ts,
                 &msg.from,
                 pending,
                 msg.sequence,
                 rmsgs.borrow_mut(),
             )?;
```
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/mod.rs` around lines 260 - 291, The apply-side sender resolution is using the stale cur_ts instead of the current tipset ts, causing Provider::resolve_to_key to resolve against the wrong state; update the two calls to remove_from_selected_msgs inside the apply loop to pass the current ts (the tipset from the for ts in apply loop) rather than cur_ts so resolution uses ts.parent_state() as intended (affecting both the smsgs and msgs remove_from_selected_msgs invocations).
255-257: ⚠️ Potential issue | 🟠 Major

Normalize `rmsgs` to the resolved sender too.

`pending` is now keyed by key-form senders, but `rmsgs` is still populated and queried with raw `from` values. A revert/apply pair for the same actor using different address forms (ID vs key) won't cancel out: the apply path misses `rmsgs`, removes the pending entry, and the reverted message gets re-added at the end of `head_change`.

Suggested fix
```diff
-    for msg in msgs {
-        add_to_selected_msgs(msg, rmsgs.borrow_mut());
+    for msg in msgs {
+        let resolved_from = resolve_to_key(api, key_cache, &msg.from(), &ts)?;
+        add_to_selected_msgs(resolved_from, msg, rmsgs.borrow_mut());
     }
@@
 pub(in crate::message_pool) fn remove_from_selected_msgs<T: Provider>(
     api: &T,
     key_cache: &SizeTrackingLruCache<Address, Address>,
     cur_ts: &Tipset,
     from: &Address,
@@
 ) -> Result<(), Error> {
-    if let Some(temp) = rmsgs.get_mut(from) {
-        if temp.get_mut(&sequence).is_some() {
-            temp.remove(&sequence);
-        } else {
-            let resolved = resolve_to_key(api, key_cache, from, cur_ts)?;
-            remove(&resolved, pending, sequence, true)?;
-        }
+    let resolved = resolve_to_key(api, key_cache, from, cur_ts)?;
+    if let Some(temp) = rmsgs.get_mut(&resolved) {
+        if temp.remove(&sequence).is_none() {
+            remove(&resolved, pending, sequence, true)?;
+        }
     } else {
-        let resolved = resolve_to_key(api, key_cache, from, cur_ts)?;
         remove(&resolved, pending, sequence, true)?;
     }
     Ok(())
 }
@@
-pub(in crate::message_pool) fn add_to_selected_msgs(
-    m: SignedMessage,
+pub(in crate::message_pool) fn add_to_selected_msgs(
+    from: Address,
+    m: SignedMessage,
     rmsgs: &mut HashMap<Address, HashMap<u64, SignedMessage>>,
 ) {
-    rmsgs.entry(m.from()).or_default().insert(m.sequence(), m);
+    rmsgs.entry(from).or_default().insert(m.sequence(), m);
 }
```

Also applies to: 331-349, 356-360
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/mod.rs` around lines 255 - 257, rmsgs is being populated/queried using raw msg.from while pending uses key-form senders; normalize rmsgs to the same resolved/key-form sender so revert/apply pairs cancel correctly: when calling add_to_selected_msgs and anywhere head_change manipulates rmsgs (the loops over msgs and the other blocks noted around add_to_selected_msgs), resolve/convert msg.from to the key-form (the same form used as keys in pending) before inserting, removing, or querying rmsgs; update all occurrences that access rmsgs (including where you re-add or remove entries) to use the resolved sender key so lookups match pending’s keys.
🧹 Nitpick comments (1)
src/message_pool/msgpool/msg_pool.rs (1)
311-335: Key the nonce cache by the resolved sender.
get_state_sequencecompares tipset messages againstresolved, butStateNonceCacheKey.addris still the caller-supplied address. The same actor queried as ID and key will miss the cache twice and rescan the tipset twice, which undercuts the new cache's purpose.Suggested fix
```diff
-    let nk = StateNonceCacheKey {
-        tipset_key: cur_ts.key().clone(),
-        addr: *addr,
-    };
-
-    if let Some(cached) = state_nonce_cache.get_cloned(&nk) {
-        return Ok(cached);
-    }
-
-    let actor = api.get_actor_after(addr, cur_ts)?;
-    let mut next_nonce = actor.sequence;
-    let resolved = resolve_to_key(api, key_cache, addr, cur_ts)?;
+    let resolved = resolve_to_key(api, key_cache, addr, cur_ts)?;
+    let nk = StateNonceCacheKey {
+        tipset_key: cur_ts.key().clone(),
+        addr: resolved,
+    };
+
+    if let Some(cached) = state_nonce_cache.get_cloned(&nk) {
+        return Ok(cached);
+    }
+
+    let actor = api.get_actor_after(addr, cur_ts)?;
+    let mut next_nonce = actor.sequence;
     let messages = api.messages_for_tipset(cur_ts)?;
```
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/msg_pool.rs` around lines 311 - 335, Construct the StateNonceCacheKey using the resolved sender address instead of the original caller-supplied addr: call resolve_to_key(api, key_cache, addr, cur_ts) before building nk, use the returned resolved value for nk.addr (and for the subsequent cache get_cloned check), and also use that same resolved address when calling state_nonce_cache.push(nk, next_nonce); ensure types match (clone/borrow as needed) so both the lookup and insertion key the cache by the resolved sender.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/message_pool/msgpool/mod.rs`:
- Around line 260-291: The apply-side sender resolution is using the stale
cur_ts instead of the current tipset ts, causing Provider::resolve_to_key to
resolve against the wrong state; update the two calls to
remove_from_selected_msgs inside the apply loop to pass the current ts (the
tipset from the for ts in apply loop) rather than cur_ts so resolution uses
ts.parent_state() as intended (affecting both the smsgs and msgs
remove_from_selected_msgs invocations).
- Around line 255-257: rmsgs is being populated/queried using raw msg.from while
pending uses key-form senders; normalize rmsgs to the same resolved/key-form
sender so revert/apply pairs cancel correctly: when calling add_to_selected_msgs
and anywhere head_change manipulates rmsgs (the loops over msgs and the other
blocks noted around add_to_selected_msgs), resolve/convert msg.from to the
key-form (the same form used as keys in pending) before inserting, removing, or
querying rmsgs; update all occurrences that access rmsgs (including where you
re-add or remove entries) to use the resolved sender key so lookups match
pending’s keys.
---
Nitpick comments:
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 311-335: Construct the StateNonceCacheKey using the resolved
sender address instead of the original caller-supplied addr: call
resolve_to_key(api, key_cache, addr, cur_ts) before building nk, use the
returned resolved value for nk.addr (and for the subsequent cache get_cloned
check), and also use that same resolved address when calling
state_nonce_cache.push(nk, next_nonce); ensure types match (clone/borrow as
needed) so both the lookup and insertion key the cache by the resolved sender.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 28391af8-d2e3-47e0-a50d-14a5add16dba
📒 Files selected for processing (3)
- .github/workflows/forest.yml
- src/message_pool/msgpool/mod.rs
- src/message_pool/msgpool/msg_pool.rs
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/message_pool/msgpool/msg_pool.rs (1)
320-339: ⚠️ Potential issue | 🟠 Major — Don't cache a nonce after swallowing tipset-scan failures.
This still turns `resolve_to_key()`/`messages_for_tipset()` failures into a silent fallback to `actor.sequence`, then stores that partial result in `state_nonce_cache`. A transient lookup error now undercounts the nonce and keeps returning the stale value until the cache entry is evicted.
💡 Suggested fix
```diff
-        if let (Ok(resolved), Ok(messages)) = (
-            resolve_to_key(api, key_cache, addr, cur_ts),
-            api.messages_for_tipset(cur_ts),
-        ) {
-            for msg in &messages {
-                if let Ok(from) = resolve_to_key(api, key_cache, &msg.from(), cur_ts)
-                    && from == resolved
-                {
-                    let n = msg.sequence() + 1;
-                    if n > next_nonce {
-                        next_nonce = n;
-                    }
-                }
-            }
-        }
+        let resolved = resolve_to_key(api, key_cache, addr, cur_ts)?;
+        let messages = api.messages_for_tipset(cur_ts)?;
+        for msg in &messages {
+            let from = resolve_to_key(api, key_cache, &msg.from(), cur_ts)?;
+            if from == resolved {
+                let n = msg.sequence() + 1;
+                if n > next_nonce {
+                    next_nonce = n;
+                }
+            }
+        }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/message_pool/msgpool/msg_pool.rs` around lines 320 - 339, The code currently falls back to actor.sequence and writes that to state_nonce_cache even if resolve_to_key() or api.messages_for_tipset() failed; change the logic so we only update state_nonce_cache (state_nonce_cache.push(nk, next_nonce)) when both resolve_to_key(...) and api.messages_for_tipset(...) returned Ok. Specifically, keep the initial actor = api.get_actor_after(...) and next_nonce initialization, but move the state_nonce_cache.push call inside the if let (Ok(resolved), Ok(messages)) = (...) { ... } block (or otherwise gate it on both successes) so that failures from resolve_to_key or messages_for_tipset are not silently cached.
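The nonce derivation the review is asking for reduces to a simple rule: start from the actor's on-chain sequence, then bump past any message from the same (resolved) sender already included in the current tipset. A minimal sketch, with senders and messages simplified to plain tuples rather than Forest's actual types:

```rust
// Given the actor's on-chain sequence and the (sender, nonce) pairs of
// messages in the current tipset, compute the next usable nonce.
fn next_nonce(actor_sequence: u64, sender: &str, tipset_msgs: &[(&str, u64)]) -> u64 {
    let mut next = actor_sequence;
    for (from, seq) in tipset_msgs {
        if *from == sender {
            // A message with nonce N in the tipset means N is consumed.
            next = next.max(seq + 1);
        }
    }
    next
}
```

The point of the suggested fix above is that any failure while gathering `tipset_msgs` should propagate as an error rather than silently fall back to `actor_sequence`, since caching the fallback would undercount.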
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/message_pool/msgpool/mod.rs`:
- Around line 255-257: The code resolves msg.from() to a key for each reverted
message and stores it via add_to_selected_msgs into rmsgs, but later
get_state_sequence and the pending insertion path re-resolve msg.from() against
the new head (lines ~305-320), potentially mapping ID senders to the wrong actor
after a reorg; fix by threading the previously-resolved sender key from rmsgs
through get_state_sequence and into the pending-insert logic instead of calling
resolve_to_key / re-resolving msg.from() again—use the existing resolved key
stored in rmsgs when constructing pending inserts and when calling
get_state_sequence so the reverted-branch sender identity is preserved
(referencing resolve_to_key, add_to_selected_msgs, rmsgs, get_state_sequence,
and pending insertion logic).
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 286-300: The cache is not tipset-aware: resolve_to_key uses cur_ts
but key_cache (SizeTrackingLruCache<Address, Address>) keys only by Address,
allowing stale ID->key mappings across reorgs; update the caching so the key
includes the tipset identity (e.g., use (TipsetKey, Address) as the cache key)
or ensure the cache is explicitly flushed when the head changes. Specifically,
modify resolve_to_key to look up and store entries using a composite key built
from cur_ts.key() (or TipsetKey) and addr, update the SizeTrackingLruCache
type/usage accordingly (or add a clear_on_head_change hook where the mempool
updates head), and keep the existing api.resolve_to_key(addr, cur_ts) call for
actual resolution.
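The composite-key caching suggested in this comment can be sketched as follows; `TipsetKey` and `Address` are simplified to strings and the resolver is passed in as a closure, so this is an illustration of the shape, not Forest's `SizeTrackingLruCache` API:

```rust
use std::collections::HashMap;

type TipsetKey = String;
type Address = String;

// Cache keyed by (tipset, address) so a reorg cannot serve a stale
// ID->key mapping resolved against a different head.
struct ResolveCache {
    map: HashMap<(TipsetKey, Address), Address>,
}

impl ResolveCache {
    fn new() -> Self {
        Self { map: HashMap::new() }
    }

    fn resolve_to_key<F>(&mut self, ts: &TipsetKey, addr: &Address, resolve: F) -> Address
    where
        F: FnOnce() -> Address,
    {
        if let Some(hit) = self.map.get(&(ts.clone(), addr.clone())) {
            return hit.clone();
        }
        let key = resolve(); // the expensive state lookup
        self.map.insert((ts.clone(), addr.clone()), key.clone());
        key
    }
}
```

The alternative mentioned above (keeping an `Address`-only key but flushing on head change) trades memory for an extra invalidation hook in the head-change path.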
In `@src/message_pool/msgpool/selection.rs`:
- Around line 843-851: The loop over left_chain currently only replays the
signed messages (smsgs) returned by api.messages_for_block(block) and thus drops
unsigned BLS messages; update the block-handling in the loop inside
run_head_change (the for ts in &left_chain block) to also collect and replay the
unsigned BLS messages returned by messages_for_block(), resolving their sender
keys with resolve_to_key and passing them to add_to_selected_msgs just like
smsgs so BLS messages aren’t lost (this will ensure get_pending_messages sees
reverted BLS messages restored).
---
Duplicate comments:
In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 320-339: The code currently falls back to actor.sequence and
writes that to state_nonce_cache even if resolve_to_key() or
api.messages_for_tipset() failed; change the logic so we only update
state_nonce_cache (state_nonce_cache.push(nk, next_nonce)) when both
resolve_to_key(...) and api.messages_for_tipset(...) returned Ok. Specifically,
keep the initial actor = api.get_actor_after(...) and next_nonce initialization,
but move the state_nonce_cache.push call inside the if let (Ok(resolved),
Ok(messages)) = (...) { ... } block (or otherwise gate it on both successes) so
that failures from resolve_to_key or messages_for_tipset are not silently
cached.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1a044529-aa40-4d18-87d3-75ed7ce3ef9b
📒 Files selected for processing (3)
- src/message_pool/msgpool/mod.rs
- src/message_pool/msgpool/msg_pool.rs
- src/message_pool/msgpool/selection.rs
Not a fault of this PR, but this bash script is horrible. 😆 We need to convert it to proper Rust code to make it maintainable.
🧩 Analysis chain
@LesnyRumcajs, I've created a GitHub issue to track converting the calibnet wallet check bash scripts to Rust. You can find it linked above once the script runs. The issue covers both calibnet_delegated_wallet_check.sh and calibnet_wallet_check.sh, references this PR and comment, and is assigned to you.
🏁 Script executed:
```bash
#!/bin/bash
gh issue create \
  --repo ChainSafe/forest \
  --title "Convert calibnet wallet check bash scripts to Rust" \
  --body "## Summary
The bash scripts used for calibnet wallet integration tests (e.g., \`scripts/tests/calibnet_delegated_wallet_check.sh\`, \`scripts/tests/calibnet_wallet_check.sh\`) have grown complex and are difficult to maintain. They should be converted to proper Rust code (e.g., using a test harness or integration test binary) to improve readability, maintainability, and type safety.
## Motivation
As noted in PR \`#6788\` (comment: https://github.com/ChainSafe/forest/pull/6788#discussion_r3026612184), the bash scripts are increasingly complex, with repeated retry loops, manual string parsing, and ad-hoc error handling that is error-prone and hard to extend.
## Suggested Approach
- Rewrite the wallet calibnet check scripts as Rust integration tests or a dedicated binary under \`scripts/tests/\` or a suitable location in the Rust workspace.
- Ensure feature parity with the existing bash scripts (delegated wallet creation, funding, balance polling, remote wallet operations, etc.).
## References
- PR: https://github.com/ChainSafe/forest/pull/6788
- Requested by: \`@LesnyRumcajs\`" \
  --assignee LesnyRumcajs
```
Repository: ChainSafe/forest
Length of output: 47
@LesnyRumcajs, the issue has been created and assigned to you: #6850 — Convert calibnet wallet check bash scripts to Rust.
🧠 Learnings used
Learnt from: sudo-shashank
Repo: ChainSafe/forest PR: 6710
File: src/wallet/subcommands/wallet_cmd.rs:641-649
Timestamp: 2026-03-11T10:10:29.857Z
Learning: In `src/wallet/subcommands/wallet_cmd.rs` (ChainSafe/forest, PR `#6710`), `resolve_target_address` intentionally returns `is_0x_recipient = false` for Filecoin-form f410f addresses (even though they resolve to Delegated/ETH-mapped addresses) and `true` only for `0x`-prefixed hex input. This is a deliberate choice to mirror Lotus CLI behavior, where method selection (InvokeContract vs METHOD_SEND) is driven by whether the user typed a `0x` address, not the resolved address protocol. The test `test_resolve_method_num_send_to_delegated` (f1 → f410f → METHOD_SEND) reflects this intentional design. Do not flag the `false` return for f410f-form Filecoin addresses as a bug.
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5886
File: interop-tests/src/tests/go_app/gen.go:29-29
Timestamp: 2025-08-07T13:39:15.107Z
Learning: Auto-generated files like those created by rust2go (indicated by "Generated by rust2go. Please DO NOT edit this C part manually." comment) in the Forest project should be skipped during code review as they are not intended for manual editing.
Learnt from: sudo-shashank
Repo: ChainSafe/forest PR: 6710
File: src/wallet/subcommands/wallet_cmd.rs:523-540
Timestamp: 2026-03-10T06:27:11.071Z
Learning: In the ChainSafe/forest Filecoin wallet implementation (src/wallet/subcommands/wallet_cmd.rs), the EVM method_num (EVMMethod::InvokeContract or EAMMethod::CreateExternal) should be selected based on whether the *recipient* is an ETH/0x address (is_0x_recipient) OR the *sender* is a delegated f4 address (is_eth_address(&from)). Non-delegated senders (f1/secp256k1, f3/BLS) are allowed to send to 0x recipients using EVMMethod::InvokeContract with their native signature type — this mirrors Lotus behavior. The signing path (EIP-1559 vs. standard) is separate and only uses EIP-1559 for SignatureType::Delegated senders.
Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 5907
File: src/rpc/methods/state.rs:523-570
Timestamp: 2025-08-06T15:44:33.467Z
Learning: LesnyRumcajs prefers to rely on BufWriter's Drop implementation for automatic flushing rather than explicit flush() calls in Forest codebase.
Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 6491
File: docs/docs/users/getting_started/install.md:134-136
Timestamp: 2026-01-28T14:28:39.596Z
Learning: In the Forest project, LesnyRumcajs prefers maintaining a playful, humorous tone in documentation rather than strictly formal/corporate language, as evidenced by keeping editor-choice jokes in the systemd installation guide.
Learnt from: sudo-shashank
Repo: ChainSafe/forest PR: 6788
File: src/rpc/methods/mpool.rs:301-306
Timestamp: 2026-03-25T09:06:55.615Z
Learning: In Forest's `MpoolGetNonce` handler (src/rpc/methods/mpool.rs), returning bare `mpool.get_sequence()` without going through `NonceTracker` is intentional and matches Lotus behavior. Lotus documents `MpoolGetNonce` as explicitly non-atomic and recommends using `MpoolPushMessage` instead. Do not flag the discrepancy between `MpoolGetNonce` (mpool.get_sequence) and `MpoolPushMessage` (NonceTracker.sign_and_push) as a bug.
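The `NonceTracker` behavior this learning contrasts with `MpoolGetNonce` — picking the max of the mpool's view and a persisted counter, and only advancing the persisted value after a successful push — can be sketched in miniature. The in-memory map below stands in for the `/mpool/nonces/{addr}` datastore entries; names are illustrative, not Forest's actual API:

```rust
use std::collections::HashMap;

struct NonceTracker {
    // Stand-in for the persisted /mpool/nonces/{addr} entries.
    persisted: HashMap<String, u64>,
}

impl NonceTracker {
    fn new() -> Self {
        Self { persisted: HashMap::new() }
    }

    // The next nonce is whichever is higher: the mpool's view of the
    // sender's sequence, or the last persisted value. This survives a
    // restart that wipes in-flight messages from the in-memory pool.
    fn next_nonce(&self, addr: &str, mpool_nonce: u64) -> u64 {
        mpool_nonce.max(self.persisted.get(addr).copied().unwrap_or(0))
    }

    // Advance the persisted counter only after a successful push.
    fn record_push(&mut self, addr: &str, used_nonce: u64) {
        self.persisted.insert(addr.to_string(), used_nonce + 1);
    }
}
```

This is why `MpoolPushMessage` (which goes through the tracker) is safe against nonce collisions while the documented, non-atomic `MpoolGetNonce` is not.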
```rust
// the ID→key mapping is reorg-stable. We currently resolve against the
// tipset's parent state, which is safe but does not provide the same
// reorg-safety guarantee.
// See https://github.com/filecoin-project/lotus/blob/006da4c7e1c1c29ac02b32112c0d205e4085ba35/chain/stmgr/stmgr.go#L347
```
What can we do about it? Could the finality calculator integration from @hanabi1224 help? Should we have a follow-up issue?
```rust
}

fn messages_for_tipset(&self, ts: &Tipset) -> Result<Vec<ChainMessage>, Error> {
    Ok((*ChainStore::messages_for_tipset(self, ts)?).clone())
```
Not necessarily to do in this PR, but I wonder if cloning the messages around is necessary. It makes code easier to read but perhaps at a performance hit (it might be acceptable, though - is it done once per epoch?)
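The clone-vs-share trade-off raised here can be shown with toy types (a `Vec<u64>` standing in for the tipset's `ChainMessage`s): the chain store already hands back an `Arc`, so returning a clone of that `Arc` is a refcount bump, while the current code copies every element out of it.

```rust
use std::sync::Arc;

// Current shape: dereference the Arc and clone the whole Vec.
fn messages_cloned(stored: &Arc<Vec<u64>>) -> Vec<u64> {
    (**stored).clone() // copies every element
}

// Alternative shape: share the Arc the store already holds.
fn messages_shared(stored: &Arc<Vec<u64>>) -> Arc<Vec<u64>> {
    Arc::clone(stored) // bumps a refcount, no element copies
}
```

Whether the copy matters depends on how often it runs; if it really is once per epoch, the simpler clone may be an acceptable cost for the cleaner signature.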
```rust
/// Acquire an async lock for the given address. The returned guard must be
/// held for the duration of the nonce-assign + sign + push critical section.
pub async fn take_lock(&self, addr: Address) -> OwnedMutexGuard<()> {
```
This map grows unbounded. On a long-running public RPC node this could be an attack vector.
A potential fix would be to drop entries for which there are no more external Arc consumers (i.e. only the map's own reference remains).
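The pruning idea from this comment can be sketched with std types (the real `MpoolLocker` hands out tokio `OwnedMutexGuard`s, so this is an illustration of the bookkeeping, not the async locking): when handing out a lock, drop any map entry whose `Arc` is held only by the map itself.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

struct Locker {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl Locker {
    fn new() -> Self {
        Self { locks: Mutex::new(HashMap::new()) }
    }

    // Return the per-address lock, pruning entries nobody else holds.
    // strong_count == 1 means the map's own Arc is the only reference.
    fn take_lock(&self, addr: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.retain(|_, l| Arc::strong_count(l) > 1);
        map.entry(addr.to_string()).or_default().clone()
    }

    fn len(&self) -> usize {
        self.locks.lock().unwrap().len()
    }
}
```

This keeps the map bounded by the number of addresses with in-flight requests rather than every address ever seen.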
```rust
let actor = api.get_actor_after(addr, cur_ts)?;
let mut next_nonce = actor.sequence;

if let (Ok(resolved), Ok(messages)) = (
```
should we have at least a warning if this fails?
```rust
if let Some(temp) = rmsgs.get_mut(from) {
    if temp.get_mut(&sequence).is_some() {
        temp.remove(&sequence);
    } else {
        remove(from, pending, sequence, true)?;
        let resolved = resolve_to_key(api, key_cache, from, ts)?;
        remove(&resolved, pending, sequence, true)?;
    }
} else {
    remove(from, pending, sequence, true)?;
    let resolved = resolve_to_key(api, key_cache, from, ts)?;
    remove(&resolved, pending, sequence, true)?;
}
Ok(())
```
This could be simplified, no?
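One way to see the simplification: the inner `else` and the outer `else` run the same removals, so the logic collapses to "try to cancel in `rmsgs`; otherwise remove from `pending` under both the raw and resolved sender". A self-contained sketch with toy stand-in types (not Forest's real `MsgSet`/`remove` signatures):

```rust
use std::collections::{BTreeMap, HashMap};

// rmsgs: reverted messages per sender; pending: (sender, nonce) entries.
fn revert_or_remove(
    rmsgs: &mut HashMap<String, BTreeMap<u64, ()>>,
    pending: &mut HashMap<(String, u64), ()>,
    from: &str,
    resolved: &str,
    sequence: u64,
) {
    // Try to cancel against a reverted entry first.
    let cancelled = rmsgs
        .get_mut(from)
        .map(|m| m.remove(&sequence).is_some())
        .unwrap_or(false);
    // Otherwise fall through to the shared removal path.
    if !cancelled {
        pending.remove(&(from.to_string(), sequence));
        pending.remove(&(resolved.to_string(), sequence));
    }
}
```

The nested `if let`/`else` pair in the quoted code becomes a single guard, with the duplicated `remove` calls written once.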
Summary of changes
Changes introduced in this pull request:
Nonce calculation:
- `get_state_sequence` now scans messages included in the current tipset to find the true highest nonce, rather than relying solely on `actor.sequence`, which can lag.
- A (`TipsetKey`, `Address`)-keyed state nonce cache avoids redundant tipset scans.

Nonce serialization:
- Per-address async locking via `MpoolLocker`, preventing concurrent requests from reading stale state.
- A `NonceTracker` that assigns nonces sequentially and persists them to the datastore. This bridges the gap between "messages broadcast to the network but not yet on-chain" and "node restart wiping in-memory state"; without it, any restart while messages are in-flight would cause nonce collisions.

Nonce gap enforcement:
- `MsgSet::add` gains a `strict` flag that rejects messages exceeding `MAX_NONCE_GAP` (4 for trusted, 0 for untrusted) and blocks replace-by-fee when a gap exists. Reorg re-additions use `strict=false`.
MpoolPushMessageto reject before consuming a nonce.forest-wallet listcmd now shows a nonce column with table formatting.Reference issue to close (if applicable)
Closes #4899
Closes #3628
Closes #2927
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
Bug Fixes
New Features
Improvements
Tests
Chores