feat(core): implement consensus/timer by iamquang95 · Pull Request #433 · NethermindEth/pluto

iamquang95 · 2026-05-22T09:57:42Z

Fix: #158

This PR:

Port the consensus/timer to pluto.
Use cancellable Tokio sleep futures as the Rust timer primitive.
Cover timer behavior with parity-focused unit tests, including proposal timeout and eager timer deadlines.

iamquang95 · 2026-05-22T10:03:52Z

@claude review this PR

claude · 2026-05-22T10:04:09Z

Claude finished @iamquang95's task in 4m 43s —— View job

PR Review: `feat(core): implement consensus/timer`

Overall this is a clean, well-documented port of charon/core/consensus/timer/roundtimer.go. The strategy selection logic, deadline math, eager-double-linear state, and feature gating all match the Go reference. The Rust-side improvements are tasteful: cancellation by drop, a proper Error/Result surface, and Mutex poison handled instead of unwrapping.

What looks good

Parity with the three Go timer strategies (increasing, eager_dlinear, linear) and GetRoundTimerFunc precedence. The Rust selector in get_round_timer_func (timer.rs:292-310) mirrors the Go branching exactly.
Faithful reproduction of the upstream Go bug in linearRoundTimer where 200*(round-1)+200 is in nanoseconds despite the comment claiming "starting at 400 milliseconds". The Rust comment at timer.rs:357-358 calls this out explicitly and the test expectations encode nanos (Duration::from_nanos(400) etc.) — good behavior to make the parity intentional and visible.
Eager double-linear state machine: deadline math (first_deadline + timeout) matches Go's first.Add(timeout). The test walkthrough at timer.rs:441-491 covers the round-2/round-3 doubling case nicely.
FeatureSetGuard correctly serializes tests via a static Mutex and restores GLOBAL_STATE on drop. Drop order leaves the lock as the last to be released — exactly what you want.
Cancellation via Drop of the RoundTimerFuture replaces Go's explicit stop() callback. Cleaner ergonomics.
Defensive overflow handling: checked_mul/checked_add/checked_sub everywhere arithmetic touches i64/Duration — strictly better than Go which would silently produce negative durations.

Suggestions

Misleading error variant in duration_from_nanos (timer.rs:372-376) — a failed u64::try_from(timeout_nanos) is mapped to Error::InvalidRound, but at this point the round has already been validated by ensure_non_negative_round. A negative timeout_nanos here would represent an arithmetic overflow during timeout computation, not an invalid round. Suggest Error::DurationOverflow { round } to match intent. In practice this path is currently unreachable for non-negative rounds (since 200*(round-1)+200 >= 0 for round >= 0), but the variant choice is what a future reader will use to diagnose. Fix this →
Slightly misleading doc comment (timer.rs:231) — // Deadline is either double the first timeout. reads ambiguously. The code is first_deadline + timeout, which is "double the timeout from the original start time". Consider: // Deadline = first_deadline + timeout (which is the original start + 2 * timeout).
TimerType::eager divergence from Go (timer.rs:89-91) — Go's implementation uses strings.Contains(string(t), "eager"), which would automatically pick up any future "eager_*" timer types. Rust uses an explicit matches!(self, Self::EagerDoubleLinear). This is arguably better (type-safe, no string sniffing), but worth flagging as a divergence — if Charon adds another eager-mode timer upstream, the Rust side needs an extra arm.
PROPOSAL_TIMEOUT is private (timer.rs:45) while INC_ROUND_START, INC_ROUND_INCREASE, LINEAR_ROUND_INC are all pub. Go inlines 1500 * time.Millisecond so there's no upstream constant either — but for consistency in Pluto, consider exposing it (or downgrading the others to private if they're not externally referenced yet). Minor.
expect on global feature lock (timer.rs:314-316) — feature_enabled panics on a poisoned global RwLock. The timer methods themselves propagate TimerStatePoisoned for their local mutex but go through feature_enabled which doesn't. For consistency you could lift this into a Result, though existing pluto-featureset callers seem to follow the same panic-on-poison pattern, so this may be fine as-is.
Round 0 is silently accepted by ensure_non_negative_round (timer.rs:378-384) — produces a 750ms timer for Increasing, 0 for Linear, 200ns for the linear-timer non-1 branch. QBFT rounds are 1-indexed, so this may not matter, but worth confirming whether the future consumer ever passes round 0. The Go code has the same behavior.
Optional nit on the negative-round test (timer.rs:647-664) — currently only -4 is exercised. Adding i64::MIN would give confidence that the error formatting path doesn't itself overflow.
feature_enabled called multiple times in get_round_timer_func (timer.rs:292-310) — re-locks the global RwLock each time. Not hot, mirrors Go's pattern, fine. Could be hoisted into locals if you ever profile and care.

Verdict

LGTM modulo the small Error::InvalidRound → Error::DurationOverflow correction (item 1) and the doc-comment polish (item 2). Everything else is style/nit. The parity discipline (preserving the upstream nanos bug intentionally, plus a comment explaining it) is exactly what AGENTS.md asks for.

Note: I was unable to run cargo fmt --check / cargo clippy / cargo test locally (cargo commands require approval in this environment). Please verify the workspace gates pass in CI.
· iamquang95/core-consensus-timer

iamquang95 · 2026-05-22T10:11:49Z

All good, no need to fix any findings

mskrzypkows

LGTM

varex83

LGTM

varex83agent

Overall

Clean, well-documented port of Charon's consensus/timer package. The cancellable Tokio sleep model is a nice Rust-native fit, parity with the Go reference (including the deliberate preservation of Charon's nanoseconds-vs-milliseconds quirk in linear_timer_timeout) is faithful, and tests are well-structured around tokio's paused clock plus a serialised featureset guard. No bug or major severity issues. Findings below are mostly cleanup/consistency suggestions:

Error model — TimerStatePoisoned is reachable only on programmer error and is inconsistent with how GLOBAL_STATE poisoning is handled (.expect); duration_from_nanos reports InvalidRound for what is really a DurationOverflow; DurationOverflow/DeadlineOverflow carry identical payloads and could merge.
Naming — linear_round_timeout (used by EagerDoubleLinear) vs linear_timer_timeout (used by LinearRoundTimer) differ only by one word but compute very different things; with_duty shadows the workspace's builder-mutator convention; the trait method RoundTimer::timer collides with locals named timer.
API surface — #[derive(Default)] would replace three hand-written Default impls that just call new(); Clone on IncreasingRoundTimer/LinearRoundTimer is never used through the trait object.
Tests — missing coverage for round = 0, round = i64::MAX, the doubling-with-ProposalTimeout combination, and FEATURESET_TEST_LOCK cascades on test-panic poisoning.

Verdict: COMMENT — leave to author discretion. None of the findings block merge; the major ones are worth addressing in this PR, the rest are polish.

varex83agent · 2026-05-22T13:28:22Z

+        .ok_or(Error::DurationOverflow { round })
+}
+
+fn linear_timer_timeout(round: i64) -> Result<Duration> {


Confusingly similar helper names: linear_round_timeout (line 345, used by EagerDoubleLinearRoundTimer::timer) and linear_timer_timeout (line 354, used by LinearRoundTimer::timer) differ by one word but compute very different things (LINEAR_ROUND_INC * round vs the 200*(round-1) + 200 nanoseconds formula). A future refactor could easily swap them at the two call sites (lines 223 and 284) with no compile error. Suggest renaming the second to something like linear_subsequent_round_timeout or moving it onto LinearRoundTimer as an inherent method. Charon avoided this by inlining the expression directly inside linearRoundTimer.Timer.

varex83agent · 2026-05-22T13:28:22Z

+        let mut first_deadlines = self
+            .first_deadlines
+            .lock()
+            .map_err(|_| Error::TimerStatePoisoned)?;


Inconsistent poison handling: this line maps the per-instance Mutex poison to Error::TimerStatePoisoned, but feature_enabled (line 315) — also called inside timer() via proposal_timeout_duration — .expects on GLOBAL_STATE poisoning. So one poisoned-lock path returns a recoverable error while the other panics, inside the same function. Per workspace convention (see crates/dkg/src/sync/client.rs:247,254), poisoned-lock = .expect. Suggest dropping TimerStatePoisoned (line 68-70) and using .expect("timer state lock poisoned") here for consistency; the critical section is too small to plausibly poison the lock anyway.

varex83agent · 2026-05-22T13:28:22Z

+}
+
+fn duration_from_nanos(timeout_nanos: i64, round: i64) -> Result<Duration> {
+    let timeout_nanos = u64::try_from(timeout_nanos).map_err(|_| Error::InvalidRound { round })?;


Wrong error variant. By the time duration_from_nanos is reached, ensure_non_negative_round(round)? (line 355) has already proven round >= 0. The u64::try_from(timeout_nanos) conversion here can only fail if the computed timeout_nanos is negative — which is an arithmetic-overflow condition, not an invalid-round one. Should be Error::DurationOverflow { round }. (Note: today this branch is actually unreachable for round >= 0 since (round-1)*200 + 200 >= 0, but using the wrong variant if it ever becomes reachable would be misleading.)

varex83agent · 2026-05-22T13:28:22Z

+    },
+    /// Timer duration arithmetic overflowed.
+    #[error("timer duration overflow for round: {round}")]
+    DurationOverflow {


DurationOverflow (line 58) and DeadlineOverflow (line 64) both carry only round: i64 and originate from interchangeable arithmetic paths — DurationOverflow from round arithmetic (Duration::checked_mul/checked_add/u32::try_from), DeadlineOverflow from Instant::checked_add (line 402). From a caller's perspective the failure is the same: "timer arithmetic overflowed for round N". Consider collapsing into a single Overflow { round } (or TimerArithmeticOverflow { round }).

varex83agent · 2026-05-22T13:28:22Z

+    }
+}
+
+impl Default for IncreasingRoundTimer {


All three timer structs have a manual Default impl (lines 146, 209, 269) that just calls Self::new(), where new() itself produces duty: None + (for EagerDoubleLinear) an empty HashMap. The whole boilerplate can be replaced with #[derive(Default)] on each struct — Option, Mutex<HashMap<_,_>>, and HashMap all implement Default. Either drop new() in favour of T::default() / T::with_duty(...), or keep new() and derive Default to remove the three manual impls.

varex83agent · 2026-05-22T13:28:23Z

+#[derive(Debug)]
+pub struct EagerDoubleLinearRoundTimer {
+    duty: Option<Duty>,
+    first_deadlines: Mutex<HashMap<i64, Instant>>,


std::sync::Mutex is the right choice here — the critical section (lines 226-240) is short, synchronous, and the guard is dropped before the returned future is constructed. A future contributor reviewing async code might be tempted to "fix" this to tokio::sync::Mutex, which would be both unnecessary and would force timer() to become async. A one-line comment on the field — e.g. // std::sync::Mutex: critical section is bounded and no guard escapes this function — would prevent that.

Also worth a brief note that first_deadlines grows monotonically with rounds (matching Charon's firstDeadlines map[int64]time.Time — same parity-preserved property) so a future reader doesn't add eviction logic prematurely.

varex83agent · 2026-05-22T13:28:23Z

+    }
+
+    /// Creates an increasing round timer for a duty.
+    pub fn with_duty(duty: Duty) -> Self {


with_duty shadows the workspace's with_* convention, which is overwhelmingly a consuming builder mutator (e.g. with_timeout, with_relays, with_external_ip in crates/p2p, crates/peerinfo, crates/parsigex). The three with_duty constructors here are alternative-new associated functions, not mutators on an existing instance. Rust API Guidelines (C-CTOR) suggest naming after the input or distinguishing role. Consider IncreasingRoundTimer::for_duty(duty) (and same for the other two) to avoid clashing with workspace conventions.

varex83agent · 2026-05-22T13:28:23Z

+        }
+    }
+
+    fn wake_duration(duration: Duration) -> Duration {


nit: wake_duration clamps any duration to a 1ms minimum. The reason — that tokio's paused clock cannot reliably wake a sub-microsecond sleep_until — isn't obvious from the function name or body. A one-line doc comment explaining the paused-time edge case (and why the helper exists for the linear_round_timer test with its 400/600/800-nanosecond timeouts) would help future readers.

varex83agent · 2026-05-22T13:28:23Z

+    /// Round produced a negative duration.
+    #[error("invalid consensus round: {round}")]
+    InvalidRound {
+        /// Invalid round.


nit: the round: i64 field-level doc comments on each Error variant (lines 53, 59, 65) restate the variant title and add no information — the variant-level #[error("...round: {round}")] already conveys what round means. Either tighten to a single-token description ("The offending round number.") or accept the duplication.

varex83agent · 2026-05-22T13:28:23Z

+    ///
+    /// The caller owns cancellation by dropping the returned future before it
+    /// fires.
+    fn timer(&self, round: i64) -> Result<RoundTimerFuture>;


nit: the trait method RoundTimer::timer shares its name with both the type and common local variable names — call sites read let timer = ...; timer.timer(round). Charon also uses Timer for the method, so this preserves a 1:1 mapping with upstream (which has value for porters), but a Rust-idiomatic alternative like start(round) or deadline(round) would read more clearly. Up to author discretion.

iamquang95 added 4 commits May 22, 2026 14:48

feature(core): implement timer

141ea58

test: add more tests on timer

3ecd671

fix: remove unused traits

71892c8

fix: implement default for Timers

a368820

iamquang95 requested review from emlautarom1, mskrzypkows, therustmonk and varex83 May 22, 2026 09:57

iamquang95 changed the title ~~Iamquang95/core consensus timer~~ feat(core): implement consensus/timer May 22, 2026

mskrzypkows approved these changes May 22, 2026

View reviewed changes

varex83 approved these changes May 22, 2026

View reviewed changes

varex83agent reviewed May 22, 2026

View reviewed changes

Conversation

iamquang95 commented May 22, 2026

Uh oh!

iamquang95 commented May 22, 2026

Uh oh!

claude Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: feat(core): implement consensus/timer

What looks good

Suggestions

Verdict

Uh oh!

iamquang95 commented May 22, 2026

Uh oh!

mskrzypkows left a comment

Choose a reason for hiding this comment

Uh oh!

varex83 left a comment

Choose a reason for hiding this comment

Uh oh!

varex83agent left a comment

Choose a reason for hiding this comment

Overall

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

claude Bot commented May 22, 2026 •

edited

Loading

PR Review: `feat(core): implement consensus/timer`