feat(core): implement consensus/timer#433
Conversation
|
@claude review this PR |
|
Claude finished @iamquang95's task in 4m 43s —— View job PR Review:
|
|
All good, no need to fix any findings |
varex83agent
left a comment
There was a problem hiding this comment.
Overall
Clean, well-documented port of Charon's consensus/timer package. The cancellable Tokio sleep model is a nice Rust-native fit, parity with the Go reference (including the deliberate preservation of Charon's nanoseconds-vs-milliseconds quirk in linear_timer_timeout) is faithful, and tests are well-structured around tokio's paused clock plus a serialised featureset guard. No bug or major severity issues. Findings below are mostly cleanup/consistency suggestions:
- Error model —
TimerStatePoisonedis reachable only on programmer error and is inconsistent with howGLOBAL_STATEpoisoning is handled (.expect);duration_from_nanosreportsInvalidRoundfor what is really aDurationOverflow;DurationOverflow/DeadlineOverflowcarry identical payloads and could merge. - Naming —
linear_round_timeout(used byEagerDoubleLinear) vslinear_timer_timeout(used byLinearRoundTimer) differ only by one word but compute very different things;with_dutyshadows the workspace's builder-mutator convention; the trait methodRoundTimer::timercollides with locals namedtimer. - API surface —
#[derive(Default)]would replace three hand-writtenDefaultimpls that just callnew();CloneonIncreasingRoundTimer/LinearRoundTimeris never used through the trait object. - Tests — missing coverage for
round = 0,round = i64::MAX, the doubling-with-ProposalTimeout combination, andFEATURESET_TEST_LOCKcascades on test-panic poisoning.
Verdict: COMMENT — leave to author discretion. None of the findings block merge; the major ones are worth addressing in this PR, the rest are polish.
| .ok_or(Error::DurationOverflow { round }) | ||
| } | ||
|
|
||
| fn linear_timer_timeout(round: i64) -> Result<Duration> { |
There was a problem hiding this comment.
Confusingly similar helper names: linear_round_timeout (line 345, used by EagerDoubleLinearRoundTimer::timer) and linear_timer_timeout (line 354, used by LinearRoundTimer::timer) differ by one word but compute very different things (LINEAR_ROUND_INC * round vs the 200*(round-1) + 200 nanoseconds formula). A future refactor could easily swap them at the two call sites (lines 223 and 284) with no compile error. Suggest renaming the second to something like linear_subsequent_round_timeout or moving it onto LinearRoundTimer as an inherent method. Charon avoided this by inlining the expression directly inside linearRoundTimer.Timer.
| let mut first_deadlines = self | ||
| .first_deadlines | ||
| .lock() | ||
| .map_err(|_| Error::TimerStatePoisoned)?; |
There was a problem hiding this comment.
Inconsistent poison handling: this line maps the per-instance Mutex poison to Error::TimerStatePoisoned, but feature_enabled (line 315) — also called inside timer() via proposal_timeout_duration — .expects on GLOBAL_STATE poisoning. So one poisoned-lock path returns a recoverable error while the other panics, inside the same function. Per workspace convention (see crates/dkg/src/sync/client.rs:247,254), poisoned-lock = .expect. Suggest dropping TimerStatePoisoned (line 68-70) and using .expect("timer state lock poisoned") here for consistency; the critical section is too small to plausibly poison the lock anyway.
| } | ||
|
|
||
| fn duration_from_nanos(timeout_nanos: i64, round: i64) -> Result<Duration> { | ||
| let timeout_nanos = u64::try_from(timeout_nanos).map_err(|_| Error::InvalidRound { round })?; |
There was a problem hiding this comment.
Wrong error variant. By the time duration_from_nanos is reached, ensure_non_negative_round(round)? (line 355) has already proven round >= 0. The u64::try_from(timeout_nanos) conversion here can only fail if the computed timeout_nanos is negative — which is an arithmetic-overflow condition, not an invalid-round one. Should be Error::DurationOverflow { round }. (Note: today this branch is actually unreachable for round >= 0 since (round-1)*200 + 200 >= 0, but using the wrong variant if it ever becomes reachable would be misleading.)
| }, | ||
| /// Timer duration arithmetic overflowed. | ||
| #[error("timer duration overflow for round: {round}")] | ||
| DurationOverflow { |
There was a problem hiding this comment.
DurationOverflow (line 58) and DeadlineOverflow (line 64) both carry only round: i64 and originate from interchangeable arithmetic paths — DurationOverflow from round arithmetic (Duration::checked_mul/checked_add/u32::try_from), DeadlineOverflow from Instant::checked_add (line 402). From a caller's perspective the failure is the same: "timer arithmetic overflowed for round N". Consider collapsing into a single Overflow { round } (or TimerArithmeticOverflow { round }).
| } | ||
| } | ||
|
|
||
| impl Default for IncreasingRoundTimer { |
There was a problem hiding this comment.
All three timer structs have a manual Default impl (lines 146, 209, 269) that just calls Self::new(), where new() itself produces duty: None + (for EagerDoubleLinear) an empty HashMap. The whole boilerplate can be replaced with #[derive(Default)] on each struct — Option, Mutex<HashMap<_,_>>, and HashMap all implement Default. Either drop new() in favour of T::default() / T::with_duty(...), or keep new() and derive Default to remove the three manual impls.
| #[derive(Debug)] | ||
| pub struct EagerDoubleLinearRoundTimer { | ||
| duty: Option<Duty>, | ||
| first_deadlines: Mutex<HashMap<i64, Instant>>, |
There was a problem hiding this comment.
std::sync::Mutex is the right choice here — the critical section (lines 226-240) is short, synchronous, and the guard is dropped before the returned future is constructed. A future contributor reviewing async code might be tempted to "fix" this to tokio::sync::Mutex, which would be both unnecessary and would force timer() to become async. A one-line comment on the field — e.g. // std::sync::Mutex: critical section is bounded and no guard escapes this function — would prevent that.
Also worth a brief note that first_deadlines grows monotonically with rounds (matching Charon's firstDeadlines map[int64]time.Time — same parity-preserved property) so a future reader doesn't add eviction logic prematurely.
| } | ||
|
|
||
| /// Creates an increasing round timer for a duty. | ||
| pub fn with_duty(duty: Duty) -> Self { |
There was a problem hiding this comment.
with_duty shadows the workspace's with_* convention, which is overwhelmingly a consuming builder mutator (e.g. with_timeout, with_relays, with_external_ip in crates/p2p, crates/peerinfo, crates/parsigex). The three with_duty constructors here are alternative-new associated functions, not mutators on an existing instance. Rust API Guidelines (C-CTOR) suggest naming after the input or distinguishing role. Consider IncreasingRoundTimer::for_duty(duty) (and same for the other two) to avoid clashing with workspace conventions.
| } | ||
| } | ||
|
|
||
| fn wake_duration(duration: Duration) -> Duration { |
There was a problem hiding this comment.
nit: wake_duration clamps any duration to a 1ms minimum. The reason — that tokio's paused clock cannot reliably wake a sub-microsecond sleep_until — isn't obvious from the function name or body. A one-line doc comment explaining the paused-time edge case (and why the helper exists for the linear_round_timer test with its 400/600/800-nanosecond timeouts) would help future readers.
| /// Round produced a negative duration. | ||
| #[error("invalid consensus round: {round}")] | ||
| InvalidRound { | ||
| /// Invalid round. |
There was a problem hiding this comment.
nit: the round: i64 field-level doc comments on each Error variant (lines 53, 59, 65) restate the variant title and add no information — the variant-level #[error("...round: {round}")] already conveys what round means. Either tighten to a single-token description ("The offending round number.") or accept the duplication.
| /// | ||
| /// The caller owns cancellation by dropping the returned future before it | ||
| /// fires. | ||
| fn timer(&self, round: i64) -> Result<RoundTimerFuture>; |
There was a problem hiding this comment.
nit: the trait method RoundTimer::timer shares its name with both the type and common local variable names — call sites read let timer = ...; timer.timer(round). Charon also uses Timer for the method, so this preserves a 1:1 mapping with upstream (which has value for porters), but a Rust-idiomatic alternative like start(round) or deadline(round) would read more clearly. Up to author discretion.
Fix: #158
This PR: