tool: forest-dev export-tipset-lookup#6827

Open
hanabi1224 wants to merge 2 commits into main from hm/export-tipset-lookup-tool

@hanabi1224 hanabi1224 commented Mar 31, 2026

Summary of changes

This PR implements an internal tool that exports the epoch -> tipset key lookup AMT to a ForestCAR file.

skip-length=1

❯ forest-dev export-tipset-lookup --chain calibnet --output amt_calibnet.forest.car.zst
2026-03-31T11:51:29.956884Z  INFO forest::daemon::db_util: Loaded 1 CARs
2026-03-31T11:51:29.959746Z  INFO forest::genesis: Initialized genesis: bafy2bzacecyaggy24wol5ruvs6qm73gjibs2l2iyhcqmvi7r7a4ph7zx3yqd4
Exported tipset lookup AMT with root CID: bafy2bzacedjd6dfpcajymv2bqqfhziucp2q2k22mkxmskmddndrjfnrxbpdzy, len: 512514, size: 450 MiB

❯ du -hs amt_calibnet.forest.car.zst
375M    amt_calibnet.forest.car.zst

❯ forest-dev export-tipset-lookup --chain mainnet --output amt_mainnet.forest.car.zst
2026-03-31T11:55:34.153393Z  INFO forest::daemon::db_util: Loaded 1 CARs
2026-03-31T11:55:34.158410Z  INFO forest::genesis: Initialized genesis: bafy2bzacecnamqgqmifpluoeldx7zzglxcljo6oja4vrmtj7432rphldpdmm2
Exported tipset lookup AMT with root CID: bafy2bzaced466zkzi2g6ngk5vy2ahoaryhritxoxsicmkswoytaa5ewrwassu, len: 841106, size: 1.21 GiB

❯ du -hs amt_mainnet.forest.car.zst
1006M   amt_mainnet.forest.car.zst

skip-length=10

❯ forest-dev export-tipset-lookup --chain calibnet --skip-length 10 --output amt_calibnet.forest.car.zst
2026-03-31T12:31:51.517521Z  INFO forest::daemon::db_util: Loaded 1 CARs
2026-03-31T12:31:51.524837Z  INFO forest::genesis: Initialized genesis: bafy2bzacecyaggy24wol5ruvs6qm73gjibs2l2iyhcqmvi7r7a4ph7zx3yqd4
Exported tipset lookup AMT with root CID: bafy2bzacebqgoqknbpwzvryoiev6ewasncns6klv5res7s3xzerehz2i5lshm, len: 417171, size: 75 MiB, took 2m 36s 861ms 907us 797ns

❯ du -hs amt_calibnet.forest.car.zst
72M     amt_calibnet.forest.car.zst

❯ forest-dev export-tipset-lookup --chain mainnet --skip-length 10 --output amt_mainnet.forest.car.zst
2026-03-31T12:35:52.423693Z  INFO forest::daemon::db_util: Loaded 1 CARs
2026-03-31T12:35:52.430963Z  INFO forest::genesis: Initialized genesis: bafy2bzacecnamqgqmifpluoeldx7zzglxcljo6oja4vrmtj7432rphldpdmm2
Exported tipset lookup AMT with root CID: bafy2bzacecc4ib5rpjifggs4tb6dx4r5r3evq2iiwu2con5m3cqteddahejc6, len: 689423, size: 173.3 MiB, took 8m 45s 768ms 244us 933ns

❯ du -hs amt_mainnet.forest.car.zst
158M    amt_mainnet.forest.car.zst

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • New Features
    • New CLI command to export tipset-epoch → tipset-key lookup tables into a CAR file, with range and stride options and export statistics.
    • Export API now accepts custom root lists for flexible CAR exports.
    • Added blockstore size reporting for quick database introspection.

coderabbitai bot commented Mar 31, 2026

Walkthrough

Added a new CLI subcommand to export tipset-epoch → tipset-key mappings as a ForestCAR, extended MemoryDB CAR export to accept explicit roots, and refactored the existing export to delegate to the new root-parameterized export method.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Database API Enhancement: src/db/memory.rs | Added pub fn blockstore_len(&self) -> usize; added export_forest_car_with_roots(..., roots: NonEmpty<Cid>, writer: &mut W); refactored export_forest_car(...) to compute HEAD roots and delegate to the new method. |
| New CLI Subcommand: src/dev/subcommands/export_tipset_lookup_cmd.rs, src/dev/subcommands/mod.rs | Added ExportTipsetLookupCommand and Subcommand::ExportTipsetLookup variant. Command opens DB (or uses provided db root), builds ChainStore, iterates tipsets with optional from/to/skip filters, constructs and flushes an AMT mapping epoch→tipset-key, then exports the AMT as a ForestCAR via export_forest_car_with_roots. |

Sequence Diagram

sequenceDiagram
    participant CLI as CLI User
    participant Cmd as ExportTipsetLookupCommand
    participant DB as Database
    participant Store as ChainStore
    participant AMT as AMT (In-Memory)
    participant CAR as ForestCAR Exporter
    participant File as Output File

    CLI->>Cmd: run(chain, db?, from?, to?, skip?, output)
    Cmd->>DB: open_db(db_root)
    DB-->>Cmd: db instance
    Cmd->>Store: ChainStore::new(db, genesis)
    Store-->>Cmd: store ready
    Cmd->>Store: heaviest_tipset()
    Store-->>Cmd: tipset

    loop iterate tipsets backward
        Cmd->>Store: parent_tipset(current)
        Store-->>Cmd: parent tipset
        Cmd->>Cmd: filter by epoch (from/to/skip)
        alt include
            Cmd->>AMT: insert(epoch -> tipset_key)
        end
    end

    Cmd->>AMT: flush()
    AMT-->>Cmd: root CID
    Cmd->>File: create output file
    Cmd->>CAR: export_forest_car_with_roots(rootCID, writer)
    CAR->>File: write CAR blocks
    File-->>CAR: write complete
    CAR-->>Cmd: success
    Cmd-->>CLI: print stats & exit
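
The loop in the diagram can be sketched with standard-library types only. This is an illustration, not the PR's code: `Tipset`, `build_lookup`, and the `BTreeMap` standing in for the AMT are hypothetical, and the rule that an epoch is kept when it is a multiple of the skip length is an assumption made for the example; the actual filter in `export_tipset_lookup_cmd.rs` may differ.

```rust
use std::collections::BTreeMap;
use std::num::NonZeroU64;

/// Hypothetical stand-in for a tipset: an epoch plus an opaque key.
#[derive(Clone, Debug, PartialEq)]
struct Tipset {
    epoch: i64,
    key: String,
}

/// Walks `chain` from the head toward genesis and collects `epoch -> key`
/// for every kept epoch in the inclusive range `to..=from`.
///
/// `chain` must be ordered head first (descending epoch).
/// ASSUMPTION: an epoch is kept when `epoch % skip_length == 0`.
fn build_lookup(
    chain: &[Tipset],
    from: Option<i64>,
    to: Option<i64>,
    skip_length: NonZeroU64,
) -> BTreeMap<i64, String> {
    let skip = skip_length.get() as i64;
    let to = to.unwrap_or(0); // default end epoch: genesis
    let mut lookup = BTreeMap::new();
    for ts in chain {
        // Still above the requested start epoch: keep walking down.
        if from.is_some_and(|f| ts.epoch > f) {
            continue;
        }
        // Below the end epoch: nothing further down can match.
        if ts.epoch < to {
            break;
        }
        if ts.epoch % skip == 0 {
            lookup.insert(ts.epoch, ts.key.clone());
        }
    }
    lookup
}
```

In this sketch an epoch with no tipset in `chain` simply has no entry in the map.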

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
  • sudo-shashank
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'tool: forest-dev export-tipset-lookup' is concise and clearly describes the primary change: implementing a new CLI tool subcommand for exporting tipset lookup data, which aligns with the changeset adding the export-tipset-lookup command. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@hanabi1224 hanabi1224 marked this pull request as ready for review March 31, 2026 12:07
@hanabi1224 hanabi1224 requested a review from a team as a code owner March 31, 2026 12:07
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team March 31, 2026 12:07

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (4)
src/db/memory.rs (2)

28-30: Consider adding a doc comment for the new public method.

The blockstore_len method is public but lacks documentation. Per coding guidelines, public functions should have doc comments.

📝 Suggested doc comment
 impl MemoryDB {
+    /// Returns the total number of CIDs stored across both the in-memory
+    /// and persistent blockchain databases.
     pub fn blockstore_len(&self) -> usize {
         self.blockchain_db.read().len() + self.blockchain_persistent_db.read().len()
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/db/memory.rs` around lines 28 - 30, The public method blockstore_len is
missing a doc comment; add a concise Rust doc comment (///) above pub fn
blockstore_len(&self) -> usize describing what it returns (total number of
blocks/entries by summing self.blockchain_db.read().len() and
self.blockchain_persistent_db.read().len()) and any semantics (e.g., snapshot
consistency or that it sums in-memory and persistent stores), referencing the
involved fields blockchain_db and blockchain_persistent_db for clarity.
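
To make the semantics that such a doc comment would have to describe concrete, here is a minimal std-only sketch. `TwoTierStore` is a hypothetical stand-in for `MemoryDB`: it uses `std::sync::RwLock` and string keys, which are assumptions for illustration rather than the types used in `src/db/memory.rs`.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical, simplified stand-in for `MemoryDB`: two block maps behind locks.
#[derive(Default)]
struct TwoTierStore {
    blockchain_db: RwLock<HashMap<String, Vec<u8>>>,
    blockchain_persistent_db: RwLock<HashMap<String, Vec<u8>>>,
}

impl TwoTierStore {
    /// Returns the number of entries across both maps.
    ///
    /// The read locks are taken one after the other, so the sum is not an
    /// atomic snapshot, and a key present in both maps is counted twice.
    fn blockstore_len(&self) -> usize {
        let volatile = self.blockchain_db.read().expect("lock poisoned").len();
        let persistent = self
            .blockchain_persistent_db
            .read()
            .expect("lock poisoned")
            .len();
        volatile + persistent
    }
}
```

Whether double counting can occur in the real type depends on how the two maps are populated, which this sketch does not model.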

52-76: Consider adding a doc comment for the new public method.

The export_forest_car_with_roots method is a new public API entry point. Per coding guidelines, public functions should have doc comments explaining its purpose and parameters.

📝 Suggested doc comment
+    /// Exports the blockchain data to a ForestCAR file using the provided root CIDs.
+    ///
+    /// Unlike [`export_forest_car`], this method allows the caller to specify
+    /// explicit root CIDs rather than deriving them from the chain head.
     pub async fn export_forest_car_with_roots<W: tokio::io::AsyncWrite + Unpin>(
         &self,
         roots: NonEmpty<Cid>,
         writer: &mut W,
     ) -> anyhow::Result<()> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/db/memory.rs` around lines 52 - 76, Add a doc comment for the public
async function export_forest_car_with_roots explaining its purpose (exporting a
deterministic CAR stream containing blocks from in-memory DBs), the parameters
(roots: NonEmpty<Cid> as the CAR roots, writer: AsyncWrite target for the CAR),
what it returns (anyhow::Result<()> on success/failure), and any important
behavior (merges blockchain_db and blockchain_persistent_db, sorts blocks to
make output deterministic, uses forest::Encoder to compress and write). Place
the doc comment immediately above pub async fn export_forest_car_with_roots so
it’s visible in generated docs and IDE tooltips.
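
The delegation pattern described here (a root-parameterized export plus a thin wrapper that supplies the head roots) might look roughly like the following. Everything in it is hypothetical: `ToyStore`, the line-based output, the synchronous `std::io::Write` writer, and the runtime emptiness check are simplifications; the PR uses an async writer, `NonEmpty<Cid>`, and the ForestCAR encoder.

```rust
use std::collections::HashMap;
use std::io::{self, Write};

/// Hypothetical toy store; string keys stand in for CIDs.
struct ToyStore {
    head: Vec<String>,
    blocks: HashMap<String, Vec<u8>>,
}

impl ToyStore {
    /// Writes a toy archive: one roots line, then one line per block,
    /// using caller-supplied roots.
    fn export_with_roots<W: Write>(&self, roots: &[String], writer: &mut W) -> io::Result<()> {
        if roots.is_empty() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "at least one root is required",
            ));
        }
        writeln!(writer, "roots:{}", roots.join(","))?;
        // Sort keys so the output does not depend on HashMap iteration order.
        let mut keys: Vec<&String> = self.blocks.keys().collect();
        keys.sort();
        for key in keys {
            writeln!(writer, "{key}:{}", self.blocks[key].len())?;
        }
        Ok(())
    }

    /// Original entry point: derives the roots from the head and delegates.
    fn export<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        self.export_with_roots(&self.head, writer)
    }
}
```

Keeping the block-writing logic in one method means the head-based export and the AMT export cannot drift apart in ordering or encoding.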
src/dev/subcommands/mod.rs (1)

53-53: Add a doc comment for consistency with other variants.

Other variants like UpdateCheckpoints and ArchiveMissing have doc comments. Consider adding one for ExportTipsetLookup to maintain consistency.

📝 Suggested doc comment
     /// Find missing archival snapshots on the Forest Archive for a given epoch range
     ArchiveMissing(archive_missing_cmd::ArchiveMissingCommand),
+    /// Export epoch-to-tipset-key lookup AMT as a ForestCAR file
     ExportTipsetLookup(export_tipset_lookup_cmd::ExportTipsetLookupCommand),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/dev/subcommands/mod.rs` at line 53, Add a doc comment above the enum
variant ExportTipsetLookup(export_tipset_lookup_cmd::ExportTipsetLookupCommand)
to match the style used by other variants (e.g., UpdateCheckpoints,
ArchiveMissing); ensure the comment briefly describes the purpose of
ExportTipsetLookup and uses the same comment format and tone as the surrounding
variants for consistency.
src/dev/subcommands/export_tipset_lookup_cmd.rs (1)

98-103: Add error context to file creation.

Per coding guidelines, errors should have context. If file creation fails, the error message should indicate which file couldn't be created.

🔧 Suggested fix
+        use anyhow::Context as _;
         amt_db
             .export_forest_car_with_roots(
                 nunny::vec![root],
-                &mut tokio::fs::File::create(output).await?,
+                &mut tokio::fs::File::create(&output)
+                    .await
+                    .with_context(|| format!("failed to create output file: {}", output.display()))?,
             )
             .await?;

As per coding guidelines: "Add context with .context() when errors occur".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/dev/subcommands/export_tipset_lookup_cmd.rs` around lines 98 - 103, The
file creation call passed to amt_db.export_forest_car_with_roots lacks error
context; wrap the tokio::fs::File::create(&output).await result with a Context
(e.g., anyhow::Context) so failures include which output file couldn't be
created, then pass that file handle into amt_db.export_forest_car_with_roots
(i.e., replace the raw .await? with .await.with_context(|| format!("failed to
create output file: {}", output.display()))? and keep the rest of the call
unchanged).
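
For readers unfamiliar with the pattern, the same idea can be shown without `anyhow` or `tokio`. The sketch below is a std-only analogue, not the suggested fix itself: `create_output` is a hypothetical helper, and it uses the blocking `std::fs::File` instead of the async file API.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Creates `path` and, on failure, returns an error that names the path.
/// Std-only analogue of attaching context with `anyhow`.
fn create_output(path: &Path) -> io::Result<File> {
    File::create(path).map_err(|e| {
        // `Path` does not implement `Display`, so format it via `display()`.
        io::Error::new(
            e.kind(),
            format!("failed to create output file: {}: {e}", path.display()),
        )
    })
}
```

Without the added context, a failure here surfaces only as the bare OS error (for example "No such file or directory"), with no indication of which path was involved.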

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2a756ab0-ab53-47e1-9798-c5fe85a96c78

📥 Commits

Reviewing files that changed from the base of the PR and between e075a77 and 3ea270a.

📒 Files selected for processing (3)
  • src/db/memory.rs
  • src/dev/subcommands/export_tipset_lookup_cmd.rs
  • src/dev/subcommands/mod.rs

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 10.44776% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.80%. Comparing base (e075a77) to head (a52825d).
⚠️ Report is 1 commit behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/dev/subcommands/export_tipset_lookup_cmd.rs | 0.00% | 56 Missing ⚠️ |
| src/db/memory.rs | 70.00% | 3 Missing ⚠️ |
| src/dev/subcommands/mod.rs | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/dev/subcommands/mod.rs | 23.88% <0.00%> (-0.37%) ⬇️ |
| src/db/memory.rs | 88.81% <70.00%> (-1.42%) ⬇️ |
| src/dev/subcommands/export_tipset_lookup_cmd.rs | 0.00% <0.00%> (ø) |

... and 8 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e075a77...a52825d.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/dev/subcommands/export_tipset_lookup_cmd.rs (1)

33-38: Consider validating that from >= to to prevent silent empty exports.

Since the chain iteration proceeds from higher epochs (head) down to lower epochs (genesis), from must be ≥ to for the range to make sense. If a user mistakenly provides --from 100 --to 200, the loop will break almost immediately (when it reaches an epoch < 200), producing an empty or near-empty AMT with no warning.

💡 Suggested validation at the start of run()
         let skip_length = skip_length.get() as i64;
+        if let (Some(from), Some(to)) = (from, to) {
+            anyhow::ensure!(
+                from >= to,
+                "--from ({from}) must be >= --to ({to}) since epochs are processed in descending order"
+            );
+        }
         let db_root_path = if let Some(db) = db {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/dev/subcommands/export_tipset_lookup_cmd.rs` around lines 33 - 38,
Validate the epoch range at the start of ExportTipsetLookupCommand::run(): if
both `from` and `to` are Some, check that from >= to; if not, return an error
(or print a clear message and exit) so the command fails fast instead of
producing a silent empty export. Update the run() entry validation to reference
the struct fields `from` and `to` (both Option<ChainEpoch>) and perform the
comparison before any chain iteration begins.
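
A standalone version of this check, which can be unit-tested without a database, might look like the sketch below. `validate_range` is a hypothetical helper, and `i64` stands in for `ChainEpoch`; the suggested diff uses `anyhow::ensure!` inline instead.

```rust
/// Rejects a range whose start epoch lies below its end epoch.
/// Tipsets are walked from the head downward, so `from` is the upper bound.
fn validate_range(from: Option<i64>, to: Option<i64>) -> Result<(), String> {
    if let (Some(from), Some(to)) = (from, to) {
        if from < to {
            return Err(format!(
                "--from ({from}) must be >= --to ({to}) since epochs are processed in descending order"
            ));
        }
    }
    // A missing bound means "no restriction on that side", so it always passes.
    Ok(())
}
```

With this in place, `--from 100 --to 200` fails immediately with a message naming both values rather than writing an empty export.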

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ef696737-c7b6-49a1-af62-d7c211932976

📥 Commits

Reviewing files that changed from the base of the PR and between 3ea270a and a52825d.

📒 Files selected for processing (1)
  • src/dev/subcommands/export_tipset_lookup_cmd.rs

Comment on lines +36 to +38
/// End epoch (inclusive). Defaults to 0 (genesis)
#[arg(long)]
to: Option<ChainEpoch>,

Why Option if we have a sane default?

What's the mapping in case of null tipsets?
