chore: compressed repr. should store disc as part of data #2282

Open

SwenSchaeferjohann wants to merge 3 commits into main from swen/fix-disc-for-accountinterface

Conversation

@SwenSchaeferjohann (Contributor) commented Feb 12, 2026

  • The indexer has to store the account data exactly as it is on-chain in the Account field of AccountInterface; this includes the discriminator.
  • The indexer cannot stitch the data together if it does not know the length of the discriminator.
  • We should support variable-length discriminators, e.g. for pinocchio programs.

Summary by CodeRabbit

  • Bug Fixes

    • Ensured consistent account data construction and hashing so SDK-produced values match on-chain layout and verification.
  • Documentation

    • Clarified that stored account data contains the full on-chain bytes as-is, reducing ambiguity for integrators.

@coderabbitai (Contributor) bot commented Feb 12, 2026

📝 Walkthrough

The PR changes how compressed/decompressed account bytes are constructed and hashed: account data is now used directly when available, and SHA-256 inputs are prefixed with an 8-byte on-chain discriminator before hashing. No public API signatures were changed.

Changes

Account Data Construction
sdk-libs/client/src/interface/account_interface.rs
The cold path now clones the existing on-chain account bytes (data.data) instead of rebuilding them by prefixing the discriminator and concatenating fields. A doc comment was added clarifying that data.data contains the full on-chain bytes.

Compressed PDA hashing
sdk-libs/sdk-types/src/interface/program/compression/pda.rs
The hash input is now the 8-byte discriminator followed by the borsh-serialized struct bytes.

Decompressed PDA hashing
sdk-libs/sdk-types/src/interface/program/decompression/pda.rs
The same discriminator-prefix change applies to the hash input: the 8-byte discriminator is prepended before hashing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

ai-review

Suggested reviewers

  • ananas-block

Poem

🔷 Eight bytes march before the rest,
Hashes shaped to match the nest,
Bytes copied straight from on-chain land,
Small changes guided by a steady hand. 🎯

🚥 Pre-merge checks: ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title directly describes the main change: storing the discriminator as part of the account data in the compressed representation, which is the core objective across all three modified files.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 70.00%.


@coderabbitai (Contributor) bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sdk-libs/sdk-types/src/interface/program/compression/pda.rs (1)

142-148: ⚠️ Potential issue | 🔴 Critical

Inconsistency in OutAccountInfo.data format between pda.rs and account.rs paths.

The data field format is inconsistent across code paths. In pda.rs (decompression), OutAccountInfo.data includes the 8-byte discriminator prefix (disc + borsh), with data_hash calculated over both. However, in account.rs (modification via to_account_info()), OutAccountInfo.data contains only the borsh serialization without the discriminator prefix, and data_hash is calculated on the account object directly.

Both paths populate the same OutAccountInfo structure, so this format mismatch will break downstream consumers. Additionally, the data_hash calculations differ between paths (pda.rs hashes disc+borsh, account.rs hashes account), which could cause validation failures. Update account.rs to mirror the pda.rs layout: include the discriminator prefix in the data field and recalculate data_hash consistently.

@SwenSchaeferjohann force-pushed the swen/fix-disc-for-accountinterface branch from 6a28d8f to 5bb0505 on February 17, 2026 at 16:33
Comment on lines 97 to 109
// Serialize updated account data back (includes 8-byte discriminator)
{
    let mut data = account_info
        .try_borrow_mut_data()
        .map_err(LightSdkTypesError::AccountError)?;
    // Write discriminator first
    data[..8].copy_from_slice(&A::LIGHT_DISCRIMINATOR);
    // Write serialized account data after discriminator
    let writer = &mut &mut data[8..];
    account_data
        .serialize(writer)
        .map_err(|_| LightSdkTypesError::Borsh)?;
}
Contributor:
why does this exist?

// Hash the data (discriminator NOT included per protocol convention)
let data_bytes = borsh::to_vec(&compressed_data).map_err(|_| LightSdkTypesError::Borsh)?;
// Serialize with disc prefix: disc(8) + borsh(struct) — mirrors on-chain layout.
let borsh_bytes = borsh::to_vec(&compressed_data).map_err(|_| LightSdkTypesError::Borsh)?;
Contributor:

Note for a future PR:
I don't like that we deserialize the complete struct, use CompressionInfo, then serialize everything back. I think it would be cleaner to add a CompressData trait that defines which data is copied from the account info to the compressed account.

@ananas-block (Contributor) commented:

looks ok, we don't have any asserts that check everything compresses and decompresses as expected.

@coderabbitai (Contributor) bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk-libs/client/src/interface/account_interface.rs (2)

68-85: 🛠️ Refactor suggestion | 🟠 Major

Missing round-trip test coverage for the new data layout.

As noted in the review comments, there are no assertions verifying that compress → store → decompress produces correct results with the new disc-prefixed data layout. Given that this change affects hashing (which is security-critical for Merkle tree integrity) and data reconstruction, a round-trip integration test would catch any asymmetry between the compression/decompression paths early.

Consider adding a test that:

  1. Compresses an account (producing disc || borsh data + hash)
  2. Stores it via AccountInterface::cold()
  3. Verifies data() returns the expected disc || borsh bytes
  4. Decompresses and verifies the hash matches
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/interface/account_interface.rs` around lines 68-85: Add
a round-trip integration test that verifies the new disc-prefixed layout: create
an Account and compress it to obtain a CompressedAccount (with disc || borsh
bytes and its hash), call AccountInterface::cold(key, compressed, owner), assert
that the resulting AccountInterface.data() (or the stored Account.account.data
field) equals the expected disc || borsh bytes, then decompress the stored
CompressedAccount and assert the decompressed bytes and recomputed hash equal
the original hash; use the CompressedAccount, AccountInterface::cold,
AccountInterface::data (or Account) and the same compression/decompression
helpers used in production so the test catches any asymmetry between compress
and decompress.

100-104: ⚠️ Potential issue | 🔴 Critical

as_mint() will fail to deserialize cold mints — discriminator must be prepended.

The cold() method extracts only compressed.data.as_ref().map(|d| d.data.clone()), which omits the 8-byte discriminator that's stored separately in CompressedAccountData::discriminator. However, as_mint() tries to deserialize this data directly, expecting the discriminator to be present (as shown in decompress_mint.rs, which explicitly prepends it when constructing full account bytes). Cold accounts will fail deserialization; fix by prepending the discriminator in as_mint():

pub fn as_mint(&self) -> Option<light_token_interface::state::Mint> {
    let ca = self.cold.as_ref()?;
    let cad = ca.data.as_ref()?;
    let mut buf = cad.discriminator.to_vec();
    buf.extend_from_slice(&cad.data);
    borsh::BorshDeserialize::deserialize(&mut buf.as_slice()).ok()
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk-libs/client/src/interface/account_interface.rs` around lines 100-104:
The as_mint() implementation must prepend the 8-byte discriminator from
CompressedAccountData::discriminator when deserializing cold accounts because
cold() currently returns only compressed.data.as_ref().map(|d| d.data.clone());
update as_mint() to fetch self.cold, get ca.data, create a buffer starting with
cad.discriminator (as bytes), extend it with cad.data, and then call
borsh::BorshDeserialize::deserialize on that combined buffer (returning Option
by mapping ok()); ensure you reference as_mint(), cold(), and
CompressedAccountData::discriminator when making the change.
