feature: safetensors and merkle hashing implemented#91

Open
ryoari wants to merge 3 commits into main from safetensors

Conversation

@ryoari

@ryoari ryoari commented Jun 13, 2026

Contributor

Extends #89

TL;DR

Implemented stable model artifact verification with safetensors and Merkle chunking.

Why this matters:

Moves the pipeline from reproducibility (same inputs, same outputs) to attestation (a compact, independently checkable proof that an artifact is what we claim). This is the Merkle sealing primitive in working form.

safetensors is the load-bearing choice: unlike pickle .pt, it's byte-canonical, so the artifact hash and Merkle root are stable across machines and runs. We keep .pt for RNG/optimizer replay, separating what we seal from what we resume.

Merkle chunking over a flat hash gives localized tamper attribution, inclusion proofs, and partial verification, scaling toward sharded checkpoints.
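
To make the inclusion-proof idea concrete, here is a minimal sketch of a Merkle tree over fixed-size chunks. It is an illustration under stated assumptions, not the code in src/artifacts.py: all function names are hypothetical, and the tree convention (duplicating the last node on odd levels, hashing the empty string for an empty file) is assumed rather than taken from the PR.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chunk_hashes(data: bytes, chunk_size: int) -> list[bytes]:
    """Hash each fixed-size chunk; these hashes are the Merkle leaves."""
    return [sha256(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, leaves first, single-root level last."""
    if not leaves:
        leaves = [sha256(b"")]  # assumed convention for an empty artifact
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # assumed convention: duplicate the last node
        levels.append([sha256(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling path."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root
```

A verifier holding only the root can check a single chunk with a proof of logarithmic length instead of rehashing the whole file. A production design would likely also add leaf/node domain separation, which this sketch omits.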

Canonical JSON plus recorded root/chunk-size/chunk-count makes the manifest a self-contained root of trust a verifier can check independently.
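
As a rough illustration of why canonical JSON matters here, the sketch below hashes a manifest so that key order and whitespace cannot change the digest. The function names and the manifest field names are assumptions for the example; the PR's own helper may be structured differently.

```python
import hashlib
import json


def canonical_json_bytes(obj) -> bytes:
    """Serialize with sorted keys and no optional whitespace."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    ).encode("utf-8")


def canonical_json_sha256(obj) -> str:
    return hashlib.sha256(canonical_json_bytes(obj)).hexdigest()


# Hypothetical manifest fields, named after the metadata listed in this PR.
manifest = {"merkle_root": "ab" * 32, "chunk_size": 1048576, "chunk_count": 3}
reordered = {"chunk_count": 3, "chunk_size": 1048576, "merkle_root": "ab" * 32}

# Same content, different insertion order: the digests match.
same = canonical_json_sha256(manifest) == canonical_json_sha256(reordered)
```

Because the serialization is fixed, a verifier on another machine can recompute the manifest hash from the parsed fields alone.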

Consolidating hashing into artifacts.py ensures producer and verifier hash identically; tamper-fail tests confirm the harness actually catches changes rather than passing unconditionally.

This layer assumes deterministic bytes: bitwise GPU reproducibility produces them, and this PR verifies them.

Changes:

  • Added src/artifacts.py for SHA-256 hashing, canonical JSON hashing, model parameter hashing, safetensors save/load, Merkle manifests, and Merkle proof verification.
  • Updated training to emit mid_checkpoint.safetensors plus mid_checkpoint.merkle.json, while keeping .pt checkpoints for RNG/optimizer replay.
  • Updated eval to prefer safetensors, with .pt fallback.
  • Updated global manifest to record model artifact hash, Merkle root, chunk size, and chunk count.
  • Added safetensors dependency and Merkle unit tests.
  • Removed duplicated hashing logic and stale imports.
  • Replaced Unicode runtime output with ASCII-safe text for Windows compatibility.
  • Ignored generated smoke-test artifacts.
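
The safetensors-first loading with .pt fallback listed above could look roughly like this. It is a sketch only: `choose_checkpoint` is a hypothetical helper, and it only selects the file and reports the source label; the actual tensor loading in src/eval.py is not reproduced here.

```python
from pathlib import Path


def choose_checkpoint(directory: Path) -> tuple[str, Path]:
    """Prefer the sealed safetensors artifact; fall back to the .pt checkpoint."""
    weights = directory / "mid_checkpoint.safetensors"
    legacy = directory / "mid_checkpoint.pt"
    if weights.is_file():
        return "safetensors", weights
    if legacy.is_file():
        return "pt", legacy
    raise FileNotFoundError(f"no checkpoint found in {directory}")
```

Returning the source label alongside the path means the caller always has a defined value to record in its manifest, whichever branch is taken.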

Validation:

  • Full training smoke passed.
  • Clean audit passed; tampered scenarios failed as expected.
  • Eval loaded safetensors successfully.
  • Global manifest generated with Merkle metadata.
  • Unit tests, compile checks, and git diff --check passed.
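
A tamper-fail check of the kind described above can be sketched as follows. The helper names are hypothetical and the example works on in-memory bytes rather than the real checkpoint files; it shows how per-chunk hashes attribute a change to a specific chunk.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int) -> list[str]:
    """SHA-256 hex digest of each fixed-size chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(expected: list[str], actual: list[str]) -> list[int]:
    """Indices whose hashes differ, including chunks present on only one side."""
    longest = max(len(expected), len(actual))
    return [
        i for i in range(longest)
        if i >= len(expected) or i >= len(actual) or expected[i] != actual[i]
    ]


original = bytes(4096)
tampered = bytearray(original)
tampered[2500] ^= 0x01  # flip one bit inside the third 1024-byte chunk

clean_diff = changed_chunks(chunk_hashes(original, 1024), chunk_hashes(original, 1024))
tamper_diff = changed_chunks(chunk_hashes(original, 1024), chunk_hashes(bytes(tampered), 1024))
```

A flat file hash would only report that something changed; the chunk comparison also reports where.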

AI Usage Disclosure:

  • This PR does not contain AI-generated code at all.
  • This PR contains AI-generated code. I have read the AI Usage Policy and this PR complies with this policy. I have tested the code locally and I am responsible for it.

I have used the following AI models and tools:

Used Codex and CodeRabbit as pair programmers to add tests and perform validation.

Checklist

  • My PR addresses a single issue, fixes a single bug or makes a single improvement.
  • My code follows the project's code style and conventions
  • If applicable, I have made corresponding changes or additions to the documentation
  • If applicable, I have made corresponding changes or additions to tests
  • My changes generate no new warnings or errors
  • I have joined the Discord server and I will share a link to this PR with the project maintainers there
  • I have read the Contribution Guidelines
  • Once I submit my PR, CodeRabbit AI will automatically review it and I will address CodeRabbit's comments.
  • I have filled this PR template completely and carefully, and I understand that my PR may be closed without review otherwise.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Merkle tree–based integrity verification for model checkpoint artifacts.
    • Integrated safetensors-based checkpoint save/load for stable, portable model serialization.
  • Improvements

    • Enhanced checkpoint handling with safetensors-first behavior and fallback loading.
    • Expanded verification/global manifests with checkpoint provenance and Merkle metadata.
    • Updated determinism and manifest output formatting for clearer results.
  • Tests

    • Added a Merkle artifact test suite covering manifest generation and proof verification (including empty-file cases).
  • Chores

    • Updated ignored build/checkpoint artifacts and added the safetensors dependency.

@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds shared artifact hashing and Merkle helpers, introduces safetensors checkpoint save/load support, updates training and manifest generation to record checkpoint artifact metadata, switches evaluation and telemetry to shared hashing utilities, and adds Merkle tests plus related dependency, ignore, and output-format updates.

Changes

Artifact checkpoint and verification flow

| Layer | File(s) | Summary |
| --- | --- | --- |
| Artifact utilities and validation | src/artifacts.py, tests/test_artifacts.py, requirements.txt, .gitignore | Adds deterministic JSON/file/model hashing via hash_json and compute_sha256, Merkle manifest/proof generation and verification helpers, and safetensors checkpoint save/load functions, then validates Merkle behavior in three test cases and ignores generated artifact files. |
| Checkpoint artifact production and metadata | src/reproducibility.py | Training delegates model hashing to shared utilities, writes mid_checkpoint.safetensors and mid_checkpoint.merkle.json at checkpoint steps, and stores related integrity fields (model hash, safetensors SHA-256, Merkle root, chunk size) in mid_checkpoint.pt. |
| Global manifest generation with Merkle metadata | src/global_manifest.py | Manifest generation imports shared hashing, extends checkpoint fields to include artifact identifier and Merkle verification metadata (Merkle root, chunk size, chunk count), and reformats sealed output messages. |
| Checkpoint consumers and output alignment | src/eval.py, src/telemetry.py, src/gpu_reproducibility_test.py, src/reproducibility.py | Evaluation and telemetry reuse shared hashing utilities, evaluation records the checkpoint source in its manifest, the GPU reproducibility test updates its docstring and output formatting, and reproducibility verification scripts reformat mismatch messages and PASS/FAIL labels to bracketed notation. |

Sequence Diagram(s)

sequenceDiagram
  participant Training as reproducibility.py
  participant Artifacts as artifacts module
  participant Weights as mid_checkpoint.safetensors
  participant Merkle as mid_checkpoint.merkle.json
  participant Checkpoint as mid_checkpoint.pt
  participant Global as global_manifest.py
  participant Eval as eval.py

  Training->>Artifacts: save_model_safetensors(model)
  Artifacts->>Weights: write weights state
  Training->>Artifacts: write_merkle_manifest(weights)
  Artifacts->>Merkle: write chunk hashes and merkle_root
  Training->>Checkpoint: store hash, merkle_root, chunk_size
  Global->>Checkpoint: read checkpoint metadata
  Global->>Global: record artifact_id, merkle_root, chunk_size, chunk_count
  Eval->>Artifacts: load_model_safetensors(model)
  alt safetensors missing
    Eval->>Checkpoint: torch.load fallback
  end
  Eval->>Eval: record model_checkpoint_source in manifest

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • AOSSIE-Org/OpenVerifiableLLM#89: Touches the same hashing, checkpoint, evaluation, telemetry, and manifest modules that this PR extends with shared artifact and Merkle logic.

Suggested labels

enhancement, backend, python, size/XL

Suggested reviewers

  • Archit381

Poem

🐇 I packed some hashes in a leafy tree,
and tucked safe tensors neatly next to me.
A manifest now hums with roots and proof,
while checkpoints hop with sturdier hoof.
Through eval fields I bound with cheer—
little artifacts all verified here.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.70% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main additions: safetensors integration and Merkle hashing implementation, both of which are central to this PR's objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/artifacts.py`:
- Around line 82-101: In build_merkle_manifest, the chunk list is built from one
read pass but size_bytes and the file-level sha256 are computed afterward, so
the manifest can mix metadata from different file versions if the artifact
changes mid-read. Update the same read loop in artifacts.py to accumulate the
total size and a running full-file hash while building chunks, then use those
accumulated values for size_bytes and sha256 instead of calling path.stat() and
compute_sha256(file_path=path) after the scan.
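
The single-pass approach described in this comment can be sketched as follows. This is a minimal illustration, not the code in src/artifacts.py: the function name `scan_file_once` and the returned dictionary layout are assumptions, and only the `size_bytes` and `sha256` field names are taken from the comment itself.

```python
import hashlib
from pathlib import Path


def scan_file_once(path: Path, chunk_size: int = 1024 * 1024) -> dict:
    """Build chunk hashes, total size, and full-file hash from one read pass."""
    file_hasher = hashlib.sha256()
    chunks = []
    size_bytes = 0
    with open(path, "rb") as handle:
        while True:
            block = handle.read(chunk_size)
            if not block:
                break
            file_hasher.update(block)  # running full-file hash
            chunks.append(hashlib.sha256(block).hexdigest())  # per-chunk hash
            size_bytes += len(block)  # size from the bytes actually read
    return {
        "size_bytes": size_bytes,
        "sha256": file_hasher.hexdigest(),
        "chunk_size": chunk_size,
        "chunks": chunks,
    }
```

Because every field derives from the same bytes, the manifest cannot mix metadata from two versions of the file, even if the file is replaced while it is being scanned.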

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ec4fc20-07fc-4b74-aa4e-ada35bd5abeb

📥 Commits

Reviewing files that changed from the base of the PR and between c4689e5 and 1f0643d.

📒 Files selected for processing (9)
  • .gitignore
  • requirements.txt
  • src/artifacts.py
  • src/eval.py
  • src/global_manifest.py
  • src/gpu_reproducibility_test.py
  • src/reproducibility.py
  • src/telemetry.py
  • tests/test_artifacts.py

Comment thread src/artifacts.py Outdated
@ryoari

ryoari commented Jun 13, 2026

Contributor Author

For @Archit381: please accept all incoming changes. I refactored some parts, hence the conflict.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@ryoari

ryoari commented Jun 14, 2026

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Archit381

Archit381 commented Jun 16, 2026

Member

Please make the following changes

  • Resolve conflicts
  • Work on description to answer the Why? part of this PR and what this PR enables us moving forward.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/reproducibility.py (1)

13-20: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Missing hashlib import causes runtime NameError.

The code uses hashlib.sha256() at multiple locations (lines 54, 137, 170, 212, 246), but hashlib is never imported. This will crash at runtime when any checkpoint loading function is called.

🐛 Proposed fix: Add hashlib import
 import torch
 import torch.nn.functional as F
+import hashlib
 import json
 import math
 import random
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/reproducibility.py` around lines 13 - 20, The code uses hashlib.sha256()
at multiple locations but the hashlib module is never imported, causing a
runtime NameError. Add the hashlib import statement to the imports section at
the top of the file, alongside the other standard library imports, before the
artifacts import block shown in the diff.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@requirements.txt`:
- Line 1: The torch dependency in requirements.txt is pinned to version 2.10.0,
which is affected by two known vulnerabilities (GHSA-rrmf-rvhw-rf47 and
PYSEC-2026-139). Update the torch version specification from 2.10.0 to a version
above 2.12.0 to address both security advisories. This is a simple version
update in the requirements.txt file where torch is declared.
- Around line 1-3: Add the `safetensors` package to requirements.txt to satisfy
the runtime dependency. The codebase imports and uses safetensors.torch in
src/artifacts.py (specifically in the functions called from
src/reproducibility.py and src/eval.py for checkpoint operations), but the
package is missing from the requirements file. Add an entry for safetensors with
an appropriate version constraint to ensure model checkpoints can be written and
loaded successfully in clean environments.

In `@src/eval.py`:
- Around line 34-43: The variable checkpoint_source is referenced later in the
code but is never initialized in the checkpoint loading block. When loading the
checkpoint from mid_checkpoint.pt using torch.load, explicitly assign
checkpoint_source to a string value that identifies the artifact source (for
example, set it to "mid_checkpoint" or similar). This assignment must occur in
the checkpoint loading block so that checkpoint_source is defined before it is
used in manifest creation at line 73.

In `@src/reproducibility.py`:
- Around line 52-58: The file_hash variable computed in the checkpoint loading
section is never used for validation, logging, or comparison, making it dead
code. Either remove the file_hash computation entirely, or integrate it into a
verification workflow (such as comparing against a stored hash, logging it for
audit purposes, or validating tamper detection). Apply the same fix consistently
across all occurrences in the bad_seed_auditor, secret_noise_auditor,
sabotage_auditor, and broken_seal_auditor functions to ensure the code is
complete and the security measure is either properly implemented or cleanly
removed.

In `@src/telemetry.py`:
- Around line 1-4: The TelemetryLogger.hash_model method calls the undefined
symbol model_parameters_sha256, which is not imported at the top of the file,
causing a NameError at runtime. Add model_parameters_sha256 to the import
statements in the imports section (lines 1-4) so that it is available when
TelemetryLogger.hash_model attempts to use it at lines 36-37.

---

Outside diff comments:
In `@src/reproducibility.py`:
- Around line 13-20: The code uses hashlib.sha256() at multiple locations but
the hashlib module is never imported, causing a runtime NameError. Add the
hashlib import statement to the imports section at the top of the file,
alongside the other standard library imports, before the artifacts import block
shown in the diff.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 28854447-0505-4a5c-a49d-13ed262af34f

📥 Commits

Reviewing files that changed from the base of the PR and between 25e6188 and 94c9be9.

📒 Files selected for processing (5)
  • requirements.txt
  • src/eval.py
  • src/global_manifest.py
  • src/reproducibility.py
  • src/telemetry.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/global_manifest.py

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

🛑 Comments failed to post (5)
requirements.txt (2)

1-1: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Query OSV for known vulnerabilities affecting torch 2.10.0
curl -s https://api.osv.dev/v1/query \
  -H 'Content-Type: application/json' \
  -d '{"version":"2.10.0","package":{"name":"torch","ecosystem":"PyPI"}}' | jq .
# Expected: advisories list confirms whether this exact version is affected and patched ranges.

Repository: AOSSIE-Org/OpenVerifiableLLM

Length of output: 7025


Update torch to address known vulnerabilities in version 2.10.0.

torch==2.10.0 is affected by at least two known vulnerabilities:

  • GHSA-rrmf-rvhw-rf47 (CVE-2025-3000): Memory corruption in torch.jit.script (low severity, affects versions up to 2.12.0)
  • PYSEC-2026-139 (CVE-2026-4538): Deserialization vulnerability in pt2 Loading Handler (affects 2.10.0 specifically)

Upgrade to a version above 2.12.0 to address both advisories.

🧰 Tools
🪛 OSV Scanner (2.3.8)

[HIGH] 1-1: torch 2.10.0: undefined

(PYSEC-2026-139)


[HIGH] 1-1: torch 2.10.0: PyTorch is vulnerable to memory corruption through its torch.jit.script function

(GHSA-rrmf-rvhw-rf47)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements.txt` at line 1, The torch dependency in requirements.txt is
pinned to version 2.10.0, which is affected by two known vulnerabilities
(GHSA-rrmf-rvhw-rf47 and PYSEC-2026-139). Update the torch version specification
from 2.10.0 to a version above 2.12.0 to address both security advisories. This
is a simple version update in the requirements.txt file where torch is declared.

Source: Linters/SAST tools


1-3: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify safetensors runtime usage is backed by requirements.txt
rg -n --type=py '\bsafetensors\b|save_model_safetensors|load_model_safetensors'
rg -n '^safetensors([<>=!~].*)?$' requirements.txt
# Expected: at least one safetensors requirement line in requirements.txt

Repository: AOSSIE-Org/OpenVerifiableLLM

Length of output: 820


Add safetensors to requirements.txt to satisfy the runtime checkpoint contract.

The codebase imports and uses safetensors.torch in src/artifacts.py (lines 204, 227) and calls these functions from src/reproducibility.py and src/eval.py, but there is no safetensors entry in requirements.txt. This will cause runtime failures when writing or loading model checkpoints in clean environments.

Suggested fix
 torch==2.10.0
 numpy==2.4.3
 tqdm==4.67.3
+safetensors>=0.4.5,<1.0
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

torch==2.10.0
numpy==2.4.3
tqdm==4.67.3
safetensors>=0.4.5,<1.0
🧰 Tools
🪛 OSV Scanner (2.3.8)

[HIGH] 1-1: torch 2.10.0: undefined

(PYSEC-2026-139)


[HIGH] 1-1: torch 2.10.0: PyTorch is vulnerable to memory corruption through its torch.jit.script function

(GHSA-rrmf-rvhw-rf47)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements.txt` around lines 1 - 3, Add the `safetensors` package to
requirements.txt to satisfy the runtime dependency. The codebase imports and
uses safetensors.torch in src/artifacts.py (specifically in the functions called
from src/reproducibility.py and src/eval.py for checkpoint operations), but the
package is missing from the requirements file. Add an entry for safetensors with
an appropriate version constraint to ensure model checkpoints can be written and
loaded successfully in clean environments.
src/eval.py (1)

34-43: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

checkpoint_source is used before assignment, causing a runtime crash.

At Line 73, checkpoint_source is undefined, so manifest creation fails with NameError. The load block should set it explicitly when choosing the artifact source.

Suggested fix
-    checkpoint_path = "mid_checkpoint.pt"
-    with open(checkpoint_path, "rb") as f:
-        file_hash = hashlib.sha256(f.read()).hexdigest()
-
-    checkpoint = torch.load(checkpoint_path, weights_only=False, map_location=DEVICE)
-    model.load_state_dict(checkpoint['model'])
+    checkpoint = {}
+    if os.path.exists(CHECKPOINT_WEIGHTS_PATH):
+        load_model_safetensors(model, CHECKPOINT_WEIGHTS_PATH, device=DEVICE)
+        checkpoint_source = "safetensors"
+    else:
+        checkpoint_path = "mid_checkpoint.pt"
+        checkpoint = torch.load(checkpoint_path, weights_only=False, map_location=DEVICE)
+        model.load_state_dict(checkpoint["model"])
+        checkpoint_source = "pt"

Also applies to: 70-74

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/eval.py` around lines 34 - 43, The variable checkpoint_source is
referenced later in the code but is never initialized in the checkpoint loading
block. When loading the checkpoint from mid_checkpoint.pt using torch.load,
explicitly assign checkpoint_source to a string value that identifies the
artifact source (for example, set it to "mid_checkpoint" or similar). This
assignment must occur in the checkpoint loading block so that checkpoint_source
is defined before it is used in manifest creation at line 73.
src/reproducibility.py (1)

52-58: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Computed file_hash is never used.

The file-level hash is computed as a "security measure" but is never used for validation, logging, or comparison. This appears to be dead code or incomplete implementation. Either remove it or integrate it into the verification flow.

The same pattern repeats in bad_seed_auditor (line 137), secret_noise_auditor (line 170), sabotage_auditor (line 212), and broken_seal_auditor (line 246).

🧰 Tools
🪛 Ruff (0.15.17)

[error] 54-54: Undefined name hashlib

(F821)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/reproducibility.py` around lines 52 - 58, The file_hash variable computed
in the checkpoint loading section is never used for validation, logging, or
comparison, making it dead code. Either remove the file_hash computation
entirely, or integrate it into a verification workflow (such as comparing
against a stored hash, logging it for audit purposes, or validating tamper
detection). Apply the same fix consistently across all occurrences in the
bad_seed_auditor, secret_noise_auditor, sabotage_auditor, and
broken_seal_auditor functions to ensure the code is complete and the security
measure is either properly implemented or cleanly removed.
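
One way to integrate the file hash into a verification flow, as suggested above, is to compare it against a stored value before loading the checkpoint. The sketch below is an assumption about how that could look: `file_sha256` and `verify_file_hash` are hypothetical helpers, and where the expected hash comes from (for example a manifest) is left to the caller.

```python
import hashlib
import hmac
from pathlib import Path


def file_sha256(path: Path, block_size: int = 1024 * 1024) -> str:
    """Stream the file so large checkpoints are not read into memory at once."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            hasher.update(block)
    return hasher.hexdigest()


def verify_file_hash(path: Path, expected_hex: str) -> str:
    """Raise ValueError if the file does not match the recorded hash."""
    actual = file_sha256(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"hash mismatch for {path}: expected {expected_hex}, got {actual}")
    return actual
```

Calling `verify_file_hash` at the top of each auditor would turn the currently unused hash into an actual gate; removing the computation entirely remains the simpler alternative.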
src/telemetry.py (1)

1-4: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

TelemetryLogger.hash_model calls an undefined symbol.

model_parameters_sha256 is not imported, so this path raises NameError at runtime.

Suggested fix
 import json
 import torch
 import os
 import hashlib
+from artifacts import model_parameters_sha256

Also applies to: 36-37

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/telemetry.py` around lines 1 - 4, The TelemetryLogger.hash_model method
calls the undefined symbol model_parameters_sha256, which is not imported at the
top of the file, causing a NameError at runtime. Add model_parameters_sha256 to
the import statements in the imports section (lines 1-4) so that it is available
when TelemetryLogger.hash_model attempts to use it at lines 36-37.

@ryoari

ryoari commented Jun 16, 2026

Contributor Author

Please make the following changes

* [x]  Resolve conflicts

* [x]  Work on description to answer the `Why?` part of this PR and what this PR enables us moving forward.

Done with both.

@ryoari

ryoari commented Jun 16, 2026

Contributor Author

@coderabbitai autofix

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Contributor

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.

@ryoari

ryoari commented Jun 16, 2026

Contributor Author

Some of these are CVE findings; they will be fixed later.

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Contributor

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.


2 participants