
Vendor full tokenized WikiText-2 corpus into core benchmark fixtures #27

Merged
sharpninja merged 9 commits into main from copilot/add-wikitext-2-validation-set on Mar 21, 2026

Contributor

Copilot AI commented Mar 21, 2026

  • Inspect repository dataset handling, script conventions, and documentation touchpoints
  • Run baseline solution build and tests before making changes
  • Add a local script to download/process full TinyLlama and WikiText-2 data into repository paths
  • Document how to run the script and what files it updates
  • Validate the solution build/tests still pass after the change
  • Run final code review and security validation

Copilot AI and others added 2 commits March 21, 2026 17:05
Copilot AI changed the title from "[WIP] Add WikiText-2 validation set pre-tokenized" to "Vendor full tokenized WikiText-2 corpus into core benchmark fixtures" Mar 21, 2026
Copilot AI requested a review from sharpninja March 21, 2026 17:10
@sharpninja sharpninja marked this pull request as ready for review March 21, 2026 17:16
Copilot AI review requested due to automatic review settings March 21, 2026 17:16
Contributor

Copilot AI left a comment

Pull request overview

This PR vendors the full pre-tokenized WikiText-2 corpus into BitNetSharp.Core as embedded resources and updates benchmark fixtures/tests/docs so benchmark/audit paths use repository-local data without external downloads.

Changes:

  • Embed wiki.train/valid/test.tokens into BitNetSharp.Core and load them via BitNetBenchmarkFixtures.
  • Expand fixture surface with split-specific properties (train/valid/test) and keep perplexity wired to validation.
  • Add tests to validate split loading/counts and update benchmarking docs to reflect the new local corpus source.
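The embed-and-lazy-load pattern described in the list above could look roughly like the following. This is a minimal sketch, not the actual `BitNetBenchmarkFixtures` code: the class name, the `ResourcePrefix` value, and the property names are assumptions made for illustration, and the real manifest resource names depend on the project's root namespace and folder layout.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical sketch of lazy-loaded embedded-resource splits.
public static class WikiTextFixtureSketch
{
    // Assumed prefix; the real value depends on the root namespace and folders.
    private const string ResourcePrefix = "BitNetSharp.Core.Data.WikiText2.";

    // Each split is read at most once, on first access.
    private static readonly Lazy<IReadOnlyList<string>> TrainSplit =
        new Lazy<IReadOnlyList<string>>(() => Load("wiki.train.tokens"));
    private static readonly Lazy<IReadOnlyList<string>> ValidationSplit =
        new Lazy<IReadOnlyList<string>>(() => Load("wiki.valid.tokens"));
    private static readonly Lazy<IReadOnlyList<string>> TestSplit =
        new Lazy<IReadOnlyList<string>>(() => Load("wiki.test.tokens"));

    public static IReadOnlyList<string> Train => TrainSplit.Value;
    public static IReadOnlyList<string> Validation => ValidationSplit.Value;
    public static IReadOnlyList<string> Test => TestSplit.Value;

    private static IReadOnlyList<string> Load(string fileName)
    {
        Assembly assembly = typeof(WikiTextFixtureSketch).Assembly;
        string resourceName = ResourcePrefix + fileName;

        // GetManifestResourceStream returns null when the resource is not embedded.
        using var stream = assembly.GetManifestResourceStream(resourceName);
        if (stream is null)
        {
            throw new InvalidOperationException(
                $"Embedded resource '{resourceName}' was not found.");
        }

        using var reader = new StreamReader(stream);
        var lines = new List<string>();
        while (reader.ReadLine() is { } line)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

With this shape, a perplexity benchmark would read `WikiTextFixtureSketch.Validation` and pay the load cost only once per process, while the train and test splits stay unloaded until something asks for them.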

Reviewed changes

Copilot reviewed 4 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/BitNetSharp.Tests/HostedAgentBenchmarksExecutionTests.cs | Uses a smaller validation subset for perplexity test runtime; adds split-loading assertions for the new fixtures. |
| src/BitNetSharp.Core/BitNetSharp.Core.csproj | Embeds the vendored WikiText-2 token files as resources. |
| src/BitNetSharp.Core/BitNetBenchmarkFixtures.cs | Replaces placeholder WikiText-2 data with lazy-loaded embedded-resource splits. |
| docs/benchmarking.md | Documents that benchmark perplexity now uses the vendored local WikiText-2 validation split. |

@sharpninja
Owner

@copilot apply changes based on the comments in this thread

Copilot AI and others added 2 commits March 21, 2026 17:23
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Agent-Logs-Url: https://github.com/sharpninja/BitNet-b1.58-Sharp/sessions/cb4f0c8e-263a-48fc-875c-3ccea7631ebe
Contributor Author

Copilot AI commented Mar 21, 2026

> @copilot apply changes based on the comments in this thread

Addressed in 679d00d. The WikiText-2 loader now preserves blank separator rows line-for-line, the fixture tests use shorter stable assertions instead of full long literals, and the benchmarking docs were updated to match the actual in-memory behavior.
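As an illustration of what "line-for-line" preservation means here, the sketch below reads rows with `TextReader.ReadLine`, which yields an empty string for each blank line instead of dropping it. The class and method names are hypothetical, and the actual loader in 679d00d may be implemented differently.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: keep every row, including blank separator rows.
public static class LinePreservingReaderSketch
{
    public static IReadOnlyList<string> ReadAllRows(TextReader reader)
    {
        var rows = new List<string>();

        // ReadLine returns "" for a blank line and null only at end of input,
        // so blank separators survive as empty entries.
        while (reader.ReadLine() is { } line)
        {
            rows.Add(line);
        }
        return rows;
    }
}
```

By contrast, splitting the text with `StringSplitOptions.RemoveEmptyEntries` would silently drop those separator rows and shift every later row index.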

Copilot AI requested a review from sharpninja March 21, 2026 17:26
@sharpninja
Owner

@copilot diagnose and fix test failures in build pipeline.

Copilot AI and others added 2 commits March 21, 2026 19:34
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Agent-Logs-Url: https://github.com/sharpninja/BitNet-b1.58-Sharp/sessions/efe42022-8a61-4408-89ff-da2ae67c2259
Contributor Author

Copilot AI commented Mar 21, 2026

> @copilot diagnose and fix test failures in build pipeline.

Addressed in 527f9df. I traced the pipeline issue to the build.yml Test step, which hung once the vendored WikiText-2 corpus caused the benchmark/audit perplexity tests to process very large samples. The fix keeps full-corpus behavior in production and adds a compact repository-local WikiText-2 subset, exposed through a shared helper, for the affected tests. Both the targeted and the full solution test runs now complete locally.
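One way such a shared test helper could be shaped is sketched below. The names `CompactCorpusSketch` and `TakeCompactSubset` are hypothetical, and taking the leading rows is an assumption; the helper added in 527f9df may select its subset differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: bound test runtime by using only the first rows of a split.
public static class CompactCorpusSketch
{
    public static IReadOnlyList<string> TakeCompactSubset(
        IReadOnlyList<string> rows, int maxRows)
    {
        if (rows is null)
        {
            throw new ArgumentNullException(nameof(rows));
        }
        if (maxRows <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxRows));
        }

        // Take stops early, so a short input is returned whole.
        return rows.Take(maxRows).ToArray();
    }
}
```

Tests would pass the subset to the perplexity path, while production code keeps reading the full split.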

Copilot AI requested a review from sharpninja March 21, 2026 19:44
Copilot AI and others added 2 commits March 21, 2026 20:03
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Agent-Logs-Url: https://github.com/sharpninja/BitNet-b1.58-Sharp/sessions/b27c3cde-6893-4c19-866c-09f79c5acb7c
@sharpninja sharpninja merged commit e944200 into main Mar 21, 2026
@sharpninja sharpninja deleted the copilot/add-wikitext-2-validation-set branch March 21, 2026 20:08

3 participants