chore(deps): bump transformers from 5.9.0 to 5.10.1 by dependabot[bot] · Pull Request #177 · VectorInstitute/agent-bootcamp

dependabot · 2026-06-04T04:31:15Z

Bumps transformers from 5.9.0 to 5.10.1.

Release notes

Release v5.10.1

v5.10.0 was yanked because it was published from a corrupted branch. Sorry everyone, this happens when we rush a release!

New Model additions

Gemma4 Unified + Gemma4 MTP

Gemma 4 12B Unified is an encoder-free multimodal model with pretrained and instruction-tuned variants. Unlike standard Gemma 4, which uses dedicated encoder towers, Gemma 4 12B Unified projects raw inputs directly into the language model's embedding space through lightweight linear pipelines. This results in a simpler architecture while maintaining strong multimodal performance.

Key differences from standard Gemma 4:

No Vision Tower: Raw pixel patches are projected directly into LM space via a Dense + LayerNorm pipeline with factorized 2D positional embeddings, replacing the vision encoder.

No Audio Tower: Raw 16 kHz waveform samples are chunked into fixed-length frames and projected through a simple RMSNorm → Linear pipeline, replacing the mel spectrogram + Conformer encoder.

Shared Multimodal Pipeline: Both vision and audio use the same Gemma4UnifiedMultimodalEmbedder (RMSNorm → Linear) for the final projection to text hidden space.
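The audio path described above (chunk the raw waveform into fixed-length frames, then RMSNorm → Linear) can be illustrated with a minimal, dependency-free sketch. This is not the transformers implementation: the function names, the frame length, the zero-padding of the last frame, and the toy weight matrix are all assumptions made for illustration.

```python
import math

def chunk_waveform(samples, frame_len):
    """Split a 1-D list of samples into fixed-length frames.

    Assumption: the final partial frame is zero-padded; the release
    notes do not say how the real model handles the tail.
    """
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))
        frames.append(frame)
    return frames

def rms_norm(x, eps=1e-6):
    """Divide each element by the root-mean-square of the vector (no learned scale)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def linear(x, weight):
    """Apply a bias-free linear layer; weight is a list of output rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def embed_audio(samples, frame_len, weight):
    """Frame the waveform, then apply RMSNorm followed by Linear to each frame."""
    return [linear(rms_norm(f), weight) for f in chunk_waveform(samples, frame_len)]

# Toy usage: 10 samples, frames of 4 samples, projection to 2 dimensions.
toy_weight = [[0.5, 0.5, 0.5, 0.5], [1.0, -1.0, 1.0, -1.0]]
embeddings = embed_audio([0.1 * i for i in range(10)], 4, toy_weight)
print(len(embeddings), len(embeddings[0]))
```

Each frame becomes one embedding vector, so the sequence length seen by the language model grows linearly with audio duration for a fixed frame length.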

You can find the original Gemma 4 12B Unified checkpoints under the Gemma 4 release.

who needs encoders? (#46385) by @douglas-reid @sgerrard @vasqu @molbap

Sapiens2

Sapiens2 is a family of high-resolution vision transformers pretrained on ~1 billion curated human images, designed for human-centric computer vision tasks including pose estimation, body-part segmentation, surface normal estimation, and pointmap estimation. The models scale from 0.4B to 5B parameters and train at native 1K resolution, with hierarchical 4K variants for extended spatial reasoning. Sapiens2 achieves substantial improvements over its predecessor with +4 mAP in pose estimation, +24.3 mIoU in body-part segmentation, and 45.6% error reduction in normal estimation.

Links: Documentation | Paper

Add Sapiens2 Model (#45919) by @guarin in #45919

DeepSeek-OCR-2

DeepSeek-OCR-2 is an OCR-specialized vision-language model built on a distinctive architecture that combines a SAM ViT-B vision encoder with a Qwen2 hybrid attention encoder, connected through an MLP projector to a DeepSeek-V2 Mixture-of-Experts (MoE) language model. The model features a hybrid attention mechanism that applies bidirectional attention over image tokens and causal attention over query tokens, enabling efficient and accurate document understanding. It supports both plain OCR tasks and grounding capabilities with coordinate-aware output for document conversion to markdown format.
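The hybrid attention pattern described above can be sketched as a boolean mask. This is a minimal illustration, not the model's actual code: it assumes that image tokens come first in the sequence and that query tokens may attend to every image token as well as to earlier query tokens. The function name `hybrid_attention_mask` is hypothetical.

```python
def hybrid_attention_mask(num_image, num_query):
    """Build a mask where mask[i][j] is True if position i may attend to position j.

    Assumed layout: positions [0, num_image) are image tokens,
    positions [num_image, num_image + num_query) are query tokens.
    """
    total = num_image + num_query
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < num_image:
                # Image tokens: bidirectional attention among image tokens only.
                mask[i][j] = j < num_image
            else:
                # Query tokens: causal attention over everything up to position i.
                mask[i][j] = j <= i
    return mask

# Toy usage: 2 image tokens followed by 2 query tokens.
for row in hybrid_attention_mask(2, 2):
    print(["x" if allowed else "." for allowed in row])
```

The top-left block of the mask is fully filled (bidirectional), while the rows for query tokens form the usual lower-triangular causal pattern.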

Links: Documentation

Add Deepseek-OCR-2 model (#45075) by @thisisiron in #45075

Mellum

Mellum is a code-focused Mixture-of-Experts language model developed by JetBrains. It is derived from the Qwen3-MoE architecture with per-layer-type RoPE and interleaved sliding window attention. The model has 12B total parameters with 2.5B active parameters per token, using 64 routed experts with 8 activated per token across 28 layers.
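To make "64 routed experts with 8 activated per token" concrete, here is a generic top-k routing sketch. It is not Mellum's router: the function name `route_top_k` is hypothetical, and normalizing the weights with a softmax over only the selected experts is an assumption chosen for simplicity.

```python
import math

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and return (expert_index, weight) pairs.

    Assumption: weights are a softmax over the selected logits only,
    so they sum to 1 for each token.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Toy usage: one token, 64 experts, 8 active.
toy_logits = [float(i) for i in range(64)]
picks = route_top_k(toy_logits, 8)
print([index for index, _ in picks])
```

With 8 of 64 routed experts active, each token uses 12.5% of the routed experts, which is consistent in spirit with the stated 2.5B active out of 12B total parameters (about 21%, since attention and other non-expert weights are always active).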

Links: Documentation

feat: Add support for JetBrains' Mellum v2 code generation model (#46112) by @shadeMe in #46112

Breaking changes

The Gemma4 vision pooler now casts inputs to float32 before scaling to prevent float16 overflow (inf saturation) with large checkpoints, which may cause minor numerical differences in outputs for users running Gemma-4 vision models in float16.

🚨 Fix float16 overflow in Gemma4 vision pooler (#46277) by @Bluear7878
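The failure mode behind this fix can be reproduced without any ML library. The sketch below emulates float16 and float32 rounding with the standard `struct` module; the input value and scale factor are made up for illustration, and the helper names are hypothetical. It does not reproduce the actual Gemma4 pooler code.

```python
import math
import struct

FLOAT16_MAX = 65504.0  # largest finite float16 value

def to_float16(x):
    """Round x to float16, saturating to +/-inf on overflow like IEEE 754 arithmetic."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

def to_float32(x):
    """Round x to float32."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def scale_in_float16(value, scale):
    """Multiply with both operands and the result held in float16."""
    return to_float16(to_float16(value) * to_float16(scale))

def scale_in_float32(value, scale):
    """Cast the float16 input up to float32 first, then multiply."""
    return to_float32(to_float32(to_float16(value)) * to_float32(scale))

# Toy usage: the product 300 * 256 = 76800 exceeds FLOAT16_MAX.
print(scale_in_float16(300.0, 256.0))
print(scale_in_float32(300.0, 256.0))
```

Because the float32 path rounds differently from the float16 path even when nothing overflows, small numerical differences in outputs are expected after the change, as the note above says.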

Audio Language Models (ALMs) now have a dedicated base model class without a language modeling head, aligning them with the design of Vision Language Models (VLMs); users relying on the previous model class structure should update their code to use the new base model class where appropriate.

🚨 [ALM] Add base model without head (#45534) by @eustlb
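The design described above (a base model that returns hidden states, with the language modeling head living in a separate wrapper) can be sketched with toy classes. Every name here is hypothetical and none of it is transformers API; the "model" is a stand-in that only shows how the two layers relate.

```python
class ToyAudioLMModel:
    """Hypothetical base model: produces hidden states and has no LM head."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size

    def forward(self, token_ids):
        # Stand-in for a real network: one hidden vector per input token.
        return [[float(t)] * self.hidden_size for t in token_ids]


class ToyAudioLMForGeneration:
    """Hypothetical head wrapper: base model plus a projection to vocabulary logits."""

    def __init__(self, hidden_size, vocab_size):
        self.model = ToyAudioLMModel(hidden_size)
        # Toy LM head weights, one row per vocabulary entry.
        self.lm_head = [[1.0] * hidden_size for _ in range(vocab_size)]

    def forward(self, token_ids):
        hidden = self.model.forward(token_ids)
        return [[sum(w * h for w, h in zip(row, vec)) for row in self.lm_head]
                for vec in hidden]


# Toy usage: the base model returns hidden states, the wrapper returns logits.
base = ToyAudioLMModel(hidden_size=4)
full = ToyAudioLMForGeneration(hidden_size=4, vocab_size=3)
print(len(base.forward([1, 2])[0]), len(full.forward([1, 2])[0]))
```

Code that only needs embeddings or hidden states can then depend on the base class alone, which is the same split the release notes describe for Vision Language Models.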

... (truncated)

Commits

90c3ae5 Patch because we had to yank 5.10 because the release branch was not up to date
0bd94b3 v5.10.0
1423d22 who needs encoders? (#46385)
50eb20a Fix dsv4 dequant + tp/ep (#46378)
74464e8 Fix wrong changes produced by style/repo. check bot (#46371)
1b8ec34 Fix path traversal when saving Bark voice preset embeddings (#46237)
e820678 Add Sapiens2 Model (#45919)
595721c Pass library_name/version to Hub calls via a shared HfApi (#46318)
0f0036c docs: update ACL Anthology URL in CITATION.cff (#46352)
fa6c830 DeepGEMM BF16 + mixed FP8/FP4 + MegaMoE + refactor (#45634)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 5.9.0 to 5.10.1.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v5.9.0...v5.10.1)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 5.10.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot Bot added the dependencies (pull requests that update a dependency file) and python:uv (pull requests that update python:uv code) labels on Jun 4, 2026

amrit110 merged commit 72be592 into main Jun 5, 2026
5 checks passed

amrit110 deleted the dependabot/uv/transformers-5.10.1 branch June 5, 2026 01:06
