
chore(deps-dev): Bump torchao from 0.11 to 0.16.0 #197

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/torchao-0.16.0

Conversation

dependabot[bot] (Contributor) commented on behalf of GitHub, Feb 11, 2026

Bumps torchao from 0.11 to 0.16.0.

Release notes

Sourced from torchao's releases.

v0.16.0

Highlights

We are excited to announce the 0.16.0 release of torchao! This release adds support for MXFP8 MoE building blocks for training with expert parallelism, and deprecates older versions of some configs and less-used quantization options to keep torchao leaner! We also revamped our docs page and README, and made some progress toward making torchao ABI stable.

MXFP8 MoE Building Blocks for Training with Expert Parallelism

This release includes the following differentiable building blocks for MXFP8 MoE Training with expert parallelism:

  • a2a_dispatch_mxfp8_fwd_hp_bwd: All-to-all token dispatch (MXFP8 forward pass, BF16 backward pass)
  • permute_mxfp8_fwd_hp_bwd: Permute and pad tokens for MXFP8 computation (MXFP8 forward pass, BF16 backward pass)
  • _to_mxfp8_then_scaled_grouped_mm: MXFP8 grouped GEMM for routed expert computation (new: optionally accepts pre-quantized inputs). Produces bfloat16 output.
  • unpermute_hp_fwd_mxfp8_bwd: Unpermute tokens back to original order (BF16 forward pass, MXFP8 backward pass)
  • a2a_combine_hp_fwd_mxfp8_bwd: All-to-all token combine (BF16 forward pass, MXFP8 backward pass). Note that the actual combine/aggregation op does not happen here; the naming just indicates it is intended for the all-to-all immediately preceding the aggregation.

These autograd functions can be chained together to implement efficient MoE training with expert parallel comms and grouped GEMMs in MXFP8.
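
As a rough illustration of how the chaining works, here is a minimal sketch of one MoE layer step built from the functions above. The release notes name these functions but not their interfaces, so the import path, argument names, and signatures below are assumptions for illustration only:

# Hypothetical composition of the MXFP8 MoE building blocks named above.
# The module path and every signature here are assumed, not taken from the
# torchao source; consult the torchao MoE training docs for the real interfaces.
from torchao.prototype.moe_training import (  # assumed import path
    a2a_dispatch_mxfp8_fwd_hp_bwd,
    permute_mxfp8_fwd_hp_bwd,
    _to_mxfp8_then_scaled_grouped_mm,
    unpermute_hp_fwd_mxfp8_bwd,
    a2a_combine_hp_fwd_mxfp8_bwd,
)

def moe_layer(tokens, routing, expert_weights, ep_group):
    # 1. Dispatch tokens to the ranks that own their routed experts
    #    (MXFP8 forward, BF16 backward).
    dispatched = a2a_dispatch_mxfp8_fwd_hp_bwd(tokens, routing, ep_group)
    # 2. Permute and pad tokens into contiguous per-expert groups.
    permuted, meta = permute_mxfp8_fwd_hp_bwd(dispatched, routing)
    # 3. MXFP8 grouped GEMM over the local experts; bfloat16 output.
    out = _to_mxfp8_then_scaled_grouped_mm(permuted, expert_weights, meta)
    # 4. Restore the original token order (BF16 forward, MXFP8 backward).
    unpermuted = unpermute_hp_fwd_mxfp8_bwd(out, meta)
    # 5. All-to-all back to the source ranks, immediately preceding the
    #    weighted combine/aggregation (which happens outside this sketch).
    return a2a_combine_hp_fwd_mxfp8_bwd(unpermuted, routing, ep_group)

Since each step is an autograd function, a backward pass through the chain runs the per-step backward dtypes listed above automatically.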

This approach achieves a 10-25% tokens/second speedup for DeepSeekV3 16b training:

  • +10% tokens/second on single node 8xB200 with NVLink intra-node networking for inter-device communication.
  • +25% tokens/second on multi-node B200 cluster with IB inter-node networking and NVLink intra-node networking.

Deprecations

# v0.15.0 - version 1 was available.
config = Float8WeightOnlyConfig(version=1, ...)
config = Float8DynamicActivationFloat8WeightConfig(version=1, ...)
config = Int8DynamicActivationIntxWeightConfig(version=1, ...)
config = IntxWeightOnlyConfig(version=1, ...)
config = Int4WeightOnlyConfig(version=1, ...)
# v0.16.0 - use version 2 (default). Using version 1 is no longer supported.
config = Float8WeightOnlyConfig(version=2, ...)
config = Float8DynamicActivationFloat8WeightConfig(version=2, ...)
config = Int8DynamicActivationIntxWeightConfig(version=2, ...)
config = IntxWeightOnlyConfig(version=2, ...)
config = Int4WeightOnlyConfig(version=2, ...)
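
For anyone migrating across this bump, applying one of the version-2 configs looks like the following minimal sketch. quantize_ is torchao's in-place quantization entry point; version=2 can be omitted since it is now the default, and the toy model is illustrative only:

import torch
from torchao.quantization import quantize_, Int4WeightOnlyConfig

# Toy model for illustration; int4 weight-only kernels generally expect a
# CUDA device with supported hardware.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to(torch.bfloat16).cuda()

# version=2 is the default in v0.16.0, so this is equivalent to
# Int4WeightOnlyConfig(); version=1 is no longer supported.
quantize_(model, Int4WeightOnlyConfig(version=2))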

  • Move Int8DynamicActivationInt4WeightConfig, Int4DynamicActivationInt4WeightConfig, GemliteUIntXWeightOnlyConfig, Float8StaticActivationFloat8WeightConfig, UIntXWeightOnlyConfig, FPXWeightOnlyConfig to prototype (pytorch/ao#3491)
# v0.15.0
from torchao.quantization import (
    Int8DynamicActivationInt4WeightConfig, 
    Int4DynamicActivationInt4WeightConfig, 
    GemliteUIntXWeightOnlyConfig, 

... (truncated)
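
The quoted v0.15.0 import block above is truncated, but the v0.16.0 side of this deprecation would pull the same configs from the prototype namespace. A hedged sketch follows; the exact prototype module path is an assumption based on the note above rather than something confirmed from the torchao source:

# v0.16.0 - assumed new location under torchao.prototype; verify the exact
# module path against pytorch/ao#3491 before relying on it.
from torchao.prototype.quantization import (
    Int8DynamicActivationInt4WeightConfig,
    Int4DynamicActivationInt4WeightConfig,
    GemliteUIntXWeightOnlyConfig,
)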

Commits
  • 3c1065c test-infra branch ref changes for wheels
  • 79c8d01 rename MXTensor params (#3831)
  • ae17a4d docs: make non-prototype inference wofklow configs have docblocks (#3821)
  • 026b76d Add HQQ option in UIntxWeightOnlyConfig (#3829)
  • e3d045d Update compatability table for 0.16.0 release (#3828)
  • 1b4b6d9 [metal] Fix meta registration of lowbit ops (#3824)
  • ee50805 Revert "[Reland][X86] Re-enable some inductor-related test cases" (#3820)
  • 481e6bb [mxfp8 moe training] add custom sharding for triton dim0 quant kernel (#3812)
  • 3d45dfe [mxfp8 moe training] add e2e training test to ci (#3789)
  • ed91df5 fix torchao circular imports for BUCK (#3816)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [torchao](https://github.com/pytorch/ao) from 0.11 to 0.16.0.
- [Release notes](https://github.com/pytorch/ao/releases)
- [Commits](pytorch/ao@v0.11.0...v0.16.0)

---
updated-dependencies:
- dependency-name: torchao
  dependency-version: 0.16.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels on Feb 11, 2026
