
chore(deps-dev): Bump torchao from 0.11 to 0.16.0 #197

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/torchao-0.16.0

Conversation

dependabot[bot] (Contributor) commented on behalf of GitHub, Feb 11, 2026

Bumps torchao from 0.11 to 0.16.0.

Release notes

Sourced from torchao's releases.

v0.16.0

Highlights

We are excited to announce the 0.16.0 release of torchao! This release adds support for MXFP8 MoE building blocks for training with expert parallelism, and deprecates older versions of some configs and less-used quantization options to keep torchao leaner! We also revamped our docs page and README, and made some progress toward making torchao ABI stable.

MXFP8 MoE Building Blocks for Training with Expert Parallelism

This release includes the following differentiable building blocks for MXFP8 MoE Training with expert parallelism:

  • a2a_dispatch_mxfp8_fwd_hp_bwd: All-to-all token dispatch (MXFP8 forward pass, BF16 backward pass)
  • permute_mxfp8_fwd_hp_bwd: Permute and pad tokens for MXFP8 computation (MXFP8 forward pass, BF16 backward pass)
  • _to_mxfp8_then_scaled_grouped_mm: MXFP8 grouped GEMM for routed expert computation (new: optionally accepts pre-quantized inputs). Produces bfloat16 output.
  • unpermute_hp_fwd_mxfp8_bwd: Unpermute tokens back to original order (BF16 forward pass, MXFP8 backward pass)
  • a2a_combine_hp_fwd_mxfp8_bwd: All-to-all token combine (BF16 forward pass, MXFP8 backward pass). Note that the actual combine/aggregation op does not happen here; the naming just indicates it is intended for the all-to-all immediately preceding the aggregation.

These autograd functions can be chained together to implement efficient MoE training with expert parallel comms and grouped GEMMs in MXFP8.
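
As a rough illustration of how the chaining works, here is a minimal sketch of one MoE layer step built from the functions above. The release notes name these functions but not their interfaces, so the import path, argument names, and signatures below are assumptions for illustration only:

# Hypothetical composition of the MXFP8 MoE building blocks named above.
# The module path and every signature here are assumed, not taken from the
# torchao source; consult the torchao MoE training docs for the real interfaces.
from torchao.prototype.moe_training import (  # assumed import path
    a2a_dispatch_mxfp8_fwd_hp_bwd,
    permute_mxfp8_fwd_hp_bwd,
    _to_mxfp8_then_scaled_grouped_mm,
    unpermute_hp_fwd_mxfp8_bwd,
    a2a_combine_hp_fwd_mxfp8_bwd,
)

def moe_layer(tokens, routing, expert_weights, ep_group):
    # 1. Dispatch tokens to the ranks that own their routed experts
    #    (MXFP8 forward, BF16 backward).
    dispatched = a2a_dispatch_mxfp8_fwd_hp_bwd(tokens, routing, ep_group)
    # 2. Permute and pad tokens into contiguous per-expert groups.
    permuted, meta = permute_mxfp8_fwd_hp_bwd(dispatched, routing)
    # 3. MXFP8 grouped GEMM over the local experts; bfloat16 output.
    out = _to_mxfp8_then_scaled_grouped_mm(permuted, expert_weights, meta)
    # 4. Restore the original token order (BF16 forward, MXFP8 backward).
    unpermuted = unpermute_hp_fwd_mxfp8_bwd(out, meta)
    # 5. All-to-all back to the source ranks, immediately preceding the
    #    weighted combine/aggregation (which happens outside this sketch).
    return a2a_combine_hp_fwd_mxfp8_bwd(unpermuted, routing, ep_group)

Since each step is an autograd function, a backward pass through the chain runs the per-step backward dtypes listed above automatically.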

This approach achieves a 10-25% tokens/second speedup for DeepSeekV3 16b training:

  • +10% tokens/second on single node 8xB200 with NVLink intra-node networking for inter-device communication.
  • +25% tokens/second on multi-node B200 cluster with IB inter-node networking and NVLink intra-node networking.

Deprecations

# v0.15.0 - version 1 was available.
config = Float8WeightOnlyConfig(version=1, ...)
config = Float8DynamicActivationFloat8WeightConfig(version=1, ...)
config = Int8DynamicActivationIntxWeightConfig(version=1, ...)
config = IntxWeightOnlyConfig(version=1, ...)
config = Int4WeightOnlyConfig(version=1, ...)
# v0.16.0 - use version 2 (default). Using version 1 is no longer supported.
config = Float8WeightOnlyConfig(version=2, ...)
config = Float8DynamicActivationFloat8WeightConfig(version=2, ...)
config = Int8DynamicActivationIntxWeightConfig(version=2, ...)
config = IntxWeightOnlyConfig(version=2, ...)
config = Int4WeightOnlyConfig(version=2, ...)
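
For anyone migrating across this bump, applying one of the version-2 configs looks like the following minimal sketch. quantize_ is torchao's in-place quantization entry point; version=2 can be omitted since it is now the default, and the toy model is illustrative only:

import torch
from torchao.quantization import quantize_, Int4WeightOnlyConfig

# Toy model for illustration; int4 weight-only kernels generally expect a
# CUDA device with supported hardware.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to(torch.bfloat16).cuda()

# version=2 is the default in v0.16.0, so this is equivalent to
# Int4WeightOnlyConfig(); version=1 is no longer supported.
quantize_(model, Int4WeightOnlyConfig(version=2))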

  • Move Int8DynamicActivationInt4WeightConfig, Int4DynamicActivationInt4WeightConfig, GemliteUIntXWeightOnlyConfig, Float8StaticActivationFloat8WeightConfig, UIntXWeightOnlyConfig, FPXWeightOnlyConfig to prototype (pytorch/ao#3491)
# v0.15.0
from torchao.quantization import (
    Int8DynamicActivationInt4WeightConfig, 
    Int4DynamicActivationInt4WeightConfig, 
    GemliteUIntXWeightOnlyConfig, 

... (truncated)
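
The quoted v0.15.0 import block above is truncated, but the v0.16.0 side of this deprecation would pull the same configs from the prototype namespace. A hedged sketch follows; the exact prototype module path is an assumption based on the note above rather than something confirmed from the torchao source:

# v0.16.0 - assumed new location under torchao.prototype; verify the exact
# module path against pytorch/ao#3491 before relying on it.
from torchao.prototype.quantization import (
    Int8DynamicActivationInt4WeightConfig,
    Int4DynamicActivationInt4WeightConfig,
    GemliteUIntXWeightOnlyConfig,
)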

Commits
  • 3c1065c test-infra branch ref changes for wheels
  • 79c8d01 rename MXTensor params (#3831)
  • ae17a4d docs: make non-prototype inference wofklow configs have docblocks (#3821)
  • 026b76d Add HQQ option in UIntxWeightOnlyConfig (#3829)
  • e3d045d Update compatability table for 0.16.0 release (#3828)
  • 1b4b6d9 [metal] Fix meta registration of lowbit ops (#3824)
  • ee50805 Revert "[Reland][X86] Re-enable some inductor-related test cases" (#3820)
  • 481e6bb [mxfp8 moe training] add custom sharding for triton dim0 quant kernel (#3812)
  • 3d45dfe [mxfp8 moe training] add e2e training test to ci (#3789)
  • ed91df5 fix torchao circular imports for BUCK (#3816)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [torchao](https://github.com/pytorch/ao) from 0.11 to 0.16.0.
- [Release notes](https://github.com/pytorch/ao/releases)
- [Commits](pytorch/ao@v0.11.0...v0.16.0)

---
updated-dependencies:
- dependency-name: torchao
  dependency-version: 0.16.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels on Feb 11, 2026
