Finetuner performance + robustness optimizations by r-kellerm · Pull Request #11 · NVIDIA/GFMBench-api

r-kellerm · 2026-06-11T12:38:28Z

AMP for the full fine-tune path - added use_amp/amp_dtype args plus _is_cuda_device/_amp_available/_pick_amp_dtype helpers and an _autocast() context. Mixed precision is gated by amp_active = use_amp and not only_proj_layer, so the cached linear-probe path stays fp32 (this avoids the crash when bf16 tensors are converted to numpy for the cache). GradScaler is wired only for fp16; bf16 is preferred when supported and needs no loss scaling.
zero_grad(set_to_none=True) - replaces the plain zero_grad() call.
Backbone freeze - on the probe path, backbone params are set to requires_grad_(False) once and restored in a finally block, replacing the per-batch no_grad()+detach().
Robustness - avg_loss is initialized to nan (no NameError on an empty loader), and fwd_cache.clear() is moved into the finally block so the cache is released even on exceptions.
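
The four changes above can be sketched together in one small training loop. This is a minimal illustration, not the code in this PR: `pick_amp_dtype`, `train_epochs`, the SGD optimizer, the MSE loss, and the batch-index cache key are all assumptions made for the example, and only the `use_amp`/`only_proj_layer` gating, the fp16-only scaler, `zero_grad(set_to_none=True)`, the one-time freeze, and the `finally` cleanup follow the description.

```python
import math
from contextlib import nullcontext

import torch
from torch import nn


def pick_amp_dtype(device):
    """Hypothetical helper: bf16 if the GPU supports it, else fp16; None off CUDA."""
    if device.type != "cuda" or not torch.cuda.is_available():
        return None
    return torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16


def train_epochs(backbone, head, loader, device, use_amp=True,
                 only_proj_layer=False, epochs=1):
    amp_dtype = pick_amp_dtype(device) if use_amp else None
    # Mixed precision is skipped on the probe path so cached features stay fp32.
    amp_active = amp_dtype is not None and not only_proj_layer
    # Loss scaling is only needed for fp16; bf16 keeps fp32's exponent range.
    use_scaler = amp_active and amp_dtype is torch.float16
    scaler = torch.cuda.amp.GradScaler() if use_scaler else None

    params = list(head.parameters())
    if not only_proj_layer:
        params += list(backbone.parameters())
    optimizer = torch.optim.SGD(params, lr=0.1)  # optimizer choice is illustrative

    # Remember the original flags so the freeze can be undone afterwards.
    saved_flags = [p.requires_grad for p in backbone.parameters()]
    fwd_cache = {}  # keyed by batch index; assumes a fixed, unshuffled batch order
    avg_loss = float("nan")  # defined even if the loader yields no batches
    try:
        if only_proj_layer:
            # Freeze once instead of wrapping every batch in no_grad() + detach().
            for p in backbone.parameters():
                p.requires_grad_(False)
        for _ in range(epochs):
            total, batches = 0.0, 0
            for idx, (x, y) in enumerate(loader):
                x, y = x.to(device), y.to(device)
                optimizer.zero_grad(set_to_none=True)
                ctx = (torch.autocast(device_type="cuda", dtype=amp_dtype)
                       if amp_active else nullcontext())
                with ctx:
                    if only_proj_layer:
                        if idx not in fwd_cache:
                            fwd_cache[idx] = backbone(x)
                        feats = fwd_cache[idx]
                    else:
                        feats = backbone(x)
                    loss = nn.functional.mse_loss(head(feats), y)
                if scaler is not None:
                    scaler.scale(loss).backward()
                    scaler.step(optimizer)
                    scaler.update()
                else:
                    loss.backward()
                    optimizer.step()
                total += loss.item()
                batches += 1
            if batches:
                avg_loss = total / batches
    finally:
        # Runs on success and on exceptions: restore flags and release the cache.
        for p, flag in zip(backbone.parameters(), saved_flags):
            p.requires_grad_(flag)
        fwd_cache.clear()
    return avg_loss
```

On a CPU device the sketch falls back to plain fp32 with no scaler, so the same function can be exercised without a GPU; an empty loader returns nan rather than raising.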

Performance opt: 1. AMP for the full fine-tune path, 2. zero_grad(set_to_none=True), 3. Backbone freeze when only the projection head is trained; plus robustness changes: avg_loss initialized to nan and fwd_cache.clear() moved to the finally block so it runs even when an exception is raised.

f4e1577

r-kellerm wants to merge 1 commit into NVIDIA:main from r-kellerm:main
