Changelog

Behnam Ebrahimi edited this page Mar 29, 2026 · 1 revision

v1.0.0

Initial public release.

Features

Batched decoding — 3-5x faster transcription via parallel segment processing on Apple Silicon
LightningWhisperMLX — simple Python API for common use cases
Full transcribe() API — all Whisper options exposed
CLI (vayu command) — full-featured command-line interface with 30+ options
All Whisper models — tiny through large-v3, turbo, and distil variants
Quantization — 4-bit and 8-bit quantized models for reduced memory
Word-level timestamps — via cross-attention + DTW alignment
5 output formats — txt, srt, vtt, tsv, json
99 languages — full multilingual support with auto-detection
Translation — translate any language to English
Speculative decoding (experimental) — additional 2-3x speedup with draft/target model pairs
VAD processing — voice activity detection for silence skipping
Parallel chunk transcription — process long audio with overlapping chunks
Security — model path validation to prevent path traversal attacks
Temperature fallback — automatic quality recovery via temperature escalation
Subtitle formatting — word highlighting, line width/count controls for SRT/VTT
Batch file processing — transcribe multiple files with error tolerance
stdin support — pipe audio from other processes
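
As an illustration of the temperature fallback item above, here is a minimal sketch of the general technique as it is commonly done in Whisper-style decoders. It is not taken from this project's code. The function names, the `decode` callback, and the default thresholds (borrowed from the reference OpenAI Whisper defaults) are assumptions made for illustration only.

```python
import zlib
from typing import Callable, Sequence, Tuple

# Hypothetical decoder signature: temperature -> (text, average log-probability).
DecodeFn = Callable[[float], Tuple[str, float]]


def compression_ratio(text: str) -> float:
    """Ratio of raw to zlib-compressed size; highly repetitive text scores high."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def decode_with_fallback(
    decode: DecodeFn,
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    compression_ratio_threshold: float = 2.4,  # assumed default
    logprob_threshold: float = -1.0,           # assumed default
) -> Tuple[str, float, float]:
    """Retry decoding at increasing temperatures until the output looks sane.

    Returns (text, avg_logprob, temperature_used). If every attempt fails
    the checks, the result of the last temperature is returned.
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")

    for temperature in temperatures:
        text, avg_logprob = decode(temperature)
        too_repetitive = compression_ratio(text) > compression_ratio_threshold
        too_uncertain = avg_logprob < logprob_threshold
        if not (too_repetitive or too_uncertain):
            break  # this attempt passed both quality checks
    return text, avg_logprob, temperature


if __name__ == "__main__":
    # Stand-in decoder: loops at temperature 0, recovers once sampling is enabled.
    def fake_decode(temperature: float) -> Tuple[str, float]:
        if temperature == 0.0:
            return "la " * 200, -0.2
        return "the quick brown fox jumps over the lazy dog", -0.3

    print(decode_with_fallback(fake_decode))
```

The idea is that greedy decoding (temperature 0) is tried first because it is usually the most accurate, and sampling at higher temperatures is used only when the output looks degenerate, for example a repetition loop or a very low average log-probability.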

Requirements

macOS with Apple Silicon (M1/M2/M3/M4)
Python 3.10+
MLX 0.11+

For the latest changes, see the GitHub Releases page.