Changelog

Behnam Ebrahimi edited this page Mar 29, 2026 · 1 revision

v1.0.0

Initial public release.

Features

  • Batched decoding — 3-5x faster transcription via parallel segment processing on Apple Silicon
  • LightningWhisperMLX — simple Python API for common use cases
  • Full transcribe() API — all Whisper options exposed
  • CLI (vayu command) — full-featured command-line interface with 30+ options
  • All Whisper models — tiny through large-v3, turbo, and distil variants
  • Quantization — 4-bit and 8-bit quantized models for reduced memory
  • Word-level timestamps — via cross-attention + DTW alignment
  • 5 output formats — txt, srt, vtt, tsv, json
  • 99 languages — full multilingual support with auto-detection
  • Translation — translate any language to English
  • Speculative decoding (experimental) — additional 2-3x speedup with draft/target model pairs
  • VAD processing — voice activity detection for silence skipping
  • Parallel chunk transcription — process long audio with overlapping chunks
  • Security — model path validation to prevent path traversal attacks
  • Temperature fallback — automatic quality recovery via temperature escalation
  • Subtitle formatting — word highlighting, line width/count controls for SRT/VTT
  • Batch file processing — transcribe multiple files with error tolerance
  • stdin support — pipe audio from other processes
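
To make the "overlapping chunks" idea behind parallel chunk transcription concrete, here is a minimal sketch of how long audio might be split into overlapping time windows. The function name `chunk_spans` and the 30 s / 5 s defaults are assumptions for illustration only; they are not taken from this project's code, and the actual chunking logic may differ.

```python
def chunk_spans(total, chunk=30.0, overlap=5.0):
    """Split [0, total] seconds into windows of length `chunk` that overlap by `overlap`.

    Hypothetical helper for illustration; not part of the package's API.
    """
    total = float(total)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk > 0 and 0 <= overlap < chunk")
    spans = []
    start = 0.0
    step = chunk - overlap  # distance between consecutive window starts
    while start < total:
        end = min(start + chunk, total)
        spans.append((start, end))
        if end >= total:  # the last window reached the end of the audio
            break
        start += step
    return spans


# 70 s of audio, 30 s windows, 5 s overlap
print(chunk_spans(70, 30, 5))
```

Each window shares its first few seconds with the end of the previous one, so a word cut off at a boundary appears whole in at least one chunk. Merging the overlapping transcripts is a separate step that this sketch does not cover.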

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.10+
  • MLX 0.11+

For the latest changes, see the GitHub Releases page.
