libsonare

A dependency-free audio DSP toolkit for C++, Python, and the browser — librosa-compatible analysis plus broadcast-grade mastering, mixing, and editing.

Apache-2.0, zero runtime dependencies, one codebase for native and WebAssembly. The same processors that run in C++ run in the browser via WASM — no Python, no GPL/AGPL, no model weights.

📖 Documentation · 🎧 Browser-local Demos · Getting Started

Analysis (librosa-compatible) — BPM, key, chord (Viterbi/HMM smoothing, inversions, key-context), beat, downbeat, time signature, section, timbre, dynamics, pitch (YIN / pYIN), tempogram / PLP, NNLS chroma, EBU R128 loudness (LUFS), and room acoustics (blind RT60/EDT, or ISO-style RT60/EDT/C50/C80/D50 from a measured IR). Defaults match librosa and are validated against generated librosa reference values in CI.
Mastering (77 named DSP processors, 14 in the default chain) — EQ, dynamics, multiband, stereo, saturation, repair, maximizer, and reference matching, implemented against published references: ITU-R BS.1770-4 loudness and inter-sample true-peak limiting, Linkwitz-Riley crossovers with all-pass phase compensation, Vicanek matched-Z biquads, ADAA-antialiased clippers, a Dempwolf 12AX7 triode model for tube saturation, Lemire sliding max, and polyphase FIR oversampling. Repair is classical DSP by design (spectral subtraction / MMSE-STSA / LogMMSE), not DNN source separation or spectral repair.
Mixing & routing — a real-time-safe channel-strip / bus model (denormal-guarded, lock-free parameter changes, plugin-delay compensation) with pan modes, width, sends, FX buses, goniometer / true-peak metering, scene presets, and offline stereo rendering.
Editing & creative FX — time stretch / pitch shift, pitch correction, note-region stretch, voice-change pitch + formant controls, four reverb engines (convolution, Dattorro plate, FDN, velvet-noise), chorus / flanger / phaser, stereo delay, and ducking.
Everywhere, one license — Apache-2.0 across the entire stack (C++, C, Python, Node, WASM, and CLI).

Installation

npm install @libraz/libsonare         # JavaScript / TypeScript (WASM, takes Float32Array)
pip install libsonare                  # Python (WAV/MP3 — see "Supported audio formats" for M4A/AAC)

For Node.js with native file decoding, build @libraz/libsonare-native from source:

cd bindings/node
yarn install
yarn build  # auto-detects FFmpeg via pkg-config (WAV/MP3 if absent, +M4A/AAC/FLAC/OGG if present)

To force a specific mode:

SONARE_FFMPEG=0 yarn build  # explicitly disable FFmpeg
SONARE_FFMPEG=1 yarn build  # require FFmpeg (fails if dev libs missing)

Quick Start

JavaScript / TypeScript (WASM)

@libraz/libsonare accepts decoded Float32Array samples — use the Web Audio API or a JS decoder to obtain them. Mastering DSP is included in the default WASM build.

Analysis

import { init, detectBpm, detectKey, analyze } from '@libraz/libsonare';

await init();

const bpm = detectBpm(samples, sampleRate);
const key = detectKey(samples, sampleRate);  // { name: "C major", confidence: 0.95 }
const result = analyze(samples, sampleRate);

// Advanced key options are opt-in; defaults preserve existing behavior.
const keyWithOptions = detectKey(samples, sampleRate, {
  useHpss: true,
  loudnessWeighted: true,
  highPassHz: 80,
  nFft: 4096,
  hopLength: 512,
});

Room acoustics

import { analyzeImpulseResponse, detectAcoustic } from '@libraz/libsonare';

// Ordinary audio: blind RT60/EDT estimate. C50/C80/D50 are NaN in blind mode.
const blind = detectAcoustic(samples, sampleRate);

// Measured impulse response: ISO-style RT60/EDT plus clarity metrics.
const room = analyzeImpulseResponse(irSamples, sampleRate);

Rhythm & chords

import { analyze, detectDownbeats, detectChords } from '@libraz/libsonare';

const downbeats = detectDownbeats(samples, sampleRate);  // bar-start times (s)
const { timeSignature } = analyze(samples, sampleRate);  // { numerator: 4, denominator: 4 }

// Chord detection extras are all opt-in (defaults preserve existing behavior).
const chords = detectChords(samples, sampleRate, {
  useHmm: true,            // Viterbi/HMM temporal smoothing
  detectInversions: true,  // slash chords via detected bass note
  useKeyContext: true,     // bias toward in-key chords
  chromaMethod: 'nnls',    // NNLS chroma instead of plain STFT chroma
});

Tempogram, NNLS chroma & loudness

import {
  onsetEnvelope, tempogram, fourierTempogram, tempogramRatio, plp,
  nnlsChroma, lufs,
} from '@libraz/libsonare';

// Onset strength envelope feeds the tempo-domain features.
const env = onsetEnvelope(samples, sampleRate);
const tg = tempogram(env, sampleRate);          // { winLength, nFrames, data }
const ft = fourierTempogram(env, sampleRate);   // { nBins, nFrames, data }
const ratios = tempogramRatio(tg.data, tg.winLength, sampleRate);
const pulse = plp(env, sampleRate);             // predominant local pulse

const chroma = nnlsChroma(samples, sampleRate); // { nChroma: 12, nFrames, data }

// EBU R128 loudness metering (separate from the mastering loudness target).
const loud = lufs(samples, sampleRate);
// { integratedLufs, momentaryLufs, shortTermLufs, loudnessRange }

Mastering

import {
  init,
  masteringChain,
  masteringChainStereo,
  masteringPairAnalyze,
  masteringPairProcess,
  masteringPairProcessorNames,
  masteringProcess,
  masteringProcessorNames,
} from '@libraz/libsonare';

await init();

const mastered = masteringChain(samples, sampleRate, {
  eq: { tiltDb: 1.0 },
  dynamics: { compressor: { thresholdDb: -24, ratio: 1.5 } },
  saturation: { tape: { driveDb: 1.0, saturation: 0.2 } },
  loudness: { targetLufs: -14, ceilingDb: -1, truePeakOversample: 4 },
});

const stereo = masteringChainStereo(left, right, sampleRate, {
  stereo: { imager: { width: 1.1 }, monoMaker: { amount: 0.2 } },
  loudness: { targetLufs: -14, ceilingDb: -1, truePeakOversample: 4 },
});

// Apply a single named processor
const compressed = masteringProcess('dynamics.compressor', samples, sampleRate, {
  thresholdDb: -24,
  ratio: 1.5,
});

// Reference-based mastering
const matched = masteringPairProcess('match.abCrossfade', source, reference, sampleRate, {
  mix: 0.25,
});
const loudnessJson = masteringPairAnalyze(
  'match.referenceLoudness', source, reference, sampleRate,
);

// Discover available processors
masteringProcessorNames();     // ['dynamics.compressor', 'eq.parametric', ...]
masteringPairProcessorNames(); // ['match.abCrossfade', ...]

Preset-driven mastering and the block-by-block streaming variant are also exposed. WASM uses nested config for masteringChain / StreamingMasteringChain, while masterAudio overrides use flat dot-notation keys (mirroring the Node and Python overrides API):

// Mastering presets (one-shot) and streaming variant
import { masterAudio, masteringPresetNames, StreamingMasteringChain } from '@libraz/libsonare';
masteringPresetNames(); // ['pop', 'edm', 'acoustic', 'hipHop', 'aiMusic', 'speech', 'streaming', 'youtube', 'broadcast', 'podcast', 'audiobook', 'cinema', 'jpop', 'ambient', 'lofi', 'classical']
const out = masterAudio(samples, sampleRate, 'aiMusic', { 'loudness.targetLufs': -13 });

const chain = new StreamingMasteringChain({ eq: { tiltDb: 0.5 } });
chain.prepare(48000, 512, 1);
const block = chain.processMono(new Float32Array(512));
chain.delete();

Mixing

import { mixStereo, mixingScenePresetJson, mixingScenePresetNames } from '@libraz/libsonare';

mixingScenePresetNames(); // ['vocalReverbSend', ...]
const sceneJson = mixingScenePresetJson('vocalReverbSend');

const mix = mixStereo([vocalL, musicL], [vocalR, musicR], sampleRate, {
  faderDb: [-3, -12],
  pan: [0, -0.2],
  width: [1, 0.9],
});
// { left, right, meters }

DAW editing DSP

import { noteStretch, pitchCorrectToMidi, voiceChange } from '@libraz/libsonare';

const corrected = pitchCorrectToMidi(samples, sampleRate, 69, 70);
const stretchedNote = noteStretch(samples, sampleRate, 12000, 24000, 1.25);
const changed = voiceChange(samples, sampleRate, 5, 1.1);

Python

pip install libsonare ships a WAV/MP3-only wheel (matching librosa / pydub / soundfile conventions). For M4A/AAC/FLAC/OGG, either pre-convert the file with an external ffmpeg or rebuild from source with FFmpeg linked:

SONARE_FFMPEG=1 pip install libsonare --no-binary libsonare
# requires system FFmpeg dev libs: brew install ffmpeg / apt install libavformat-dev libavcodec-dev libavutil-dev libswresample-dev
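
For the pre-convert route, the following is a minimal sketch that shells out to the ffmpeg command-line tool from Python's standard library. The helper names build_ffmpeg_cmd and convert_to_wav are hypothetical and not part of libsonare, and the sketch assumes an ffmpeg executable is available on PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst):
    """Return an ffmpeg argument list that transcodes src to 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input file (M4A, FLAC, OGG, ...)
        "-vn",                # skip embedded cover art / video streams
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM audio
        str(dst),
    ]


def convert_to_wav(src):
    """Hypothetical helper: write a WAV next to src and return its path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    dst = Path(src).with_suffix(".wav")
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
    return dst
```

The returned path can then be passed to libsonare.Audio.from_file, for example libsonare.Audio.from_file(str(convert_to_wav("song.m4a"))).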

import libsonare

audio = libsonare.Audio.from_file("song.mp3")
print(f"BPM: {audio.detect_bpm()}, Key: {audio.detect_key()}")

# Advanced key options are opt-in; defaults preserve existing behavior.
key_with_options = audio.detect_key(
    use_hpss=True,
    loudness_weighted=True,
    high_pass_hz=80.0,
)

acoustic = audio.detect_acoustic()  # blind RT60/EDT; C50/C80/D50 are NaN
ir_params = libsonare.analyze_impulse_response(ir_samples, sample_rate=sr)

# Downbeats, time signature, and chord extras (all opt-in)
downbeats = audio.detect_downbeats()              # bar-start times (s)
time_signature = audio.analyze().time_signature   # e.g. 4/4
chords = audio.detect_chords(
    use_hmm=True,             # Viterbi/HMM temporal smoothing
    detect_inversions=True,   # slash chords via detected bass note
    use_key_context=True,     # bias toward in-key chords
    chroma_method="nnls",     # NNLS chroma instead of plain STFT chroma
)

# Tempogram / NNLS chroma / EBU R128 loudness
env = audio.onset_envelope()                     # onset strength envelope
n_frames, tg = libsonare.tempogram(env, sample_rate=sr)
n_frames_ft, ft = libsonare.fourier_tempogram(env, sample_rate=sr)
ratios = libsonare.tempogram_ratio(tg)
pulse = libsonare.plp(env, sample_rate=sr)

nf, chroma = audio.nnls_chroma()                 # (n_frames, 12 x n_frames row-major)

loud = audio.lufs()  # integrated_lufs / momentary_lufs / short_term_lufs / loudness_range
mom = audio.momentary_lufs()                     # per-block time series
short = audio.short_term_lufs()

# Mastering chain — returns MasteringResult(samples, sample_rate,
# input_lufs, output_lufs, applied_gain_db, latency_samples)
result = audio.mastering(target_lufs=-14.0, ceiling_db=-1.0)
print(f"{result.input_lufs:.1f} LUFS → {result.output_lufs:.1f} LUFS "
      f"(gain {result.applied_gain_db:+.2f} dB)")

# Single processor / reference matching
compressed = libsonare.mastering_process(
    "dynamics.compressor", samples, sample_rate=44100,
    params={"thresholdDb": -24, "ratio": 1.5},
)
loudness_json = libsonare.mastering_pair_analyze(
    "match.referenceLoudness", source, reference, sample_rate=44100,
)

# Discover available processors
libsonare.mastering_processor_names()       # ['dynamics.compressor', ...]
libsonare.mastering_pair_processor_names()  # ['match.abCrossfade', ...]

# Preset-based chain (one-shot) + streaming
libsonare.mastering_preset_names()  # ['pop', 'edm', 'acoustic', 'hipHop', 'aiMusic', 'speech', 'streaming', 'youtube', 'broadcast', 'podcast', 'audiobook', 'cinema', 'jpop', 'ambient', 'lofi', 'classical']
result = libsonare.master_audio(samples, sample_rate=sr, preset='aiMusic',
                                 overrides={'loudness.targetLufs': -13})

with libsonare.StreamingMasteringChain({'eq.tilt.tiltDb': 0.5}) as chain:
    chain.prepare(44100, 512, 1)
    out = chain.process_mono([0.0] * 512)

# Mixing presets and offline stereo rendering
libsonare.mixing_scene_preset_names()  # ['vocalReverbSend', ...]
scene_json = libsonare.mixing_scene_preset_json("vocalReverbSend")
mix = libsonare.mix_stereo(
    [(vocal_l, vocal_r), (music_l, music_r)],
    sample_rate=sr,
    fader_db=[-3.0, -12.0],
    pan=[0.0, -0.2],
    width=[1.0, 0.9],
)
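
The overrides and streaming-chain dictionaries above use flat dot-notation keys such as 'loudness.targetLufs'. If settings are kept in a nested structure elsewhere in an application, a small helper can produce the flat form. This is a sketch only: flatten_overrides is a hypothetical name, not a libsonare function, and the keys shown are the two that appear in the examples above.

```python
def flatten_overrides(config, prefix=""):
    """Flatten a nested dict into a dict with dot-notation keys."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-sections, extending the dotted path.
            flat.update(flatten_overrides(value, path))
        else:
            flat[path] = value
    return flat


nested = {"eq": {"tilt": {"tiltDb": 0.5}}, "loudness": {"targetLufs": -13}}
overrides = flatten_overrides(nested)
print(overrides)  # {'eq.tilt.tiltDb': 0.5, 'loudness.targetLufs': -13}
```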

Python CLI

pip install libsonare

# Analysis
sonare analyze song.mp3
# > Estimated BPM : 161.00 BPM  (conf 75.0%)
# > Estimated Key : C major  (conf 100.0%)

sonare bpm song.mp3 --json

# Extended analysis (parity with the C++ CLI)
sonare acoustic room.wav --json          # blind RT60/EDT (add --ir for IR-based clarity metrics)
sonare lufs song.wav --series            # EBU R128 integrated/momentary/short-term
sonare rhythm song.wav --json
sonare dynamics song.wav --json
sonare timbre song.wav --json
sonare tempogram song.wav --json
sonare nnls-chroma song.wav --json

# Mastering
sonare mastering song.wav -o mastered.wav --target-lufs -14
sonare mastering-processor song.wav --processor dynamics.compressor \
    --params thresholdDb=-24,ratio=1.5 -o compressed.wav
sonare mastering-pair-analyze source.wav --reference reference.wav \
    --analysis match.referenceLoudness
sonare mastering-processors                 # list available processors

# Mixing
sonare mixing-presets
sonare mixing-preset --preset vocalReverbSend
sonare mix input.wav -o mix.wav --fader-db -3 --pan 0.1 --pan-mode stereo-pan --width 1.1

# DAW editing DSP
sonare pitch-correct vocal.wav --current-midi 69 --target-midi 70 -o corrected.wav
sonare note-stretch vocal.wav --onset 12000 --offset 24000 --ratio 1.25 -o stretched.wav
sonare voice-change vocal.wav --pitch-semitones 5 --formant-factor 1.1 -o changed.wav
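
The MIDI and semitone arguments above map to frequencies and ratios through standard equal-temperament arithmetic. The sketch below assumes A4 = MIDI 69 = 440 Hz; whether libsonare uses the same tuning reference internally is an assumption, and the function names are illustrative.

```python
A4_MIDI = 69
A4_HZ = 440.0  # assumed concert pitch


def midi_to_hz(note):
    """Frequency of a MIDI note number in 12-tone equal temperament."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def semitones_to_ratio(semitones):
    """Frequency ratio corresponding to a shift of the given semitones."""
    return 2.0 ** (semitones / 12.0)


print(round(midi_to_hz(69), 2))         # 440.0
print(round(midi_to_hz(70), 2))         # 466.16
print(round(semitones_to_ratio(5), 4))  # 1.3348
```

So correcting from MIDI 69 to 70 raises the pitch by one semitone (about 5.9%), and a shift of 5 semitones multiplies frequencies by roughly 1.33.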

C++

#include <iostream>                  // std::cout
#include <sonare/sonare.h>           // analysis + features + effects
#include <sonare/mastering/master.h> // mastering chain & processors

auto audio = sonare::Audio::from_file("music.mp3");
auto result = sonare::MusicAnalyzer(audio).analyze();
std::cout << "BPM: " << result.bpm
          << ", Key: " << result.key.to_string() << std::endl;

Features

Analysis (librosa-compatible)

Music	Spectral / Pitch	Streaming
BPM / Tempo	STFT / iSTFT	Real-time analyzer
Key Detection	Mel Spectrogram	Incremental BPM
Beat / Downbeat tracking	MFCC	Incremental key
Time signature / meter	Chroma / NNLS chroma	Onset events
Chord (HMM / inversions)	CQT / VQT
Section Detection	Tempogram / PLP
Timbre / Dynamics	Spectral Features
Pitch (YIN / pYIN)	Onset Detection
RT60 / EDT / C50	Room acoustics
Loudness (EBU R128 LUFS)	Onset envelope
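
For readers unfamiliar with the clarity metrics in the table, C50 and D50 are early-to-late energy ratios of a measured impulse response. The sketch below applies the textbook definitions to a synthetic decay; it is not libsonare's implementation, and it assumes the impulse response has already been trimmed so that index 0 is the direct sound and that it is longer than 50 ms.

```python
import math


def clarity_metrics(ir, sample_rate, split_ms=50.0):
    """Return (clarity in dB, definition as a fraction) for a trimmed IR."""
    split = int(round(sample_rate * split_ms / 1000.0))
    early = sum(x * x for x in ir[:split])  # energy before the split time
    late = sum(x * x for x in ir[split:])   # energy after the split time
    c_db = 10.0 * math.log10(early / late)
    d = early / (early + late)
    return c_db, d


# Synthetic IR: exponential decay whose level falls 60 dB in 0.5 s (RT60 = 0.5 s).
sr = 8000
rt60 = 0.5
ir = [10.0 ** (-3.0 * (n / sr) / rt60) for n in range(sr)]

c50, d50 = clarity_metrics(ir, sr)
print(f"C50 = {c50:.1f} dB, D50 = {d50:.2f}")  # C50 = 4.7 dB, D50 = 0.75
```

Passing split_ms=80.0 gives the C80 variant. Blind mode has no measured impulse response to split this way, which is consistent with the clarity values being reported as NaN there.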

Mastering (70+ DSP processors)

Dynamics	EQ	Multiband / Stereo
Compressor	Parametric / Graphic	Multiband comp / EQ / limiter
Limiter / Brickwall	Linear / Minimum phase	Linkwitz-Riley crossover (phase-comp)
Expander / Gate	Dynamic EQ	Stereo imager / M-S / Haas
De-esser	Passive / stepped EQ	Phase align / mono maker / compat
Transient shaper	Tilt / shelving

Saturation / Repair	Maximizer / Match	Building blocks
Tube (Dempwolf 12AX7) / Tape	True-peak limiter (ITU-R BS.1770-4)	Polyphase FIR oversampler
Transformer / Exciter / Bitcrusher	Loudness optimizer (LUFS target)	ADAA-antialiased shaping
Declick / Declip / Decrackle	Adaptive release	Vicanek matched-Z biquads
Denoise / Dereverb / Dehum	Reference EQ / loudness / spectrum	Partitioned convolver

Repair is classical DSP by design. denoise_classical covers spectral subtraction, MMSE-STSA, and LogMMSE with explicit noise estimation; DNN restoration, source separation, and interactive spectral repair are out of scope.
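
To illustrate what "classical" means here, the following is a generic sketch of power spectral subtraction for a single STFT frame. It is a textbook formulation with assumed parameter names (alpha, floor), not the code behind denoise_classical.

```python
import math


def spectral_subtraction_gains(power, noise_power, alpha=1.0, floor=0.01):
    """Per-bin amplitude gains for power spectral subtraction.

    power:       |X[k]|^2 of the noisy frame
    noise_power: estimated noise power per bin
    alpha:       over-subtraction factor
    floor:       minimum fraction of the noisy power kept in each bin
    """
    gains = []
    for p, n in zip(power, noise_power):
        if p <= 0.0:
            gains.append(0.0)
            continue
        # Subtract the noise estimate, but never go below the spectral floor.
        clean = max(p - alpha * n, floor * p)
        gains.append(math.sqrt(clean / p))
    return gains


noisy = [4.0, 1.0, 0.25]  # bin powers of one frame
noise = [1.0, 1.0, 1.0]   # flat noise estimate
gains = spectral_subtraction_gains(noisy, noise)
print([round(g, 3) for g in gains])  # [0.866, 0.1, 0.1]
```

Each gain multiplies the complex STFT bin before the inverse transform. The floor keeps bins from being zeroed outright, which is the usual way to limit "musical noise" artifacts.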

EQ phase modes preserve existing coefficient defaults: Zero Latency keeps RBJ biquads for compatibility, while Natural Phase resolves bands through Vicanek matched-Z IIR. High-frequency shelf designs fall back to RBJ when the Vicanek endpoint gain error exceeds the fixed tolerance.
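
As background for the RBJ side of that comparison, here is a sketch of the peaking-EQ biquad from the widely used RBJ Audio EQ Cookbook, with a check that the response at the center frequency equals the requested gain. It is a generic illustration, not libsonare's code; the Vicanek matched design is not shown.

```python
import cmath
import math


def rbj_peaking(f0, gain_db, q, sample_rate):
    """RBJ cookbook peaking EQ; returns normalized (b, a) coefficients."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    a0 = a[0]
    return [x / a0 for x in b], [x / a0 for x in a]


def magnitude_db(b, a, freq, sample_rate):
    """Magnitude response of a biquad at freq, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


b, a = rbj_peaking(f0=1000.0, gain_db=6.0, q=1.0, sample_rate=48000.0)
print(round(magnitude_db(b, a, 1000.0, 48000.0), 3))  # 6.0
```

RBJ designs are derived through the bilinear transform, which compresses the response shape as bands approach Nyquist. Matched designs aim to track the analog prototype more closely in that region.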

Mastering is built by default (BUILD_MASTERING=ON). Disable with cmake -DBUILD_MASTERING=OFF to ship analysis-only builds.

Mixing / routing

Channel strips	Routing / scene API	Metering / QA
Input trim / fader / polarity	Sends and FX buses	Peak / RMS / true peak
Balance / stereo / dual pan	Bus inserts and graph PDC	Correlation / mono width
Width and gain automation	C / Node / Python / WASM / CLI	Golden hashes and RT tests
Insert hosting and sidechain keys	Persistent scene mixers	No-allocation process checks

Mixing is built by default (BUILD_MIXING=ON) and depends on the mastering processor interfaces for insert hosting. Disable with cmake -DBUILD_MIXING=OFF for analysis/mastering-only builds.
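
The fader and width controls listed above come down to simple per-sample arithmetic. The sketch below shows the standard dB-to-linear conversion and a common mid/side width formulation. The exact pan law and width implementation in libsonare are not specified here, so treat the width function as an assumption.

```python
def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def apply_width(left, right, width):
    """Scale the side (L - R) component: 0 = mono, 1 = unchanged, >1 = wider."""
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r) * width
        out_l.append(mid + side)
        out_r.append(mid - side)
    return out_l, out_r


print(round(db_to_gain(-6.0), 4))      # 0.5012
print(apply_width([1.0], [0.0], 0.0))  # ([0.5], [0.5])
print(apply_width([1.0], [0.0], 1.0))  # ([1.0], [0.0])
```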

Performance

Analysis runs natively in C++ and uses multi-threading where it helps (HPSS median filtering, the full analyze() pipeline). On the benchmark fixture, iterative algorithms such as HPSS and pYIN, and the full pipeline, are meaningfully faster than the equivalent librosa calls; single FFT-bound features (STFT, Mel, MFCC) are roughly on par. WebAssembly is single-threaded, so the multi-threaded gains do not apply there.

Mastering DSP uses polyphase FIR oversampling for ITU-R BS.1770-4 true-peak limiting, antiderivative anti-aliasing (ADAA) for clipping and waveshaping stages, and Eigen for SIMD-friendly linear algebra on hot paths.
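
Why oversampling matters for peak measurement: a signal's samples can all sit well below the peak of the waveform they represent. The sketch below demonstrates this with direct Hann-windowed sinc interpolation at 4x. It is a slow illustrative stand-in, not a polyphase implementation, not the filter specified in ITU-R BS.1770-4, and not libsonare's code.

```python
import math


def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def oversampled_peak(samples, factor=4, half_width=16):
    """Estimate the inter-sample peak via windowed-sinc interpolation."""
    peak = 0.0
    for i in range(half_width, len(samples) - half_width):
        for phase in range(factor):
            t = i + phase / factor  # fractional position, in input samples
            acc = 0.0
            for k in range(i - half_width + 1, i + half_width + 1):
                d = t - k
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                acc += samples[k] * sinc(d) * window
            peak = max(peak, abs(acc))
    return peak


# Full-scale sine at fs/4 with a 45 degree phase offset: every sample lands
# at +/-0.707 while the underlying waveform peaks at 1.0 between samples.
signal = [math.sin(0.5 * math.pi * n + 0.25 * math.pi) for n in range(64)]
sample_peak = max(abs(x) for x in signal)
true_peak = oversampled_peak(signal)
print(round(sample_peak, 3))  # 0.707
print(true_peak > sample_peak)  # True
```

Here the interpolated estimate comes out close to 1.0, about 3 dB above the sample peak, which is the margin a sample-peak limiter would miss.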

See Benchmarks for the methodology and per-feature numbers.

librosa Compatibility

Default parameters match librosa:

Sample rate: 22050 Hz
n_fft: 2048, hop_length: 512, n_mels: 128
fmin: 0.0, fmax: sr/2
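
These defaults translate into concrete time and frequency resolution. The arithmetic below uses only the values listed above; the frame-count formula assumes librosa's centered-STFT convention (center=True), which is an assumption about libsonare's padding behavior.

```python
SR = 22050
N_FFT = 2048
HOP = 512


def frame_count(n_samples, hop_length=HOP):
    """Frames produced by a centered STFT (librosa's center=True convention)."""
    return 1 + n_samples // hop_length


print(N_FFT // 2 + 1)              # 1025 frequency bins
print(round(SR / N_FFT, 2))        # 10.77 Hz between bins
print(round(1000 * HOP / SR, 2))   # 23.22 ms per hop
print(SR / 2)                      # 11025.0 Hz default fmax
print(frame_count(10 * SR))        # 431 frames for 10 s of audio
```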

Supported audio formats

Format	Default¹	With FFmpeg²	WASM (`@libraz/libsonare`)
WAV (PCM 16/24/32, float32)	✅	✅	n/a (samples in)
MP3	✅	✅	n/a
M4A / AAC / FLAC / OGG / Opus / WMA / ...	❌ (clear error message)	✅	n/a (use Web Audio API)

¹ Default: PyPI wheel (pip install libsonare) and source builds where FFmpeg dev libs are not present. PyPI wheels are deterministically pinned to this mode so installation never depends on the user's libavformat.

² With FFmpeg: source build with FFmpeg linked. CMake auto-detects via pkg-config (-DSONARE_WITH_FFMPEG=AUTO, the default for make build), and you can force on/off with -DSONARE_WITH_FFMPEG=ON/OFF. Python equivalent: SONARE_FFMPEG=1 pip install libsonare --no-binary libsonare. Node native: SONARE_FFMPEG=1 yarn build.

WASM does not bundle a file decoder by design; pass Float32Array samples obtained from the Web Audio API or another JS decoder.

Build from Source

# Native (auto-detects FFmpeg; mastering and mixing on by default)
make build && make test

# Analysis-only (smaller binary)
cmake -B build -DBUILD_MASTERING=OFF -DBUILD_MIXING=OFF && cmake --build build

# WebAssembly (mastering included)
make wasm

# Release (optimized)
make release

Documentation

Full docs and browser-local demos: libsonare.libraz.net (demos).

Learn first

Introduction · Getting Started · Installation · Examples

API by runtime

Browser / WASM · JavaScript · Python · Node.js Native · C++ · CLI

Build by task

Mastering Processors · Mixing Engine · Editing DSP · Realtime & Streaming · Room Acoustics

Understand the details

Architecture · librosa Compatibility · Benchmarks · Glossary

Non-goals

libsonare intentionally does not include:

Plugin-grade creative instruments/effects — use Tone.js, a plugin host, or your DAW
Audio synthesis (oscillators, samplers, MIDI playback) — out of scope
Real-time I/O abstraction (PortAudio/JACK wrappers) — callers handle I/O
DAW workflow (plugin host, automation, MIDI editing) — different product category
Deep-learning models (no bundled weights, no inference runtime) — keeps the library dependency-free and Apache-2.0 pure

These boundaries keep the library focused on analysis, mastering, and mixing DSP, and they are what make the zero-dependency guarantee sustainable.

License

Apache-2.0
