2 changes: 2 additions & 0 deletions examples/340-tauri-live-transcription-rust-ts/.env.example
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Deepgram — https://console.deepgram.com/
DEEPGRAM_API_KEY=
57 changes: 57 additions & 0 deletions examples/340-tauri-live-transcription-rust-ts/README.md
@@ -0,0 +1,57 @@
# Tauri Desktop Live Transcription

A cross-platform desktop app built with Tauri v2 that captures microphone audio, streams it to Deepgram via WebSocket for real-time transcription, and displays live captions. Tauri's Rust backend handles the Deepgram connection using the official Rust SDK, while the TypeScript frontend captures audio and renders the UI.

## What you'll build

A Tauri desktop application with two parts: a Rust backend that connects to Deepgram's live speech-to-text (STT) WebSocket using the Deepgram Rust SDK, and a TypeScript frontend that captures microphone audio at 16 kHz, streams it to the backend via Tauri commands, and displays rolling live captions with interim and final results.
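
The "rolling captions with interim and final results" behaviour can be pictured as a small piece of state: final segments accumulate, while the current interim segment is overwritten by each new partial result. The sketch below is illustrative only. The `TranscriptEvent` shape, the `applyTranscript` and `renderCaption` helpers, and the `maxFinals` limit are all hypothetical names, not taken from this example's source.

```typescript
// Hypothetical event shape; the payload this example actually emits may differ.
interface TranscriptEvent {
  text: string;
  isFinal: boolean;
}

// Finalised segments plus the single in-progress (interim) segment.
interface CaptionState {
  finals: string[];
  interim: string;
}

// Returns a new state: a final result is appended and clears the interim
// text; an interim result only replaces the previous interim text.
function applyTranscript(
  state: CaptionState,
  event: TranscriptEvent,
  maxFinals: number = 50 // assumed cap so the caption "rolls"
): CaptionState {
  if (event.isFinal) {
    const finals =
      event.text.length > 0 ? state.finals.concat(event.text) : state.finals;
    return { finals: finals.slice(-maxFinals), interim: "" };
  }
  return { finals: state.finals, interim: event.text };
}

// Joins everything into one caption string (styling is left to the UI).
function renderCaption(state: CaptionState): string {
  return state.finals
    .concat(state.interim)
    .filter((segment) => segment.length > 0)
    .join(" ");
}

let captionState: CaptionState = { finals: [], interim: "" };
captionState = applyTranscript(captionState, { text: "hello wor", isFinal: false });
const afterInterim = renderCaption(captionState);
captionState = applyTranscript(captionState, { text: "Hello world.", isFinal: true });
captionState = applyTranscript(captionState, { text: "how are", isFinal: false });
const afterSecondInterim = renderCaption(captionState);
```

Keeping the update as a pure function makes it easy to wire into whatever event listener the frontend uses, and to render `finals` and `interim` with different styles.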

## Prerequisites

- [Rust](https://www.rust-lang.org/tools/install) 1.70+
- [Node.js](https://nodejs.org/) 18+
- System WebView (WebKitGTK on Linux, WebView2 on Windows, WebKit on macOS) — see [Tauri prerequisites](https://v2.tauri.app/start/prerequisites/)
- Deepgram account — [get a free API key](https://console.deepgram.com/)

## Environment variables

| Variable | Where to find it |
|----------|-----------------|
| `DEEPGRAM_API_KEY` | [Deepgram console](https://console.deepgram.com/) |

## Install and run

```bash
cp .env.example .env
# Add your DEEPGRAM_API_KEY to .env

cd src
npm install
npm run tauri dev
```

## Key parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| `model` | `nova-3` | Deepgram's Nova-3 speech-to-text model |
| `encoding` | `linear16` | 16-bit PCM audio from the microphone |
| `sample_rate` | `16000` | 16 kHz — good balance of quality and bandwidth |
| `interim_results` | `true` | Show partial transcripts as the user speaks |
| `smart_format` | `true` | Auto-capitalisation, numbers, and punctuation |
| `utterance_end_ms` | `1500` | Send an `UtteranceEnd` message once 1.5 s pass with no new words (requires `interim_results`) |
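
In this example the Rust SDK assembles the request, so you do not build the URL yourself. As a rough illustration of what the table means on the wire, the sketch below maps the same values onto query-string parameters of Deepgram's live endpoint (`wss://api.deepgram.com/v1/listen`). The `LiveOptions` interface and the `buildListenUrl` helper are hypothetical names introduced for this illustration.

```typescript
// Deepgram's live transcription endpoint.
const LIVE_ENDPOINT = "wss://api.deepgram.com/v1/listen";

// Hypothetical type mirroring the "Key parameters" table.
interface LiveOptions {
  model: string;
  encoding: string;
  sample_rate: number;
  interim_results: boolean;
  smart_format: boolean;
  utterance_end_ms: number;
}

const liveOptions: LiveOptions = {
  model: "nova-3",
  encoding: "linear16",
  sample_rate: 16000,
  interim_results: true,
  smart_format: true,
  utterance_end_ms: 1500,
};

// Serialises each option as a query-string parameter, in declaration order.
function buildListenUrl(options: LiveOptions): string {
  const keys = Object.keys(options) as (keyof LiveOptions)[];
  const query = keys
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(options[key])))
    .join("&");
  return LIVE_ENDPOINT + "?" + query;
}

const listenUrl = buildListenUrl(liveOptions);
```

The API key is not part of the URL; it is sent separately as a credential, which is why it lives in `.env` rather than in this table.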

## How it works

1. The Tauri app starts with a Rust backend and a web-based frontend rendered in the system WebView
2. When you click **Start**, the TypeScript frontend requests microphone access via `getUserMedia` at 16 kHz
3. A `ScriptProcessorNode` (deprecated in favour of `AudioWorklet`, but simple to set up) captures raw PCM audio, and the float32 samples are converted to signed 16-bit linear PCM
4. Audio chunks are sent to the Rust backend via Tauri's `invoke("send_audio", ...)` IPC
5. The Rust backend connects to Deepgram's live STT WebSocket using the official `deepgram` Rust crate with `transcription().stream_request_with_options(...).handle()`
6. A `tokio::select!` loop multiplexes audio forwarding and transcript receiving on the same `WebsocketHandle`
7. Transcript events (interim and final) are emitted back to the frontend via Tauri's event system (`app.emit("transcript", ...)`)
8. The frontend renders rolling captions with final text in white and interim text in grey
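
The float32-to-`linear16` conversion in step 3 can be sketched as follows. This is a minimal illustration of the usual technique (clamp to [-1, 1], then scale to the signed 16-bit range), not necessarily the code in this example's `main.ts`. The `floatTo16BitPCM` name is hypothetical.

```typescript
// Converts Web Audio float samples (nominally in [-1, 1]) to signed 16-bit PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale by 32768, positive by 32767, so both ends fit.
    // Storing into an Int16Array truncates the fractional part toward zero.
    pcm[i] = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
  }
  return pcm;
}

const pcmSamples = floatTo16BitPCM(new Float32Array([0, 1, -1, 0.5, -0.5, 2]));
```

Each chunk produced this way is what gets handed to `invoke("send_audio", ...)` in step 4. The exact argument shape of that command is not shown in this README, so how the `Int16Array` is serialised for IPC is left open here.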

## Starter templates

[deepgram-starters](https://github.com/orgs/deepgram-starters/repositories)
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
deepgram-sdk==6.1.1
115 changes: 115 additions & 0 deletions examples/340-tauri-live-transcription-rust-ts/src/index.html
@@ -0,0 +1,115 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Deepgram Live Transcription</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }

body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
background: #0f0f14;
color: #e0e0e0;
min-height: 100vh;
display: flex;
flex-direction: column;
align-items: center;
padding: 32px 24px;
}

h1 {
font-size: 22px;
font-weight: 600;
margin-bottom: 8px;
color: #ffffff;
}

.subtitle {
font-size: 13px;
color: rgba(255, 255, 255, 0.4);
margin-bottom: 24px;
}

#transcript-container {
background: rgba(255, 255, 255, 0.05);
border: 1px solid rgba(255, 255, 255, 0.1);
border-radius: 12px;
padding: 20px 24px;
width: 100%;
max-width: 640px;
min-height: 200px;
flex: 1;
overflow-y: auto;
}

#transcript {
font-size: 18px;
line-height: 1.6;
color: #ffffff;
}

#transcript .interim {
color: rgba(255, 255, 255, 0.4);
}

#controls {
display: flex;
gap: 12px;
margin-top: 20px;
align-items: center;
}

button {
background: rgba(255, 255, 255, 0.1);
border: 1px solid rgba(255, 255, 255, 0.2);
color: #fff;
padding: 10px 24px;
border-radius: 8px;
cursor: pointer;
font-size: 14px;
font-weight: 500;
transition: all 0.15s ease;
}

button:hover:not(:disabled) { background: rgba(255, 255, 255, 0.18); }
button:disabled { opacity: 0.35; cursor: not-allowed; }
button.active { background: rgba(19, 239, 147, 0.25); border-color: #13ef93; }

.status {
font-size: 12px;
color: rgba(255, 255, 255, 0.35);
margin-left: 8px;
}

.status.connected { color: #13ef93; }
.status.error { color: #ff6b6b; }
.status.disconnected { color: rgba(255, 255, 255, 0.35); }

.logo {
display: flex;
align-items: center;
gap: 8px;
margin-bottom: 4px;
}
</style>
</head>
<body>
<div class="logo">
<h1>Deepgram Live Transcription</h1>
</div>
<p class="subtitle">Tauri + Rust + Deepgram Nova-3</p>

<div id="transcript-container">
<div id="transcript">Click Start to begin transcription...</div>
</div>

<div id="controls">
<button id="btn-start">Start</button>
<button id="btn-stop" disabled>Stop</button>
<span id="status" class="status">disconnected</span>
</div>

<script type="module" src="/src/main.ts"></script>
</body>
</html>
19 changes: 19 additions & 0 deletions examples/340-tauri-live-transcription-rust-ts/src/package.json
@@ -0,0 +1,19 @@
{
"name": "deepgram-tauri-live-transcription",
"private": true,
"version": "0.1.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc && vite build",
"tauri": "tauri"
},
"dependencies": {
"@tauri-apps/api": "2.2.0"
},
"devDependencies": {
"@tauri-apps/cli": "2.2.0",
"typescript": "5.7.3",
"vite": "6.0.0"
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
[package]
name = "deepgram-tauri-live-transcription"
version = "0.1.0"
edition = "2021"

[dependencies]
deepgram = "0.9.1"
dotenv = "0.15"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tauri = { version = "2", features = [] }
tokio = { version = "1", features = ["full"] }

[build-dependencies]
tauri-build = { version = "2", features = [] }
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
fn main() {
tauri_build::build()
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"identifier": "default",
"description": "Capability for the main window",
"windows": ["main"],
"permissions": [
"core:default"
]
}