Video Subtitle Remover Pro

Professional AI-powered tool for removing hard-coded subtitles from videos and images

Overview

Video Subtitle Remover Pro uses real AI neural networks to remove hard-coded subtitles and text watermarks from videos and images. Unlike simple blur or crop methods, it intelligently fills in removed areas with content that matches the surrounding video.

Based on YaoFANGUK/video-subtitle-remover, enhanced with a professional interface, real LaMa inpainting, multi-engine detection, and support for roughly 50 languages.

Features

Real Video Inpainting -- Temporal Background Exposure (TBE) reconstructs the true background from neighbouring frames where the subtitle is absent. No external model weight downloads required.
Real AI Inpainting -- LaMa neural network for still-frame and residual refinement (via simple-lama-inpainting)
AUTO Inpaint Routing -- Per-batch routing between TBE and LaMa based on exposure score
Multi-Engine Detection -- Automatic fallback chain: RapidOCR (ONNX PP-OCR, 4-5x faster than PaddleOCR, leak-free) > PaddleOCR > Surya (GPL opt-in) > EasyOCR > OpenCV
Lossless Pipeline -- FFV1 lossless intermediate (only the final encode is lossy) for noticeably cleaner outputs than the legacy mp4v intermediate
HEVC + AV1 Output -- Pick H.264 / H.265 / AV1 from a dropdown; NVENC/QSV/AMF for HW encoding, libx265 / libsvtav1 software fallback
Multi-region Masks -- Draw multiple subtitle rects on a scrubbable video frame; backend honours every rect
Inpaint Preview -- "Preview cleanup" button runs detect + inpaint on the selected frame so you can A/B settings before committing
Seamless Boundaries -- Gaussian alpha feathering at every inpaint boundary, no visible cut lines
~50 Language Support -- English / Chinese / Japanese / Korean / European, plus Thai, Vietnamese, Polish, Greek, Ukrainian, Filipino, Hebrew, Czech, and more
GPU Acceleration -- NVIDIA CUDA, AMD/Intel DirectML, hardware-decode hints (D3D11 / VAAPI / MFX), CPU fallback
Subtitle Region Selector -- Scrub to any frame and draw one or more rectangles
Batch Processing -- Queue files or drag entire folders; per-item cancellation
Multi-track Audio + Loudness Normalisation -- Pass through every audio track on Blu-ray rips; optional per-stream EBU R128 normalisation to LUFS targets (YouTube -14, Apple -16, broadcast -23)
Quality Self-Test -- PSNR / SSIM report with an ROI-cropped metric (measures the inpaint region, not the unchanged background) and an optional side-by-side comparison PNG
CLI + Presets -- python -m backend.processor --pattern ... --preset "YouTube (default)"; six built-in presets + user presets persisted to %APPDATA%
Chyron vs Subtitle Filter -- Keep persistent text (logos, lower-thirds) and remove dialogue, or vice versa
Karaoke Grouping -- Per-syllable boxes fuse into a single line mask so highlighted lyrics do not leak through the gaps
Live Preview During Processing -- 15 FPS throttled preview piped from the backend worker
Pre-batch ETA Estimate -- 30-frame detect probe seeds the ETA so users see "about X left" from the very first frame
Crash-Resume Checkpointing -- SHA-256 input fingerprint per file; re-running a glob skips finished work
Premium Dark UI -- Cohesive design system with custom sliders, toggles, status chips, taskbar progress, onboarding modal
Settings Persistence -- All knobs saved/restored between sessions; versioned schema with backfill migration
CI/CD Releases -- Automated Windows builds via GitHub Actions, pip-audit dependency scan included
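
The Crash-Resume Checkpointing entry above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's code: the helper names `fingerprint`, `should_skip`, and `mark_done` and the in-memory `checkpoint` dict are hypothetical. Only the idea of a SHA-256 fingerprint per input file comes from the feature list.

```python
import hashlib


def fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_skip(path, checkpoint):
    """True when this exact file content is already recorded as finished.

    `checkpoint` is a hypothetical dict mapping fingerprint -> output path.
    """
    return fingerprint(path) in checkpoint


def mark_done(path, output, checkpoint):
    """Record a finished file so a re-run of the same glob can skip it."""
    checkpoint[fingerprint(path)] = str(output)
```

Keying on the content hash rather than the file name means a renamed input is still recognised as finished, while an edited input with the same name is processed again.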

System Requirements

Component	Minimum	Recommended
OS	Windows 10	Windows 11
CPU	Intel i5 / AMD Ryzen 5	Intel i7 / AMD Ryzen 7
RAM	8 GB	16+ GB
GPU	Any (CPU mode)	NVIDIA RTX 2060+
VRAM	-	6+ GB
Python	3.10	3.12

Installation

Quick Install

Download or clone this repository
Double-click Run_VSR_Pro.bat. The first run automatically:
- Creates a virtual environment
- Detects your GPU and installs appropriate packages
- Installs PaddleOCR, EasyOCR, and LaMa inpainting
- Launches the application
Use Run_VSR_Pro_Debug.bat if you want the same bootstrap flow with a visible console for troubleshooting

Manual Install

cd VideoSubtitleRemover

# Create virtual environment
python -m venv venv
.\venv\Scripts\activate

# Install PyTorch (choose one):
# NVIDIA:
pip install torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu118
# CPU:
pip install torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cpu

# Install dependencies
pip install -r requirements.txt

# Run
python VideoSubtitleRemover.py

FFmpeg (Required for audio)

winget install ffmpeg

Validation

python -m unittest discover -s tests -v

Usage

Launch via Run_VSR_Pro.bat
Add files — Click to browse, press Ctrl+O, right-click for folders, or drag & drop
Select algorithm — LAMA (recommended), STTN, or ProPainter
Set language if subtitles are non-English
Optionally set region — Click "Set Region" to draw a rectangle on the subtitle area
Start Processing and monitor progress
Select a queue item to preview it, use Review mask to confirm detection, and double-click the preview for a larger source frame

Algorithm Comparison

Algorithm	Inpainting Engine	Speed	Quality	Best For
STTN	Temporal Background Exposure	Fastest	Great	Live-action video with changing subtitles (default)
LAMA	Neural (LaMa)	Medium	Best still-frame	Images, animations, static backgrounds
ProPainter	TBE + LaMa refinement	Slowest	Best motion	Motion-heavy footage, thick/decorative text

All three modes now do real inpainting. STTN recovers the literal background from adjacent frames where the subtitle is absent -- this works because hard-coded subtitles are sparse in time, and the pixels behind them are revealed whenever the text changes or disappears. LAMA is a single-frame neural fill. ProPainter is a hybrid: TBE reconstructs the background, then LaMa refines any residual.
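
The temporal idea described above can be illustrated with a few lines of NumPy. This is a simplified sketch, not the project's TBE implementation: it averages each pixel over the frames where the pixel is unmasked, and the function names `temporal_background` and `fill_frame` are hypothetical. The `min_coverage` parameter mirrors the TBE Coverage setting (default 3) listed under Advanced Settings.

```python
import numpy as np


def temporal_background(frames, masks, min_coverage=3):
    """Estimate the background behind subtitles from a batch of frames.

    frames: (T, H, W, C) array. masks: (T, H, W) bool, True where text was detected.
    Returns (background, trusted). background is the per-pixel mean over frames
    where the pixel was unmasked; trusted marks pixels exposed in at least
    min_coverage frames.
    """
    frames = np.asarray(frames, dtype=np.float32)
    exposed = ~np.asarray(masks, dtype=bool)
    coverage = exposed.sum(axis=0)                       # (H, W) exposure count
    total = (frames * exposed[..., None]).sum(axis=0)    # masked samples add 0
    background = total / np.maximum(coverage, 1)[..., None]
    return background, coverage >= min_coverage


def fill_frame(frame, mask, background, trusted):
    """Replace masked pixels that have a trusted background estimate.

    Returns the filled frame and a residual mask of pixels that were never
    exposed often enough (candidates for a single-frame fill such as LaMa).
    """
    out = frame.astype(np.float32).copy()
    use = mask & trusted
    out[use] = background[use]
    return out.astype(frame.dtype), mask & ~trusted
```

The residual mask returned by `fill_frame` corresponds to the case the hybrid mode hands to LaMa: pixels that stayed covered for the whole batch have no recoverable background in neighbouring frames.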

Detection Engines

The app automatically selects the best available engine:

Priority	Engine	Install	Languages	Notes
1	RapidOCR (ONNX PP-OCR)	`pip install rapidocr`	100+	4-5x faster than PaddleOCR, leak-free (default)
2	PaddleOCR (PP-OCRv5)	`pip install "paddleocr>=3.0.0"`	106	High accuracy reference implementation
3	Surya	`pip install surya-ocr`	90+	Layout-aware (GPL)
4	EasyOCR	`pip install easyocr`	80+	Legacy fallback
5	OpenCV fallback	Built-in	Any	Threshold-based
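
One plausible way to express the priority table above is to probe for each package and fall back to the next. The sketch below is an assumption about the general shape of such a chain, not the application's selection logic: `pick_engine` and `ENGINE_MODULES` are hypothetical names, and the import names are guesses that may differ between package versions.

```python
import importlib.util

# Priority order from the table above. Import names are assumptions.
ENGINE_MODULES = [
    ("rapidocr", "rapidocr"),
    ("paddleocr", "paddleocr"),
    ("surya", "surya"),
    ("easyocr", "easyocr"),
]


def _installed(module):
    """True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(module) is not None


def pick_engine(is_available=_installed, allow_gpl=False):
    """Return the first usable engine name, or "opencv" as the last resort."""
    for name, module in ENGINE_MODULES:
        if name == "surya" and not allow_gpl:
            continue  # Surya is GPL-licensed and described as opt-in
        if is_available(module):
            return name
    return "opencv"
```

Probing with `find_spec` avoids importing heavy OCR packages just to learn whether they are present; passing a custom `is_available` callable keeps the function easy to test.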

CLI Usage

Process files from the command line:

python -m backend.processor -i input.mp4 -o output.mp4 -m lama --lang en --crf 20

Flag	Description	Default
`-i`, `--input`	Input file path	Required
`-o`, `--output`	Output file path	Required
`--pattern`	Glob pattern for batch (e.g. `inputs/*.mp4`)	-
`--out-dir`	Output directory for batch mode	-
`--config`	JSON config overlay	-
`--preset NAME`	Apply a built-in or user preset by name	-
`--list-presets`	List every preset and exit	-
`-m`, `--mode`	Algorithm (sttn/lama/propainter/auto)	sttn
`--codec`	Output codec (h264/h265/av1)	h264
`-g`, `--gpu`	GPU device ID (-1 for CPU)	0
`-l`, `--lang`	Detection language	en
`--crf`	Output quality (15-35, lower=better)	23
`--skip-detection`	Use manual region only	Off
`--fast`	LAMA fast mode	Off
`--no-audio`	Strip audio	Off
`--single-audio`	Mux only first audio stream	Off
`--loudnorm <LUFS>`	EBU R128 loudness target (0 disables)	0
`--frame-skip N`	Reuse mask for N frames (0=every frame)	0
`--mask-dilate N`	Expand masks by N pixels	8
`--no-hw-encode`	Force software encoding	Off
`--decode-accel`	HW decode hint (off/auto/d3d11/vaapi/mfx)	off
`--keep-chyrons`	Leave persistent text (logos / lower-thirds)	Off
`--keep-subtitles`	Leave dialogue subtitles	Off
`--karaoke-grouping`	Fuse per-syllable boxes on the same line	Off
`--quality-report`	Compute PSNR/SSIM after each run	Off
`--quality-sheet`	Side-by-side comparison PNG	Off
`--validate-config`	Print resolved config and exit	Off
`--skip-existing`	Skip files whose output already exists	Off
`--no-prefetch`	Disable worker-thread frame prefetcher	Off
`--json-log PATH`	Append a structured JSON-line log	-
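
For scripted batch runs, the flags above can be assembled from Python. The helper below is a hypothetical convenience wrapper, not part of the project; it uses only flags listed in the table, and it assumes that `--pattern` and `--out-dir` are meant to be combined and that `--loudnorm` accepts a negative number as its value.

```python
import sys


def build_command(pattern, out_dir, mode="sttn", codec="h264", crf=23, loudnorm=0):
    """Assemble an argv list for a batch run of backend.processor."""
    if not 15 <= crf <= 35:
        raise ValueError("crf must be within the documented 15-35 range")
    command = [
        sys.executable, "-m", "backend.processor",
        "--pattern", pattern,
        "--out-dir", out_dir,
        "--mode", mode,
        "--codec", codec,
        "--crf", str(crf),
        "--skip-existing",
    ]
    if loudnorm:
        # 0 disables normalisation, so the flag is only added for a real target
        command += ["--loudnorm", str(loudnorm)]
    return command
```

Pass the result to `subprocess.run(command, check=True)` from the repository root so that the `backend` package is importable, for example with `build_command("inputs/*.mp4", "output", mode="auto", loudnorm=-14)`.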

Configuration

Settings are stored in %APPDATA%\VideoSubtitleRemoverPro\settings.json and persist across sessions.
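
A script that needs to read these settings could do so along the following lines. This is a sketch under assumptions: the key names in `DEFAULTS` are hypothetical (the real schema is not documented here), and merging stored values over defaults is only one simple way to backfill keys that an older file lacks.

```python
import json
import os
from pathlib import Path

# Hypothetical subset of keys. Values mirror the Advanced Settings defaults.
DEFAULTS = {"schema_version": 1, "crf": 23, "mask_dilate": 8, "mask_feather": 4}


def settings_path(appdata=None):
    """Locate settings.json under %APPDATA% (or an explicit base directory)."""
    base = appdata or os.environ.get("APPDATA") or str(Path.home())
    return Path(base) / "VideoSubtitleRemoverPro" / "settings.json"


def load_settings(path):
    """Read settings.json, backfilling defaults for missing or unreadable data."""
    try:
        stored = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    return {**DEFAULTS, **stored}
```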

Advanced Settings

Setting	Description	Default	Range
Neighbor Stride	STTN temporal window	10	5-30
Reference Length	STTN reference frames	10	5-30
Max Load Frames	Batch size	30	10-100
CRF Quality	Output quality (lower=better)	23	15-35
Output Codec	H.264 / H.265 / AV1	h264	h264/h265/av1
Frame Skip	Reuse detection mask for N frames	0	0-10
Mask Dilate	Expand detected regions (px)	8	0-20
Mask Feather	Soft alpha-blend at boundary (px)	4	0-15
TBE Coverage	Min frames a pixel must be unmasked to trust its exposure	3	1-10
HW Encoding	Use NVENC/QSV/AMF if available	On	On/Off
HW Decode Hint	cv2 HW-accel hint with software fallback	off	off/auto/d3d11/vaapi/mfx
Loudness Target	EBU R128 LUFS target (0 = off)	0	0 or -70..-5
Multi-track Audio	Pass through every audio stream	On	On/Off
Quality Sheet	Side-by-side PNG next to output	Off	On/Off
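
To make the Mask Feather setting concrete, the sketch below softens a hard mask with a separable Gaussian blur and uses the result as an alpha map when compositing the inpainted pixels. It illustrates the general technique only: the function names are hypothetical, and the choice of sigma = radius / 2 is an assumption, not the application's value.

```python
import numpy as np


def gaussian_kernel(radius):
    """1-D Gaussian kernel of width 2 * radius + 1, normalised to sum to 1."""
    if radius <= 0:
        return np.array([1.0])
    sigma = radius / 2.0  # assumed relation between radius and sigma
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def feather_mask(mask, radius=4):
    """Turn a hard (H, W) mask into a soft alpha map in [0, 1]."""
    kernel = gaussian_kernel(radius)
    padded = np.pad(mask.astype(np.float64), max(radius, 0), mode="edge")
    # Separable blur: along rows first, then along columns
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    both = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
    return np.clip(both, 0.0, 1.0)


def blend(original, inpainted, mask, radius=4):
    """Composite inpainted over original (both (H, W, C)) with a feathered edge."""
    alpha = feather_mask(mask, radius)[..., None]
    mixed = alpha * inpainted + (1.0 - alpha) * original
    return np.rint(mixed).astype(original.dtype)
```

With a radius of 0 the alpha map equals the hard mask, which reproduces a plain cut-and-paste composite; larger radii trade a wider transition band for a less visible seam.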

Troubleshooting

CUDA out of memory

Reduce Max Load Frames in Advanced Settings
Switch to LAMA mode (lower VRAM)
Use CPU mode as fallback

No audio in output

Install FFmpeg: winget install ffmpeg
Ensure "Preserve original audio" is checked

Poor detection accuracy

Try changing the detection language to match your subtitles
Use "Set Region" to manually define the subtitle area
Install PaddleOCR for best detection accuracy

Application won't start

Ensure Python 3.10+ is installed
Delete venv folder and re-run setup
Try Run_VSR_Pro_Debug.bat to keep the console open during startup
Check the log file: %APPDATA%\VideoSubtitleRemoverPro\vsr_pro.log

Log Files

GUI log panel (collapsible, click "Open Log File" for full log)
File log: %APPDATA%\VideoSubtitleRemoverPro\vsr_pro.log (5MB rotating)
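
A 5 MB rotating file log of this kind is typically set up with the standard library's `RotatingFileHandler`. The sketch below shows that pattern under assumptions: the helper name `make_file_logger`, the backup count, and the message format are illustrative and not taken from the application.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def make_file_logger(log_dir, name="vsr_pro", max_bytes=5 * 1024 * 1024, backups=3):
    """Create a logger writing to <log_dir>/vsr_pro.log, rotated at max_bytes."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_dir / "vsr_pro.log",
        maxBytes=max_bytes,
        backupCount=backups,  # assumed value; older files get .1, .2, ... suffixes
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```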

Project Structure

VideoSubtitleRemover/
|-- VideoSubtitleRemover.py   # Main GUI application
|-- backend/
|   |-- __init__.py           # Module exports
|   |-- processor.py          # Core processing (detection + inpainting + mux)
|   |-- presets.py            # Shared preset library (GUI + CLI)
|   `-- model_hashes.py       # Vendored SHA-256 weight hashes
|-- docs/
|   `-- architecture.md       # Pipeline map for new contributors
|-- ROADMAP.md                # Shipped log + ordered backlog + research bench
|-- TODO.md                   # Active checklist (single source of truth)
|-- RESEARCH_FEATURE_PLAN.md  # Audit companion (historical analysis)
|-- setup.py                  # First-time environment setup
|-- Run_VSR_Pro.bat           # Windows launcher
|-- Run_VSR_Pro_Debug.bat     # Windows launcher with a visible console
|-- build_exe.bat             # PyInstaller build script
|-- requirements.txt          # Python dependencies
|-- tests/                    # Focused regression coverage for hardened paths
|-- .github/workflows/
|   `-- build.yml             # CI/CD release workflow + pip-audit
|-- assets/                   # Application assets
|-- models/                   # AI model weights (auto-downloaded)
`-- output/                   # Default output location

See docs/architecture.md for a walkthrough of the detect -> tracker -> mask -> TBE -> refine -> mux pipeline and the "add a new feature" checklist.

Credits

Original project: YaoFANGUK/video-subtitle-remover
LaMa inpainting: simple-lama-inpainting
EasyOCR: JaidedAI/EasyOCR
STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting
ProPainter: sczhou/ProPainter

License

This project is licensed under the MIT License.

Video Subtitle Remover Pro -- Built by SysAdminDoc

Report Bug | Request Feature
