vo_postprocess: add overlay module (native alpha blending, 9 codecs) #495
Open
benroeder wants to merge 1 commit into CESNET:master
## Summary
New `overlay` postprocessor that blends a PAM image onto video in its
native pixel format — no intermediate RGBA conversion, no external
converter. Codecs: RGBA, RGB, UYVY, YUYV, v210, R10k, R12L, Y416, RG48,
I420.
```
-p overlay:file=<path>[:position=<pos>][:custom_x=<x>:custom_y=<y>]
[:soft_edge=<n>][:scale=<W>x<H>|frame[:scale_filter=<f>]]
[:blend_threads=<N>][:perf]
```
`overlay:help` lists every keyword.
## Design
- 16-bit RGBA internal; blend functions do conversion + blend in one
pass via the existing `RGB_TO_Y/CB/CR` macros from `color_space.h`.
- YCbCr-space blending for YUV destinations (Apple TN2307).
- Async hot-reload: file watcher kicks a worker; postprocess harvests
on the next frame. Adds `task_is_done()` to `utils/worker.{h,cpp}`.
- Per-row blend parallelism via `task_run_parallel`, threshold-gated
at 500k pixels. Default `blend_threads = min(ncpu, 8)`.
- Oversized overlay: `overlay_calc_rect` returns `src_x`/`src_y` so
centred/right-anchored oversized overlays show the correct slice
instead of being clamped to top-left.
- `scale=frame`: re-scales the overlay whenever the source resolution
changes. Per-frame dim check; ~1 frame of stale overlay during
renegotiation; postprocess never blocks on the rescale.
- Multiple `overlay:` instances chain via comma in one `--postprocess`.
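The oversized-overlay rule above can be sketched per axis. This is a minimal illustration, not the PR's `overlay_calc_rect`: the `span` struct, the `anchor` enum, and `place_axis` are hypothetical names, and the sketch assumes the anchor is one of start, centre, or end.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the PR's overlay_calc_rect. */
enum anchor { ANCHOR_START, ANCHOR_CENTER, ANCHOR_END };

struct span {
    int dst; /* first frame pixel covered by the overlay */
    int src; /* first overlay pixel that is actually shown */
    int len; /* number of pixels to blend along this axis */
};

/* Place an overlay of ovl_len pixels on a frame axis of frame_len pixels. */
static struct span place_axis(int frame_len, int ovl_len, enum anchor a)
{
    assert(frame_len > 0 && ovl_len > 0);

    int off = 0; /* signed offset of the overlay origin in the frame */
    if (a == ANCHOR_CENTER) {
        off = (frame_len - ovl_len) / 2;
    } else if (a == ANCHOR_END) {
        off = frame_len - ovl_len;
    }

    struct span s;
    if (off < 0) {
        /* Oversized: start at the frame edge and skip into the overlay. */
        s.dst = 0;
        s.src = -off;
    } else {
        s.dst = off;
        s.src = 0;
    }

    /* Clip the visible length to both the overlay and the frame. */
    s.len = ovl_len - s.src;
    if (s.len > frame_len - s.dst) {
        s.len = frame_len - s.dst;
    }
    return s;
}
```

Calling it once for x and once for y yields the destination origin, the source offset, and the clipped size. A centred 2000-pixel-wide overlay on a 1920-pixel frame starts reading at source column 40 rather than column 0.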
## Performance
I420 1080p overlay, 4K30: **3.84 ms → 0.71 ms** (5.4×) with 8 blend
threads. Fusion of the Y/UV passes halves DRAM traffic; threading
provides the rest.
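The threading side can be sketched as a gate plus a row split. The 500k-pixel threshold and the `min(ncpu, 8)` default come from the description; the function names, the strict `<` comparison, and counting pixels over the blended region are assumptions made for illustration. The actual dispatch through `task_run_parallel` is not shown.

```c
#include <assert.h>

#define MIN_PARALLEL_PIXELS 500000 /* gate value quoted in the description */
#define MAX_DEFAULT_THREADS 8

/* Default worker count: min(ncpu, 8). */
static int default_blend_threads(int ncpu)
{
    assert(ncpu >= 1);
    return ncpu < MAX_DEFAULT_THREADS ? ncpu : MAX_DEFAULT_THREADS;
}

/* Workers used for a w x h blend region (assumed gate: below the
 * threshold, blend on the calling thread only). */
static int workers_for_region(int w, int h, int threads)
{
    assert(w >= 0 && h >= 0 && threads >= 1);
    if ((long long)w * h < MIN_PARALLEL_PIXELS) {
        return 1;
    }
    return threads < h ? threads : h; /* never more workers than rows */
}

/* Rows [*first, *first + *count) handled by worker i of n. The remainder
 * rows are given one each to the first (h % n) workers. */
static void worker_rows(int h, int n, int i, int *first, int *count)
{
    assert(n >= 1 && i >= 0 && i < n);
    int base = h / n;
    int extra = h % n;
    *count = base + (i < extra ? 1 : 0);
    *first = i * base + (i < extra ? i : extra);
}
```

Splitting by whole rows keeps each worker's writes in a disjoint memory range, so the blend itself needs no locking.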
## Test plan
- Unit (`bin/run_tests`): per-codec correctness, PAM variants, config
parser, layout/snap, soft edge, watcher, scaler.
- E2E (`test/run_overlay_e2e.sh`, 19 assertions): per-codec dump
bytes, threaded-blend parity (RGBA + I420), dual-overlay chain,
oversized centre, `scale=frame` stretch.
- Optional RapidCheck (`./ext-deps/bootstrap_rapidcheck.sh &&
./configure --enable-rapidcheck && make rapidcheck-tests`): 33
properties × 100 cases, covers blend identity/monotonicity/no-overrun,
layout, soft edge, watcher.
- Visual: `test/run_overlay_visual.sh` (all codecs at HD + 4K),
`test/run_scale_frame_visual.sh` (resolution-tracking demo).
- Benchmarks: `test/benchmark_overlay_{matrix,scale}.sh`.
## Known limitations
- BT.2020 not supported (codebase-wide gap in `get_color_coeffs`).
- PAM has no gamma metadata; transfer-function correction ignored
(matches `pixfmt_conv.c`).
- Straight alpha only.
- `scale=frame` stretches without preserving aspect ratio.
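For reference, "straight alpha" means the overlay's colour channels are not premultiplied by alpha, so the blend multiplies the source by alpha itself. A minimal per-channel sketch at the 16-bit depth used internally follows; `blend16` is a hypothetical helper, not a function from `alpha_blend.c`, and the rounding choice is an assumption.

```c
#include <stdint.h>

/* Straight-alpha "over" for one 16-bit channel:
 *   out = (src * a + dst * (65535 - a)) / 65535, rounded to nearest.
 * Hypothetical helper for illustration only. */
static uint16_t blend16(uint16_t src, uint16_t dst, uint16_t alpha)
{
    uint64_t num = (uint64_t)src * alpha
                 + (uint64_t)dst * (65535u - alpha);
    return (uint16_t)((num + 32767u) / 65535u);
}
```

A premultiplied source would instead use `out = src + dst * (65535 - a) / 65535`. Feeding premultiplied data through the straight-alpha formula darkens semi-transparent edges, which is why the distinction is listed as a limitation.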
## Files
- `src/vo_postprocess/overlay.c`
- `src/utils/alpha_blend.{c,h}` and
`src/utils/overlay_{config,layout,pam,scale,soft_edge,watch}.{c,h}`
- `src/utils/worker.{h,cpp}` — `+task_is_done()`
- `test/test_*` (unit + property), `test/run_overlay_e2e.sh`,
benchmarks, visual demos
- `ext-deps/bootstrap_rapidcheck.sh`, `configure.ac`, `Makefile.in`
— opt-in `--enable-rapidcheck` path
No upstream files modified outside `worker.{h,cpp}`, `configure.ac`,
`Makefile.in`, `.gitignore`.
Adds an overlay postprocessor that blends a PAM image onto video in
its native pixel format. Supports RGBA, RGB, UYVY, YUYV, v210, R10k,
R12L, Y416, RG48, and I420 - conversion + blend in one pass per
codec, no intermediate format. Hot-reload via mtime/size watcher,
optional libswscale resize (scale=WxH or scale=frame for resolution
tracking), soft-edge alpha fade, pthread-pool blend parallelism.
Multiple instances chain via comma in --postprocess.
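An mtime/size check of the kind mentioned above can be sketched with POSIX stat(). The struct and function names here are hypothetical and the PR's overlay_watch code may differ; the sketch only shows the comparison, not the worker hand-off.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Last observed state of the overlay file (hypothetical struct). */
struct file_stamp {
    bool   valid; /* false until the first successful stat() */
    time_t mtime;
    off_t  size;
};

/* Returns true when the file's mtime or size differs from *last (or on the
 * first successful call) and records the new values in *last. */
static bool file_changed(const char *path, struct file_stamp *last)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        return false; /* missing or unreadable: keep the current overlay */
    }
    bool changed = !last->valid
                || st.st_mtime != last->mtime
                || st.st_size != last->size;
    last->valid = true;
    last->mtime = st.st_mtime;
    last->size  = st.st_size;
    return changed;
}
```

Because st_mtime has one-second resolution, a rewrite of identical size within the same second goes unnoticed by this sketch; comparing the size as well narrows that window but does not close it.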
Usage:
overlay:help prints the keyword reference.
Implementation:
macros + get_color_coeffs() so --color-601 still works.
task_is_done() (added to utils/worker.{h,cpp}) and harvests on the
next frame.
500k pixels. Default blend_threads = min(ncpu, 8).
region maps to the correct slice (centre/right) instead of being
clamped to top-left.
frame in overlay_postprocess (not the reconfigure callback - some
pipeline paths skip it).
Tests:
config parser, layout, soft edge, watcher, scaler.
verification, threaded-blend parity (RGBA and I420), dual-overlay
chain, oversized-centre, scale=frame stretch.
--enable-rapidcheck + make rapidcheck-tests): 33 properties x 100
cases.
Performance: for an I420 1080p overlay at 4K30, the blend takes
3.84 ms single-threaded and 0.71 ms with 8 threads (5.4x). Other
codecs scale similarly under the same MIN_PARALLEL_PIXELS gate.
Limitations:
(matches pixfmt_conv.c).
Files:
No upstream files modified outside worker.{h,cpp}, configure.ac,
Makefile.in, .gitignore.