vo_postprocess: add overlay module (native alpha blending, 9 codecs)#495

Open
benroeder wants to merge 1 commit into CESNET:master from benroeder:overlay-native-blend

Adds an overlay postprocessor that blends a PAM image onto video in
its native pixel format. Supports RGBA, RGB, UYVY, YUYV, v210, R10k,
R12L, Y416, RG48, and I420 - conversion + blend in one pass per
codec, no intermediate format. Hot-reload via mtime/size watcher,
optional libswscale resize (scale=WxH or scale=frame for resolution
tracking), soft-edge alpha fade, pthread-pool blend parallelism.
Multiple instances chain via comma in --postprocess.
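
The mtime/size watcher can be pictured roughly as below. This is a minimal sketch assuming a POSIX `stat()` poll; `struct file_watch` and `file_watch_changed` are hypothetical names for illustration, not the identifiers used in `overlay_watch.{c,h}`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical watcher state; names are illustrative only. */
struct file_watch {
    const char *path;
    time_t      mtime;  /* modification time seen on the last poll */
    off_t       size;   /* size in bytes seen on the last poll */
    bool        valid;  /* false until the first successful stat() */
};

/* Returns true when mtime or size differs from the recorded values
 * (or on the first successful poll) and records the new values.
 * A file that cannot be stat()ed reports "no change", so the caller
 * keeps the overlay it already has. That policy is an assumption. */
static bool file_watch_changed(struct file_watch *w)
{
    assert(w != NULL && w->path != NULL);
    struct stat st;
    if (stat(w->path, &st) != 0) {
        return false;
    }
    bool changed = !w->valid || st.st_mtime != w->mtime || st.st_size != w->size;
    w->mtime = st.st_mtime;
    w->size  = st.st_size;
    w->valid = true;
    return changed;
}
```

A caller would poll this once per frame (or less often) and start the asynchronous reload when it returns true.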

Usage:

-p overlay:file=PATH[:position=POS][:custom_x=X:custom_y=Y]
          [:soft_edge=N][:scale=WxH|frame[:scale_filter=F]]
          [:blend_threads=N][:perf]

overlay:help prints the keyword reference.

Implementation:

  • 16-bit RGBA internal. Blend functions use the existing RGB_TO_Y/CB/CR
    macros + get_color_coeffs() so --color-601 still works.
  • YCbCr-space blend for YUV destinations.
  • Async reload off the postprocess thread; postprocess polls
    task_is_done() (added to utils/worker.{h,cpp}) and harvests on the
    next frame.
  • Per-row blend dispatched via task_run_parallel, threshold-gated at
    500k pixels. Default blend_threads = min(ncpu, 8).
  • Oversized overlay returns src_x/src_y on the rect so the visible
    region maps to the correct slice (centre/right) instead of being
    clamped to top-left.
  • scale=frame re-renders on source resolution changes, detected per
    frame in overlay_postprocess (not the reconfigure callback - some
    pipeline paths skip it).
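
As an illustration of the straight-alpha blend from a 16-bit RGBA overlay, here is a minimal sketch for an 8-bit RGBA destination. The function names, the rounding, and the choice to leave destination alpha untouched are assumptions made for the example; the real per-codec routines live in `alpha_blend.{c,h}` and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blend one 16-bit straight-alpha sample onto an 8-bit destination
 * sample: out = src * a + dst * (1 - a), computed in 16-bit range. */
static uint8_t blend_sample_8(uint16_t src16, uint8_t dst8, uint16_t alpha16)
{
    const uint64_t max = 65535;
    uint64_t dst16 = (uint64_t)dst8 * 257;  /* widen 0..255 to 0..65535 */
    uint64_t out16 = (src16 * (uint64_t)alpha16
                      + dst16 * (max - alpha16) + max / 2) / max;
    return (uint8_t)((out16 + 128) / 257);  /* narrow back, rounded */
}

/* Blend `count` overlay pixels (16-bit RGBA) onto an 8-bit RGBA row. */
static void blend_row_rgba8(uint8_t *dst, const uint16_t *ovl, size_t count)
{
    assert(dst != NULL && ovl != NULL);
    for (size_t i = 0; i < count; i++) {
        const uint16_t *s = ovl + 4 * i;
        uint8_t *d = dst + 4 * i;
        d[0] = blend_sample_8(s[0], d[0], s[3]);
        d[1] = blend_sample_8(s[1], d[1], s[3]);
        d[2] = blend_sample_8(s[2], d[2], s[3]);
        /* d[3] (destination alpha) is left as-is: an assumption */
    }
}
```

For YUV destinations the same weighting would be applied after converting the overlay sample to Y/Cb/Cr, which is where the existing conversion macros come in.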

Tests:

  • Unit (bin/run_tests): per-codec blend correctness, PAM loader,
    config parser, layout, soft edge, watcher, scaler.
  • E2E (test/run_overlay_e2e.sh): 19 assertions - per-codec exact-byte
    verification, threaded-blend parity (RGBA and I420), dual-overlay
    chain, oversized-centre, scale=frame stretch.
  • Optional rapidcheck (./ext-deps/bootstrap_rapidcheck.sh +
    --enable-rapidcheck + make rapidcheck-tests): 33 properties x 100
    cases.
  • Visual: test/run_overlay_visual.sh, test/run_scale_frame_visual.sh.
  • Benchmarks: test/benchmark_overlay_*.sh.

Performance: blending a 1080p overlay onto 4K30 I420 video takes
3.84 ms single-threaded and 0.71 ms with 8 threads (5.4x). Other codecs
scale similarly under the same MIN_PARALLEL_PIXELS gate.
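
The gate and the per-row split can be sketched as follows. `MIN_PARALLEL_PIXELS` and the `min(ncpu, 8)` default come from the description above; the helper names, the exact boundary comparison, and the even row partition are assumptions for illustration and do not reproduce the `task_run_parallel` call itself.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_PARALLEL_PIXELS 500000  /* gate value quoted in the description */
#define MAX_DEFAULT_THREADS 8

/* Default thread count: min(ncpu, 8), but never less than 1. */
static int default_blend_threads(int ncpu)
{
    if (ncpu < 1) {
        return 1;
    }
    return ncpu < MAX_DEFAULT_THREADS ? ncpu : MAX_DEFAULT_THREADS;
}

/* Number of workers for a width x height blend: small blends stay
 * on the calling thread, large ones use up to blend_threads workers. */
static int blend_worker_count(int width, int height, int blend_threads)
{
    long long pixels = (long long)width * height;
    if (blend_threads < 2 || pixels < MIN_PARALLEL_PIXELS) {
        return 1;
    }
    return blend_threads < height ? blend_threads : height;
}

/* Half-open row range [first, last) handled by worker idx of n. */
static void worker_rows(int height, int n, int idx, int *first, int *last)
{
    assert(n > 0 && idx >= 0 && idx < n);
    assert(first != NULL && last != NULL);
    *first = (int)((long long)height * idx / n);
    *last  = (int)((long long)height * (idx + 1) / n);
}
```

With these helpers a 1920x1080 blend and `blend_threads=8` gives eight workers of 135 rows each, while a 640x360 blend stays single-threaded.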

Limitations:

  • BT.2020 not supported (codebase-wide gap in get_color_coeffs).
  • PAM has no gamma metadata; transfer-function correction ignored
    (matches pixfmt_conv.c).
  • Straight alpha only.
  • scale=frame stretches without preserving aspect ratio.

Files:

  • src/vo_postprocess/overlay.c
  • src/utils/alpha_blend.{c,h}
  • src/utils/overlay_{config,layout,pam,scale,soft_edge,watch}.{c,h}
  • src/utils/worker.{h,cpp} (+task_is_done)
  • test/* (unit, property, e2e, visual, benchmark)
  • ext-deps/bootstrap_rapidcheck.sh, configure.ac, Makefile.in (opt-in)

No upstream files modified outside worker.{h,cpp}, configure.ac,
Makefile.in, .gitignore.

## Summary

New `overlay` postprocessor that blends a PAM image onto video in the
video's native pixel format, with no intermediate RGBA conversion of
the frame and no external converter. Codecs: RGBA, RGB, UYVY, YUYV,
v210, R10k, R12L, Y416, RG48, I420.

```
-p overlay:file=<path>[:position=<pos>][:custom_x=<x>:custom_y=<y>]
          [:soft_edge=<n>][:scale=<W>x<H>|frame[:scale_filter=<f>]]
          [:blend_threads=<N>][:perf]
```

`overlay:help` lists every keyword.
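
To make the `scale=<W>x<H>|frame` grammar concrete, here is a small sketch of how the value part could be parsed. `parse_scale_value`, `struct scale_opt`, and the rejection rules are hypothetical; the actual parser is in `overlay_config.{c,h}` and may accept or reject different inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum scale_mode { SCALE_FIXED, SCALE_FRAME };

struct scale_opt {
    enum scale_mode mode;
    int width;   /* only meaningful for SCALE_FIXED */
    int height;  /* only meaningful for SCALE_FIXED */
};

/* Parses the text after "scale=". Accepts "frame" or "<W>x<H>" with
 * positive integers; returns false for anything else. */
static bool parse_scale_value(const char *val, struct scale_opt *out)
{
    assert(val != NULL && out != NULL);
    if (strcmp(val, "frame") == 0) {
        *out = (struct scale_opt){ SCALE_FRAME, 0, 0 };
        return true;
    }
    int w = 0, h = 0;
    char trailing = '\0';
    /* Exactly two conversions must succeed; a third means trailing
     * junk such as "640x360px". */
    if (sscanf(val, "%dx%d%c", &w, &h, &trailing) != 2 || w <= 0 || h <= 0) {
        return false;
    }
    *out = (struct scale_opt){ SCALE_FIXED, w, h };
    return true;
}
```

So `scale=1280x720` would yield a fixed target size, and `scale=frame` defers the size to whatever resolution the video currently has.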

## Design

- 16-bit RGBA internal; blend functions do conversion + blend in one
  pass via the existing `RGB_TO_Y/CB/CR` macros from `color_space.h`.
- YCbCr-space blending for YUV destinations (Apple TN2307).
- Async hot-reload: file watcher kicks a worker; postprocess harvests
  on the next frame. Adds `task_is_done()` to `utils/worker.{h,cpp}`.
- Per-row blend parallelism via `task_run_parallel`, threshold-gated
  at 500k pixels. Default `blend_threads = min(ncpu, 8)`.
- Oversized overlay: `overlay_calc_rect` returns `src_x`/`src_y` so
  centred/right-anchored oversized overlays show the correct slice
  instead of being clamped to top-left.
- `scale=frame`: re-scales the overlay whenever the source resolution
  changes. Dimensions are checked on every frame; the overlay may be
  stale for ~1 frame during renegotiation, and postprocess never blocks
  on the rescale.
- Multiple `overlay:` instances chain via comma in one `--postprocess`.
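
The oversized-overlay behaviour can be sketched per axis: when the overlay is larger than the frame, the negative placement offset becomes a source offset instead of being clamped to zero. The names below (`place_axis`, `calc_rect_sketch`, `enum align`) are hypothetical and only illustrate the idea; the signature of the real `overlay_calc_rect` is not shown in this description.

```c
#include <assert.h>

enum align { ALIGN_START, ALIGN_CENTRE, ALIGN_END };

struct axis_span { int dst; int src; int len; };

struct rect { int dst_x, dst_y, src_x, src_y, w, h; };

/* Place an overlay of ovl_len pixels inside frame_len pixels on one axis. */
static struct axis_span place_axis(int frame_len, int ovl_len, enum align a)
{
    assert(frame_len >= 0 && ovl_len >= 0);
    int pos = 0;  /* ALIGN_START */
    if (a == ALIGN_CENTRE) {
        pos = (frame_len - ovl_len) / 2;
    } else if (a == ALIGN_END) {
        pos = frame_len - ovl_len;
    }
    struct axis_span s;
    s.src = pos < 0 ? -pos : 0;  /* skip the part that falls off-frame */
    s.dst = pos < 0 ? 0 : pos;
    s.len = ovl_len - s.src;
    if (s.len > frame_len - s.dst) {
        s.len = frame_len - s.dst;  /* clip to the frame */
    }
    return s;
}

/* Combine both axes into a destination rect plus source offsets. */
static struct rect calc_rect_sketch(int fw, int fh, int ow, int oh,
                                    enum align ha, enum align va)
{
    struct axis_span x = place_axis(fw, ow, ha);
    struct axis_span y = place_axis(fh, oh, va);
    struct rect r = { x.dst, y.dst, x.src, y.src, x.len, y.len };
    return r;
}
```

For a 3840x2160 overlay centred on a 1920x1080 frame this gives `src_x = 960`, `src_y = 540`, so the middle of the overlay is shown rather than its top-left corner.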

## Performance

I420 1080p overlay, 4K30: **3.84 ms → 0.71 ms** (5.4×) with 8 blend
threads. Fusion of the Y/UV passes halves DRAM traffic; threading
provides the rest.
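
For scale, the quoted figures work out as follows, taking 4K30 to mean 30 frames per second (an assumption about the notation):

$$
S = \frac{3.84\ \text{ms}}{0.71\ \text{ms}} \approx 5.4,
\qquad
T_{\text{frame}} = \frac{1000\ \text{ms}}{30} \approx 33.3\ \text{ms}
$$

$$
\frac{3.84}{33.3} \approx 11.5\%,
\qquad
\frac{0.71}{33.3} \approx 2.1\%
$$

So the blend drops from roughly 11.5% to roughly 2.1% of the per-frame time budget.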

## Test plan

- Unit (`bin/run_tests`): per-codec correctness, PAM variants, config
  parser, layout/snap, soft edge, watcher, scaler.
- E2E (`test/run_overlay_e2e.sh`, 19 assertions): per-codec dump
  bytes, threaded-blend parity (RGBA + I420), dual-overlay chain,
  oversized centre, `scale=frame` stretch.
- Optional RapidCheck (`./ext-deps/bootstrap_rapidcheck.sh &&
  ./configure --enable-rapidcheck && make rapidcheck-tests`): 33
  properties × 100 cases, covers blend identity/monotonicity/no-overrun,
  layout, soft edge, watcher.
- Visual: `test/run_overlay_visual.sh` (all codecs at HD + 4K),
  `test/run_scale_frame_visual.sh` (resolution-tracking demo).
- Benchmarks: `test/benchmark_overlay_{matrix,scale}.sh`.
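
As a rough picture of what blend properties such as identity and monotonicity assert, the sketch below checks them exhaustively for a simplified 8-bit straight-alpha blend. It is plain C with a brute-force loop rather than RapidCheck, `blend8` is a stand-in and not one of the PR's blend functions, and the range check here is a value-range property, which may not be what "no-overrun" means in the actual suite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified 8-bit straight-alpha blend used only for this sketch. */
static uint8_t blend8(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Returns the number of (src, dst) pairs that violate any property:
 *  - identity: alpha 0 yields dst, alpha 255 yields src
 *  - range: the result never leaves [min(src,dst), max(src,dst)]
 *  - monotonicity: raising alpha never moves the result away from src */
static long count_blend_violations(void)
{
    long bad = 0;
    for (int s = 0; s < 256; s++) {
        for (int d = 0; d < 256; d++) {
            int lo = s < d ? s : d;
            int hi = s < d ? d : s;
            bool ok = blend8((uint8_t)s, (uint8_t)d, 0) == d
                   && blend8((uint8_t)s, (uint8_t)d, 255) == s;
            int prev = d;
            for (int a = 0; a < 256 && ok; a++) {
                int out = blend8((uint8_t)s, (uint8_t)d, (uint8_t)a);
                if (out < lo || out > hi) {
                    ok = false;
                }
                if (s >= d ? out < prev : out > prev) {
                    ok = false;
                }
                prev = out;
            }
            if (!ok) {
                bad++;
            }
        }
    }
    return bad;
}
```

A test would then reduce to `assert(count_blend_violations() == 0);`.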

## Known limitations

- BT.2020 not supported (codebase-wide gap in `get_color_coeffs`).
- PAM has no gamma metadata; transfer-function correction ignored
  (matches `pixfmt_conv.c`).
- Straight alpha only.
- `scale=frame` stretches without preserving aspect ratio.

## Files

- `src/vo_postprocess/overlay.c`
- `src/utils/alpha_blend.{c,h}` and
  `src/utils/overlay_{config,layout,pam,scale,soft_edge,watch}.{c,h}`
- `src/utils/worker.{h,cpp}` — `+task_is_done()`
- `test/test_*` (unit + property), `test/run_overlay_e2e.sh`,
  benchmarks, visual demos
- `ext-deps/bootstrap_rapidcheck.sh`, `configure.ac`, `Makefile.in`
  — opt-in `--enable-rapidcheck` path

No upstream files modified outside `worker.{h,cpp}`, `configure.ac`,
`Makefile.in`, `.gitignore`.