Skip to content

[FIX] Sync Rust inputfile array length with num_input_files to fix heap OOB read#2218

Open
NexionisJake wants to merge 1 commit intoCCExtractor:masterfrom
NexionisJake:fix/inputfile-array-oob-mismatch
Open

[FIX] Sync Rust inputfile array length with num_input_files to fix heap OOB read#2218
NexionisJake wants to merge 1 commit intoCCExtractor:masterfrom
NexionisJake:fix/inputfile-array-oob-mismatch

Conversation

@NexionisJake
Copy link

In copy_from_rust() (src/rust/src/common.rs), num_input_files was
computed by filtering empty strings from the inputfiles Vec, but
string_to_c_chars() was called on an unfiltered clone of the same Vec.
This mismatch made the C array length and num_input_files disagree:
switch_to_next_file() could index inputfile[current_file] where
current_file < num_input_files but >= array size, reading one slot past
the end of the allocated array — confirmed by AddressSanitizer
(heap-buffer-overflow at file_functions.c:183).

The same count/size mismatch also caused free_rust_c_string_array() to
reconstruct the Vec with an incorrect capacity, producing heap corruption on
every clean shutdown.

Fix: filter empty strings into a single Vec<String> first, then derive both
num_input_files (filtered.len()) and the C array
(string_to_c_chars(filtered)) from that same source, eliminating the
mismatch entirely.

Fixes #2182

In raising this pull request, I confirm the following (please check boxes):

Reason for this PR:

  • This PR adds new functionality.
  • This PR fixes a bug that I have personally experienced or that a real user has reported and for which a sample
    exists.
  • This PR is porting code from C to Rust.

Sanity check:

  • I have read and understood the contributors
    guide
    .
  • I have checked that another pull request for this purpose does not exist.
  • If the PR adds new functionality, I've added it to the changelog. If it's just a bug fix, I have NOT added it to
    the changelog.
  • I am NOT adding new C code unless it's to fix an existing, reproducible bug.

Repro instructions:

Build CCExtractor with AddressSanitizer enabled:

mkdir build && cd build
cmake ../src -DCMAKE_C_FLAGS="-fsanitize=address" -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address"
make -j$(nproc)

Run against any input file that goes through switch_to_next_file() (i.e. any normal sample):

./ccextractor sample.ts -out=srt

ASAN output before this fix:
==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000007998
READ of size 8 at 0x602000007998
0x602000007998 is located 0 bytes after 8-byte region [0x602000007990,0x602000007998)
    #0 switch_to_next_file  src/lib_ccx/file_functions.c:183

Note: PR #2183 addresses the same issue. This PR applies the same fix with a
cleaner implementation — keeping the original if let Some(ref inputfile)
binding instead of the ref _inputfile + .as_ref().unwrap() pattern used
in #2183. The CI regressions shown in #2183 are pre-existing master failures
unconfirmed as caused by that PR (the Windows CI bot explicitly noted those
tests were already failing on master before the PR).

… num_input_files

The Rust FFI function copy_from_rust() computed num_input_files by filtering
empty strings from the inputfiles Vec, but passed an unfiltered clone to
string_to_c_chars() to build the C inputfile[] array. This mismatch made the
C array length and num_input_files disagree: switch_to_next_file() could index
inputfile[current_file] where current_file < num_input_files but >= array size,
reading one slot past the end of the allocated array — confirmed by
AddressSanitizer (heap-buffer-overflow at file_functions.c:183).

The same count/size mismatch also caused free_rust_c_string_array() to
reconstruct the Vec with an incorrect capacity, producing heap corruption on
every clean shutdown.

Fix: filter empty strings into a single Vec<String> first, then derive both
num_input_files (filtered.len()) and the C array (string_to_c_chars(filtered))
from that same source, eliminating the mismatch entirely.

Fixes CCExtractor#2182

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 03ad9e8...:
Report Name Tests Passed
Broken 10/13
CEA-708 1/14
DVB 4/7
DVD 3/3
DVR-MS 2/2
General 27/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 79/86
Teletext 20/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b..., Last passed: Never
  • ccextractor --out=spupng c83f765c66..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla 85058ad37e..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7f41299cc7..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 03ad9e8...:
Report Name Tests Passed
Broken 10/13
CEA-708 1/14
DVB 3/7
DVD 3/3
DVR-MS 2/2
General 25/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 75/86
Teletext 20/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --out=spupng c83f765c66...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla 85058ad37e..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7f41299cc7..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Security] : Heap OOB Read in switch_to_next_file(), Rust inputfile array shorter than num_input_files

2 participants