fix: tighten _is_image_part/_is_video_part to reject Arrow-unified None keys by mvanhorn · Pull Request #42 · PrimeIntellect-ai/renderers

mvanhorn · 2026-05-16T20:29:32Z

Summary

_is_image_part and _is_video_part in renderers/qwen3_vl.py used 'in' dict membership checks, which return True for any key that is present, even when its value is None. HuggingFace Arrow schema unification gives every sample the union of part keys, with None for the inapplicable ones, so a text-only message was matching the image-part predicate.

Why this matters

The reporter pinpointed _is_image_part returning True for {'text': '...', 'image_url': None, 'video_url': None}, after which downstream image processing crashed on the None URL. The same two helpers are imported by the Qwen3.5 and Kimi-K2.5 renderers, so one fix covers three call sites.
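The failure mode can be reproduced without any dataset library. The sketch below is an illustration only: unify_parts is a hypothetical pure-Python stand-in for Arrow schema unification, is_image_part_old is an assumed shape for the pre-fix predicate (not copied from renderers/qwen3_vl.py), and the URLs are placeholders.

```python
def unify_parts(parts):
    """Hypothetical stand-in for Arrow schema unification.

    Every part receives the union of keys seen across all parts,
    with None filled in for keys the part did not originally have.
    """
    all_keys = []
    for part in parts:
        for key in part:
            if key not in all_keys:
                all_keys.append(key)
    return [{key: part.get(key) for key in all_keys} for part in parts]


def is_image_part_old(part):
    # Assumed pre-fix behavior: checks key presence only, ignores the value.
    return isinstance(part, dict) and "image_url" in part


parts = [
    {"text": "describe this"},
    {"image_url": "https://example.com/cat.png"},
    {"video_url": "https://example.com/clip.mp4"},
]

unified = unify_parts(parts)
text_part = unified[0]

# The text-only part now carries image_url and video_url keys set to None,
# so a membership check classifies it as an image part.
misclassified = is_image_part_old(text_part)
```

After unification, text_part has the same shape as the dict from the report, and misclassified is True even though the part holds no image.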

Changes

renderers/qwen3_vl.py:67-84 - _is_image_part now requires part.get('image_url') (truthy) instead of just 'image_url' in part. Same change for _is_video_part with video_url.
Test additions in same file: parametrized tests for Arrow-unified None case, plain text part, image part with non-None URL, video part with non-None URL.
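A minimal sketch of the tightened predicates, assuming they take a single part dict and return a bool. The function bodies and the CASES table are illustrative and are not the code in renderers/qwen3_vl.py; the four cases mirror the ones listed above, with placeholder URLs.

```python
def is_image_part(part):
    # Require a truthy image_url value, not just the presence of the key.
    return isinstance(part, dict) and bool(part.get("image_url"))


def is_video_part(part):
    # Same tightening for video_url.
    return isinstance(part, dict) and bool(part.get("video_url"))


# (part, expected is_image_part, expected is_video_part)
CASES = [
    # Arrow-unified text part: keys present, values None.
    ({"text": "hi", "image_url": None, "video_url": None}, False, False),
    # Plain text part with no media keys at all.
    ({"text": "hi"}, False, False),
    # Image part with a non-None URL.
    ({"image_url": "https://example.com/cat.png", "video_url": None}, True, False),
    # Video part with a non-None URL.
    ({"image_url": None, "video_url": "https://example.com/clip.mp4"}, False, True),
]

results = [(is_image_part(p), is_video_part(p)) for p, _, _ in CASES]
expected = [(img, vid) for _, img, vid in CASES]
```

One consequence of testing truthiness rather than `is not None`: an empty string or empty dict under image_url is also rejected. That seems desirable here, since such a value would not yield a usable URL either.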

Testing

pytest tests/test_qwen3_vl.py -k "_is_image_part or _is_video_part". All four new cases pass.

Fixes #40

…ne keys When HuggingFace datasets unify message-part schemas across image/video/text samples, every sample carries the union of part keys with None values for the keys that don't apply. _is_image_part and _is_video_part used 'in dict' checks which return True for None values, so a text-only message could match the image-part predicate and trigger downstream image-processing crashes. The fix tightens both predicates to require the corresponding key to be non-None, not just present. Helpers are imported by qwen35 + kimi_k25 so one fix covers three renderers. Fixes PrimeIntellect-ai#40

mvanhorn mentioned this pull request May 16, 2026

Qwen3VLRenderer misclassifies text parts as images when content list has been through HF Dataset round-trip #40

Closed

mvanhorn wants to merge 1 commit into PrimeIntellect-ai:main from mvanhorn:fix/40-tighten-is-image-part-is-video-part-to-reject-arro