Problems about Inference on Video-MME #99

@RRooyyCChheenn

Brilliant work on OneVision-Encoder! 🎉
I'm trying to reproduce the LLaVA-NeXT-Video evaluation results following the instructions in the README.
For the video benchmarks (e.g., Video-MME), I ran:

TASKS="videomme" bash scripts/eval/eval_ov_encoder.sh

However, I noticed this line in the script:

MODEL_PATH="${MODEL_PATH:-trained_model/must_contain_llava_in_name}"
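As far as I understand it, this is the standard shell `${VAR:-default}` expansion, so the placeholder is only used when `MODEL_PATH` is unset or empty. A minimal sketch of that behavior follows; the checkpoint directory `checkpoints/my-llava-ov-encoder` is a hypothetical local path that I made up for illustration, not a released model.

```shell
# Same expansion as in the script: use $MODEL_PATH if it is set and non-empty,
# otherwise fall back to the placeholder default.
unset MODEL_PATH
resolved_default="${MODEL_PATH:-trained_model/must_contain_llava_in_name}"
echo "$resolved_default"

# Hypothetical local checkpoint directory (an assumption, not a released model).
MODEL_PATH="checkpoints/my-llava-ov-encoder"
resolved_override="${MODEL_PATH:-trained_model/must_contain_llava_in_name}"
echo "$resolved_override"
```

If that reading is right, the script could presumably be pointed at a suitable checkpoint by setting `MODEL_PATH` in the environment alongside `TASKS`, assuming nothing earlier in the script overwrites it. What I am missing is the checkpoint itself.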

I've searched through the repository and the Hugging Face organization (lmms-lab-encoder / lmms-lab), but I couldn't find a released model checkpoint whose name contains "llava".
❓ Could you clarify:
Am I misunderstanding the evaluation workflow?
Or is the LLaVA-integrated checkpoint not yet publicly released?
Any guidance would be greatly appreciated! Thanks again for the amazing work. 🙏
