Can't run SmoothQuant Q8 model! #33322

@aayejinpeng

Version

OpenVINO version = 2025.4.0
OpenVINO.genai version = 2025.4.0.0

Help

I ran the following command to get the SmoothQuant Q8 (a8w8) model at /home/perftest/perftool/openvino_2025.4/model/a8w8:

optimum-cli export openvino --model meta-llama/Llama-3.2-1B-Instruct --quant-mode int8 --dataset wikitext2 /home/perftest/perftool/openvino_2025.4/model/a8w8

Then I ran the following command to get the weight-only Q8 model at /home/perftest/perftool/openvino_2025.4/model/WOi8:

optimum-cli export openvino --model meta-llama/Llama-3.2-1B-Instruct --weight-format int8 /home/perftest/perftool/openvino_2025.4/model/WOi8

I can run benchmark_genai with the WOi8 model, but I can't run the a8w8 model.
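For comparison, the two exports differ only in their quantization arguments and output directory. A minimal sketch that builds both invocations as argument lists and prints the difference; the helper name `build_export_cmd` and the constants are hypothetical, and only the flags and paths are taken from the commands above:

```python
import shlex

# Values copied from the two export commands in this report.
MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"
OUT_ROOT = "/home/perftest/perftool/openvino_2025.4/model"


def build_export_cmd(quant_args, out_name):
    """Return an optimum-cli export command as an argument list (hypothetical helper)."""
    return [
        "optimum-cli", "export", "openvino",
        "--model", MODEL_ID,
        *quant_args,
        f"{OUT_ROOT}/{out_name}",
    ]


a8w8_cmd = build_export_cmd(["--quant-mode", "int8", "--dataset", "wikitext2"], "a8w8")
woi8_cmd = build_export_cmd(["--weight-format", "int8"], "WOi8")

# Arguments that appear in only one of the two commands.
only_a8w8 = [arg for arg in a8w8_cmd if arg not in woi8_cmd]
only_woi8 = [arg for arg in woi8_cmd if arg not in a8w8_cmd]

print(shlex.join(a8w8_cmd))
print(shlex.join(woi8_cmd))
print("only in a8w8:", only_a8w8)
print("only in WOi8:", only_woi8)
```

The sketch only assembles and prints the commands; it does not run the export.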

The error is:

python benchmark_genai.py -m /home/perftest/perftool/openvino_2025.4/model/a8w8
openvino runtime version: 2025.4.0-20398-7a975177ff4-releases/2025/4, genai version: 2025.4.0.0-2674-5041b1dc4e5
Traceback (most recent call last):
  File "/home/perftest/perftool/openvino_2025.4/openvino.genai/samples/python/text_generation/benchmark_genai.py", line 85, in <module>
    main()
  File "/home/perftest/perftool/openvino_2025.4/openvino.genai/samples/python/text_generation/benchmark_genai.py", line 53, in main
    pipe = ov_genai.LLMPipeline(models_path, device, scheduler_config=scheduler_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Check 'unregistered_parameters.str().empty()' failed at src/core/src/model.cpp:267:
Model references undeclared parameters: opset1::Parameter beam_idx () -> (i32[?])
opset1::Parameter attention_mask () -> (i64[?,?])
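
To make the failure easier to read, the undeclared parameters can be pulled out of the message text. A small sketch, assuming the message keeps the format shown above; the regular expression and the helper `undeclared_parameters` are illustrative and not part of OpenVINO:

```python
import re

# Error text copied from the RuntimeError above.
ERROR_TEXT = """Model references undeclared parameters: opset1::Parameter beam_idx () -> (i32[?])
opset1::Parameter attention_mask () -> (i64[?,?])"""

# Matches entries such as: opset1::Parameter beam_idx () -> (i32[?])
PARAM_RE = re.compile(
    r"opset1::Parameter\s+(\w+)\s*\(\)\s*->\s*\((\w+)\[([^\]]*)\]\)"
)


def undeclared_parameters(text):
    """Return (name, element type, shape dims) for each parameter in the message."""
    return [
        (name, dtype, shape.split(","))
        for name, dtype, shape in PARAM_RE.findall(text)
    ]


params = undeclared_parameters(ERROR_TEXT)
for name, dtype, shape in params:
    print(f"{name}: {dtype}, rank {len(shape)}")
```

Both names, beam_idx and attention_mask, are inputs that a stateful text-generation model exported through optimum-intel typically exposes, so comparing them with the inputs the a8w8 model actually declares may help narrow this down.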
