Description
Describe the bug
What the bug is and how to reproduce it, preferably with screenshots.
My code is:
from swift.llm import SglangEngine
engine = SglangEngine(model_path)
Error output:
[INFO:swift] Setting ROOT_IMAGE_DIR: None. You can adjust this hyperparameter through the environment variable: ROOT_IMAGE_DIR.
[INFO:swift] Setting QWENVL_BBOX_FORMAT: legacy. You can adjust this hyperparameter through the environment variable: QWENVL_BBOX_FORMAT.
INFO 12-18 14:04:04 __init__.py:207] Automatically detected platform cuda.
INFO 12-18 14:04:04 __init__.py:207] Automatically detected platform cuda.
WARNING:sglang.srt.server_args:Attention backend not explicitly specified. Use flashinfer backend by default.
INFO 12-18 14:04:07 __init__.py:207] Automatically detected platform cuda.
INFO 12-18 14:04:07 __init__.py:207] Automatically detected platform cuda.
INFO 12-18 14:04:07 __init__.py:207] Automatically detected platform cuda.
INFO 12-18 14:04:07 __init__.py:207] Automatically detected platform cuda.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:00<00:00, 1.87it/s]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:00<00:00, 2.63it/s]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:00<00:00, 2.48it/s]
Capturing batches (bs=1 avail_mem=3.94 GB): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 12.86it/s]
Running FULL PARSE ...
0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/chenboyang/Document/OCRFlux-main/OCRFlux-main-new/ocrflux/all_swift.py", line 164, in parse
resp_list = engine.infer(infer_reqs, request_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/sglang_engine.py", line 183, in infer
return super().infer(infer_requests, request_config, metrics, template=template, use_tqdm=use_tqdm)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/infer_engine.py", line 191, in infer
return self._batch_infer_stream(tasks, request_config.stream, use_tqdm, metrics)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/infer_engine.py", line 147, in _batch_infer_stream
return loop.run_until_complete(self.batch_run(new_tasks))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/infer_engine.py", line 115, in batch_run
return await asyncio.gather(*tasks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/infer_engine.py", line 132, in _new_run
res = await task
^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/sglang_engine.py", line 219, in infer_async
return await self._infer_full_async(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/conda_envs/sft_sglang/lib/python3.11/site-packages/swift/llm/infer/infer_engine/sglang_engine.py", line 237, in _infer_full_async
output = await self.engine.async_generate(**engine_inputs, sampling_params=generation_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Engine.async_generate() got an unexpected keyword argument 'images'
0%| | 0/2 [00:00<?, ?it/s]
None
FULL PARSE DONE
/data/conda_envs/sft_sglang/lib/python3.11/multiprocessing/resource_tracker.py:123: UserWarning: resource_tracker: process died unexpectedly, relaunching. Some resources might leak.
warnings.warn('resource_tracker: process died unexpectedly, '
Traceback (most recent call last):
File "/data/conda_envs/sft_sglang/lib/python3.11/multiprocessing/resource_tracker.py", line 239, in main
cache[rtype].remove(name)
KeyError: '/mp-340rzebv'
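The TypeError above says that `Engine.async_generate()` was called with a keyword (`images`) that it does not declare. One way to narrow this down is to compare the keywords being passed against the method's actual signature. The following is a minimal sketch, not part of ms-swift or sglang: `unsupported_kwargs` is a hypothetical helper, and `fake_async_generate` is a stand-in whose parameter names are illustrative only and are not claimed to match the real sglang API.

```python
import inspect


def unsupported_kwargs(func, kwargs):
    """Return the keyword names in `kwargs` that `func` does not declare."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means any keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    return sorted(k for k in kwargs if k not in accepted)


# Hypothetical stand-in for the real method (assumption: illustrative names).
async def fake_async_generate(prompt=None, input_ids=None, sampling_params=None):
    return None


# Mimics the failing call: 'images' is not a declared parameter here.
print(unsupported_kwargs(
    fake_async_generate,
    {"input_ids": [1, 2], "images": [], "sampling_params": {}},
))
```

Against the real objects, the same helper could presumably be pointed at `engine.engine.async_generate` (the attribute path shown in the traceback) together with the `engine_inputs` dict, to see which keys the installed sglang version rejects.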
Your hardware and system info
Write your system info here, such as CUDA version, operating system, GPU model, and torch version.
Running on a 3090 GPU with:
sglang 0.5.6.post2, triton 2.1.0, nvidia-cudnn-cu12 9.16.0.29, torch 2.9.1, ms_swift 3.11.1
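Because this kind of keyword mismatch usually depends on which versions of the two libraries are installed together, it can help to read the versions from the active environment rather than typing them by hand. This is a minimal sketch using only the standard library. `collect_versions` is a hypothetical helper, and the distribution names in the list (in particular `ms-swift`) are assumptions about how the packages are registered in the environment.

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust to match `pip list` in your environment.
    names = ["sglang", "ms-swift", "torch", "triton", "nvidia-cudnn-cu12"]
    for name, ver in collect_versions(names).items():
        print(f"{name}=={ver}")
```

Run inside the same conda environment that raised the error, so the reported versions are the ones actually imported.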
Additional context
Add any other context about the problem here.