Pull requests: InternLM/lmdeploy
Pull requests list
refactor(proxy): split monolithic proxy into modular serve/proxy package
improvement
#4647
opened Jun 4, 2026 by
lvhan028
Collaborator
support disaggregated weight update
planned feature
#4638
opened May 29, 2026 by
irexyc
Collaborator
modify save model in lite module
improvement
#4624
opened May 26, 2026 by
43758726
Contributor
feat(turbomind): support priority schedule policy
#4614
opened May 22, 2026 by
4mengy
3 of 4 tasks
perf: optimize guided decoding with xgrammar upgrade, batched API, and async D2H overlap
#4605
opened May 21, 2026 by
windreamer
Collaborator
1 of 4 tasks
[WIP]: Support reuse routed experts on eviction
#4599
opened May 19, 2026 by
RunningLeon
Collaborator
docs(advance): add Add a New Speculative Decoding Method guide
documentation
#4589
opened May 17, 2026 by
SuperMarioYL
4 tasks done
[security] fix(proxy): require auth for node management
#4579
opened May 11, 2026 by
Hinotoi-agent
5 of 9 tasks
feat: configure cudagraph capture batch sizes
improvement
#4573
opened May 8, 2026 by
CUHKSZzxy
Collaborator
Fix health latency under concurrent VL request preparation
Bug:P0
#4570
opened May 7, 2026 by
CUHKSZzxy
Collaborator
[Feature] Add guided decoding support for speculative decoding
enhancement
#4559
opened Apr 28, 2026 by
windreamer
Collaborator
4 tasks done