Pull requests: THUDM/slime

Pull requests list

Support OPD when teacher tokenization differs
#2032 opened Jun 8, 2026 by hhnqqq
examples: add CISPO custom loss
#2026 opened Jun 6, 2026 by kekmodel Contributor
fix(rollout): apply sample filter in rollout manager
#2014 opened Jun 4, 2026 by EazyReal
fix(colocate): derive num_gpus_per_node from actor_num_gpus_per_node
#2012 opened Jun 3, 2026 by aoshen02 Contributor
perf(megatron-loss): scale logits per-chunk to avoid OOM
#2010 opened Jun 2, 2026 by Yangruipis Contributor
perf(rollout): pack loss_masks as np.int8 at the ray.put boundary
#2006 opened Jun 2, 2026 by Chasing1020 Contributor
[Draft] Refactor trajectory manager
#2005 opened Jun 2, 2026 by jingshenghang Collaborator Draft
fix(gpt-oss): emit fused raw expert tensors for SGLang
#2004 opened Jun 1, 2026 by Jiang020609 Contributor
fix(logging): partition raw rewards for correct samples
#1996 opened May 30, 2026 by Jiang020609 Contributor
[sft] rebuild the sft loss mask generator and add ci
#1994 opened May 30, 2026 by zhuzilin Contributor
Add timeout configuration for on policy distillation HTTP session.
#1970 opened May 28, 2026 by qqwqqw689 Contributor
Optimize CP sequence KL communication for GSPO/OPSM
#1948 opened May 26, 2026 by zzdeae86
Add Trackio rollout trace logging
#1935 opened May 21, 2026 by abidlabs
feat: add SFT entropy logging and validation loss monitoring
#1925 opened May 19, 2026 by none0663 Contributor
fix: make OPSM reject whole off-policy sequences
#1917 opened May 18, 2026 by haoyang9804