| 48861 | acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-280-step | HuggingFace | 1.30 | unrated |
| 48862 | acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-535-step | HuggingFace | 1.30 | unrated |
| 48863 | acecoder-fsdp_agent-xiaomimimo_mimo-7b-base-grpo-n16-b128-t1.0-lr1e-6-69k-2turn-sys4-110-step | HuggingFace | 1.30 | unrated |
| 48864 | acecoder-fsdp_agent-xiaomimimo_mimo-7b-base-grpo-n16-b128-t1.0-lr1e-6-69k-2turn-sys4-120-step | HuggingFace | 1.30 | unrated |
| 48865 | torl-fsdp_agent-qwen_qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6new-no-toolusepenalty-430-step | HuggingFace | 1.30 | unrated |
| 48866 | torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6new-no-toolusepenalty-360-step | HuggingFace | 1.30 | unrated |
| 48867 | torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6torl_same_train-310-step | HuggingFace | 1.30 | unrated |
| 48868 | torl-fsdp_agent-qwen_qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6v2-reproduce-430-step | HuggingFace | 1.30 | unrated |
| 48869 | oNo-1-Qwen3-235B-A22B-Thinking-MedMCQA-swift-gspo-sparse-rewards | HuggingFace | 1.30 | insufficient signal |
| 48870 | oNo-1-Qwen3-235B-A22B-Thinking-merged-difficult-MedMCQA-10-swift-gspo-dense-rewards | HuggingFace | 1.30 | insufficient signal |
| 48871 | oNo-1-Qwen3-235B-A22B-Thinking-merged-difficult-MedMCQA-10-swift-gspo-sparse-rewards | HuggingFace | 1.30 | insufficient signal |
| 48872 | qwen3-235b-a22b-thinking-merged-20250818-medmcqa-n10-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48873 | qwen3-235b-a22b-thinking-merged-20250818-medmcqa-n10-kv0-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48874 | qwen3-235b-a22b-thinking-merged-kujira-v2.1-messages-dft-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48875 | qwen3-235b-a22b-thinking-merged-kujira-v2.1-messages-kv-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48876 | qwen3-235b-a22b-thinking-merged-kujira-v2.1-messages-sft-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48877 | qwen3-235b-a22b-thinking-merged-medmcqa-100samples-grpo-n10-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48878 | qwen3-235b-a22b-thinking-merged-medmcqa-100samples-gspo-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48879 | qwen3-235b-a22b-thinking-merged-megascience-textbookreasoning-500-bft-bf16 | HuggingFace | 1.30 | insufficient signal |
| 48880 | qwen3_235b_a22b_thinking_textbookreasoning_ugphysics_aops_mini | HuggingFace | 1.30 | insufficient signal |