lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step50 • Text Generation • 196k • Updated 20 days ago • 12 • 1
lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step100 • Text Generation • 196k • Updated 19 days ago • 154
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-step50 • Text Generation • 196k • Updated 19 days ago • 205
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-resume-step100 • Text Generation • 196k • Updated 18 days ago • 192
lihaoxin2020/qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50 • Text Generation • 196k • Updated 12 days ago • 316