Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 4 days ago • 189
view article Article Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model Jan 1 • 19
view article Article Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model Jan 1 • 19