Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 4 days ago • 189
Article Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model Jan 1 • 19
Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate Paper • 2410.07167 • Published Oct 9, 2024 • 39