SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models Paper • 2602.04208 • Published 27 days ago • 19
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 90 • 6