Agentic
GAIA: a benchmark for General AI Assistants
Paper
• 2311.12983
• Published
• 246
AppAgent: Multimodal Agents as Smartphone Users
Paper
• 2312.13771
• Published
• 54
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Paper
• 2401.01614
• Published
• 22
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Paper
• 2401.13919
• Published
• 32
LARP: Language-Agent Role Play for Open-World Games
Paper
• 2312.17653
• Published
• 33
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper
• 2402.01622
• Published
• 38
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Paper
• 2402.00559
• Published
• 3
TradingAgents: Multi-Agents LLM Financial Trading Framework
Paper
• 2412.20138
• Published
• 19
RAG-Anything: All-in-One RAG Framework
Paper
• 2510.12323
• Published
• 68
PaperBanana: Automating Academic Illustration for AI Scientists
Paper
• 2601.23265
• Published
• 205