WebGames: Challenging General-Purpose Web-Browsing AI Agents Paper • 2502.18356 • Published 12 days ago • 11
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 12 days ago • 67
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis Paper • 2502.20383 • Published 10 days ago • 2
AI-Invented Tonal Languages: Preventing a Machine Lingua Franca Beyond Human Understanding Paper • 2503.01063 • Published 7 days ago • 5
When an LLM is apprehensive about its answers -- and when its uncertainty is justified Paper • 2503.01688 • Published 6 days ago • 19
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Paper • 2502.18890 • Published 11 days ago • 23
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published 6 days ago • 30
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 6 days ago • 26
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 6 days ago • 65
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published 6 days ago • 20
SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking Paper • 2503.00955 • Published 7 days ago • 25
Reliable and Efficient Multi-Agent Coordination via Graph Neural Network Variational Autoencoders Paper • 2503.02954 • Published 5 days ago • 3
Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions Paper • 2503.03278 • Published 4 days ago • 12
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs Paper • 2503.02003 • Published 6 days ago • 37