VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published 5 days ago • 9
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 9 days ago • 69
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 10 days ago • 143
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages Paper • 2503.23542 • Published 19 days ago • 10
Large Language Model Agent: A Survey on Methodology, Applications and Challenges Paper • 2503.21460 • Published 23 days ago • 75
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content Paper • 2503.16031 • Published 30 days ago • 3
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Paper • 2503.12963 • Published Mar 17 • 7
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published Mar 18 • 44
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 173
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 97
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 190
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 103
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Paper • 2502.09082 • Published Feb 13 • 28