VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published 5 days ago • 9
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 9 days ago • 69
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 10 days ago • 143
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages Paper • 2503.23542 • Published 19 days ago • 10
Large Language Model Agent: A Survey on Methodology, Applications and Challenges Paper • 2503.21460 • Published 23 days ago • 75
Inside-Out: Hidden Factual Knowledge in LLMs Paper • 2503.15299 • Published about 1 month ago • 53
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content Paper • 2503.16031 • Published 30 days ago • 3
InfiniteYou-FLUX 📸 Flexible Photo Recrafting While Preserving Your Identity • Running on Zero • 885
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait Paper • 2503.12963 • Published Mar 17 • 7
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published Mar 18 • 44
API Agents vs. GUI Agents: Divergence and Convergence Paper • 2503.11069 • Published Mar 14 • 35
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 173