rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 5 days ago • 197
Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published 8 days ago • 23
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 9 days ago • 72
Article 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • 11 days ago • 37
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Paper • 2412.19326 • Published 18 days ago • 18
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 20 days ago • 35
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published 25 days ago • 15
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design Paper • 2412.14590 • Published 25 days ago • 13
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents Paper • 2412.13194 • Published 27 days ago • 12
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 54
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 29 days ago • 27
Solving math word problems with process- and outcome-based feedback Paper • 2211.14275 • Published Nov 25, 2022 • 8
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published about 1 month ago • 137
JuStRank: Benchmarking LLM Judges for System Ranking Paper • 2412.09569 • Published Dec 12, 2024 • 19
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 125
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59
Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages Paper • 2411.12240 • Published Nov 19, 2024 • 6