
Collections


Collections including paper arxiv:2501.18585

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 8 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 83
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 146
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

about 5 hours ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published 4 days ago • 30
s1: Simple test-time scaling

Paper • 2501.19393 • Published 4 days ago • 72
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published 4 days ago • 17
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

Paper • 2501.18965 • Published 5 days ago • 5

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 5 days ago • 46

Reasoning Models

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 5 days ago • 46

Large Language Models Think Too Fast To Explore Effectively

Paper • 2501.18009 • Published 6 days ago • 22
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 5 days ago • 46

RL+reason model

about 2 hours ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 12 days ago • 21
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 9 days ago • 24
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 7 days ago • 98
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

Test-Time Compute/Optimal Scaling

Scaling LLM Inference with Optimized Sample Compute Allocation

Paper • 2410.22480 • Published Oct 29, 2024
Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published about 1 month ago • 41
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

Paper • 2412.14135 • Published Dec 18, 2024
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 27 days ago • 90

2025 LLM Papers on Hugging Face with Japanese Memos

about 13 hours ago

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published 30 days ago • 40
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 99
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published 14 days ago • 81
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published 19 days ago • 24

Video Creation by Demonstration

Paper • 2412.09551 • Published Dec 12, 2024 • 9
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published Dec 9, 2024 • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published Dec 6, 2024 • 38

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30, 2024 • 18
LLMs Do Not Think Step-by-step In Implicit Reasoning

Paper • 2411.15862 • Published Nov 24, 2024 • 8
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 77
Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published Dec 23, 2024 • 30
