- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
  Paper • 2408.07199 • Published • 20
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72
- Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
  Paper • 2406.12050 • Published • 18
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 9
Collections
Collections including paper arxiv:2404.03683
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72
- V-STaR: Training Verifiers for Self-Taught Reasoners
  Paper • 2402.06457 • Published • 8
- Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
  Paper • 2406.12050 • Published • 18
- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
  Paper • 2408.07199 • Published • 20
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 32
- Advancing LLM Reasoning Generalists with Preference Trees
  Paper • 2404.02078 • Published • 44
- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 35
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 60
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 22
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 24
- stanford-crfm/BioMedLM
  Text Generation • Updated • 2.9k • 395
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
- One-step Diffusion with Distribution Matching Distillation
  Paper • 2311.18828 • Published • 3
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 78
- Condition-Aware Neural Network for Controlled Image Generation
  Paper • 2404.01143 • Published • 11
- Locating and Editing Factual Associations in GPT
  Paper • 2202.05262 • Published • 1