Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 86
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 87
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2 • 16
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 67
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12 • 62
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published 30 days ago • 40
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 75
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning Paper • 2406.12050 • Published Jun 17 • 19
Top LLM Collection A collection of top open-source LLMs, sorted best-first • 6 items • Updated Jul 26 • 13
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5 • 63
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31 • 59
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3 • 47
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12 • 65
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3 • 52