DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15 • 52
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 20 days ago • 636
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23 • 36
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 119
DiJiang: Efficient Large Language Models through Compact Kernelization Paper • 2403.19928 • Published Mar 29 • 10
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21 • 51
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Paper • 2403.14520 • Published Mar 21 • 33
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20 • 78
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 63
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 75
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 125