Running 2.21k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning Text Generation • Updated 20 days ago • 2.15k • 51
Breeze 2 Family Collection Llama-Breeze2 is a multi-modal language model family specifically intended for Traditional Chinese use. BreezyVoice is a Taiwan Mandarin TTS • 6 items • Updated 14 days ago • 18
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback Paper • 2501.10799 • Published Jan 18 • 15
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference Jan 16 • 71