Running • 2.15k • The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published about 1 month ago • 122
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 93
Running on CPU Upgrade • 12.7k • Open LLM Leaderboard: Track, rank and evaluate open LLMs and chatbots