Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 13 days ago • 75
Fineweb-Edu-Ar Collection Largest (as of 2024) machine translated Arabic educational corpus • 2 items • Updated 10 days ago • 1
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs Paper • 2412.08347 • Published 15 days ago • 4
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards Paper • 2402.01781 • Published Feb 1 • 1
Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models Paper • 2411.06402 • Published Nov 10 • 2
SmolTulu Collection A collection of models that use SmolLM2 as the pretrained base in conjunction with AllenAI's Tulu 3 post training pipeline. • 6 items • Updated 9 days ago • 1
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 28 days ago • 62