Loubna Ben Allal

loubnabnl

https://loubnabnl.github.io/

AI & ML interests

SmolLMs, ML for code, data

Recent Activity

new activity 2 days ago

HuggingFaceTB/finemath:Why did you use CC rather than FineWeb to create FineMath?

updated a dataset 4 days ago

loubnabnl/mmlu-evals-smollm-360m

updated a dataset 4 days ago

loubnabnl/code_data

Articles

SmolLM - blazingly fast and remarkably powerful

CodeGemma - an official Google release for code LLMs

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

StarCoder2 and The Stack v2

Code Llama: Llama 2 learns to code

StarCoder: A State-of-the-Art LLM for Code

How to train a Language Model with Megatron-LM

loubnabnl's activity

upvoted a collection 29 days ago

SmolVLM

State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 3 days ago • 30

upvoted an article 4 months ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 86

upvoted a collection 4 months ago

💻 Local SmolLMs

SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated 3 days ago • 46

upvoted an article 5 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 292

upvoted a paper 6 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 87

upvoted a paper 7 months ago

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28 • 12

upvoted 2 collections 9 months ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 83 items • Updated 8 days ago • 91

ZeroGPU Spaces

ZeroGPU Spaces made by the community • 17 items • Updated Jun 6 • 230

upvoted a paper 10 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 136

upvoted a collection 10 months ago

💫 StarCoder2

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 82

upvoted a paper about 1 year ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 123