Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 614
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 195
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 177
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published May 23 • 23
Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning Paper • 2303.15647 • Published Mar 28, 2023 • 4
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • Apr 28 • 37
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12 • 30
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 26
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Sep 18 • 206
SLIM Models Collection Structured Language Instruction Models (SLIMs) • 31 items • Updated 15 days ago • 30
zephyr-7b-sft-full-SPIN Collection Models fine-tuned with SPIN across iterations 0,1,2,3 • 4 items • Updated Feb 7 • 8