LM (MLX) Collection State-Space-Model powered Language Models for Apple Silicon • 12 items • Updated 26 days ago • 4
DiffusionKit Collection Models, datasets and evaluation results for DiffusionKit: https://github.com/argmaxinc/DiffusionKit • 6 items • Updated 13 days ago • 3
WhisperKit Collection Models, datasets and evaluation results for WhisperKit: https://github.com/argmaxinc/WhisperKit • 4 items • Updated 17 days ago • 6
Enhancing Training Efficiency Using Packing with Flash Attention Paper • 2407.09105 • Published Jul 12 • 12
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14 • 41
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 73
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8 • 7
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 35
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 77
MobileCLIP Models + DataCompDR Data Collection MobileCLIP: Mobile-friendly image-text models with SOTA zero-shot capabilities. DataCompDR: Improved datasets for training image-text SOTA models. • 22 items • Updated Jun 20 • 20
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 64
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion Paper • 2406.03184 • Published Jun 5 • 18
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Aug 2 • 673
SD 2.x, Zero-terminal SNR Collection SD 2.x models with zero terminal SNR noise schedule. • 3 items • Updated Nov 3, 2023 • 3
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22 • 24
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published May 17 • 32
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 133
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • Jun 23 • 32
Depth Anything Release Collection Depth Anything models, foundation models for monocular depth estimation, trained on 1.5 million labeled images and 62 million unlabeled images • 8 items • Updated Jan 26 • 8
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4 • 68
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325
Specialized Language Models with Cheap Inference from Limited Domain Data Paper • 2402.01093 • Published Feb 2 • 45
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16 • 35
CTRL: A Conditional Transformer Language Model for Controllable Generation Paper • 1909.05858 • Published Sep 11, 2019 • 4
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models Paper • 2401.05252 • Published Jan 10 • 45
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 103