Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus Paper • 2410.14815 • Published Oct 18, 2024 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 26 days ago • 121
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 25 days ago • 123
Falcon3 Collection The Falcon3 family of Open Foundation Models is a set of pretrained and instruction-tuned LLMs ranging from 1B to 10B parameters. • 40 items • Updated 5 days ago • 78
MiniPLM Collection Pre-trained models in MiniPLM: Knowledge Distillation for Pre-Training Language Models • 5 items • Updated Oct 21, 2024 • 2
MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published Oct 22, 2024 • 14
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 51
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 22 days ago • 198
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained, or even fine-tune them yourself, check out the Parler-TTS repository on GitHub. • 8 items • Updated Dec 2, 2024 • 49
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 354
Augmentable Collection A collection of datasets that should be augmented further with GPT-4 • 13 items • Updated Jan 2, 2024 • 4
Transformers compatible Mamba Collection This release includes the `mamba` repositories that are compatible with the `transformers` library • 5 items • Updated Mar 6, 2024 • 37