Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

updated a Space about 16 hours ago

science/README

Articles

Diving into MiniMax01 405B MoE

SmolVLM - small yet mighty Vision Language Model

SmolLM - blazingly fast and remarkably powerful

Posts 1

Wow, impressive 340B model by NVIDIA with a nice permissive license! 🚀 The technical report is full of insights and seems to use a learning rate schedule other than cosine, probably a variant of WSD (warmup-stable-decay). Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911

Collections 2

Papers 1

arxiv:2405.18392

Models 12

eliebak/SmolLM-360M-Instruct-Q8_0-GGUF

Updated Aug 13, 2024 • 10

eliebak/the-tokenizer-v1.5

Updated Jul 4, 2024

eliebak/the-tokenizer-v2

Updated Jun 17, 2024

eliebak/wsd_124M_300B_fw

Text Generation • Updated Jun 11, 2024 • 119

eliebak/wsd_124M_300B_edu

Text Generation • Updated Jun 11, 2024 • 118

eliebak/wsd_124M_150B_edu

Text Generation • Updated Jun 11, 2024 • 119

eliebak/wsd_124M_150B_fw

Text Generation • Updated Jun 11, 2024 • 119

eliebak/cos_124M_150B_fw

Text Generation • Updated Jun 9, 2024 • 189

eliebak/cos_124M_150B_edu

Text Generation • Updated Jun 9, 2024 • 141

eliebak/debug-cos-100B

Text Generation • Updated Jun 8, 2024 • 133

Datasets 3

eliebak/very-smollm-corpus

Viewer • Updated Sep 9, 2024 • 4.58M • 82 • 2

eliebak/Buzz_wo_chatml_format

Viewer • Updated Jun 25, 2024 • 31.2M • 154 • 1

eliebak/Buzz_chatml_format

Viewer • Updated Jun 15, 2024 • 31.2M • 179