Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

updated a Space about 16 hours ago
science/README

Organizations

Hugging Face, HuggingFaceBR4, Hugging Face H4, Blog-explorers, Hugging Face TB Research, huggingPartyParis, Nanotron Research, MLX Community, Hugging Face SMOL, HuggingFaceFW, HuggingFaceFW-Dev, LLHF, llmc, SLLHF, Argilla Warehouse, nltpt, smol-explorers, Open Science, Hugging Face Science, open/ acc

Posts 1

Wow, impressive 340B model by nvidia with a nice permissive license! 🚀 The technical report is full of insights and seems to use a different learning rate schedule than cosine, probably a variant of WSD. Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911