Clem πŸ€—'s picture

Clem πŸ€— PRO

clem

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

reacted to merve's post with πŸ‘€ 2 days ago
Oof, what a week! πŸ₯΅ So many things have happened, let's recap! https://huggingface.co./collections/merve/jan-24-releases-6793d610774073328eac67a9 Multimodal πŸ’¬ - We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with it's retrieval models ColSmol for multimodal RAG πŸ’— - UI-TARS are new models by ByteDance to unlock agentic GUI control 🀯 in 2B, 7B and 72B - Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B - MiniMaxAI released Minimax-VL-01, where decoder is based on MiniMax-Text-01 456B MoE model with long context - Dataset: Yale released a new benchmark called MMVU - Dataset: CAIS released Humanity's Last Exam (HLE) a new challenging MM benchmark LLMs πŸ“– - DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🀯 - Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B - NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!) Audio πŸ—£οΈ - Llasa is a new speech synthesis model based on Llama that comes in 1B,3B, and 8B - TangoFlux is a new audio generation model trained from scratch and aligned with CRPO Image/Video/3D Generation ⏯️ - Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux - tencent released Hunyuan3D-2, new 3D asset generation from images
reacted to merve's post with πŸ€— 2 days ago
Oof, what a week! πŸ₯΅ So many things have happened, let's recap! https://huggingface.co./collections/merve/jan-24-releases-6793d610774073328eac67a9 Multimodal πŸ’¬ - We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with it's retrieval models ColSmol for multimodal RAG πŸ’— - UI-TARS are new models by ByteDance to unlock agentic GUI control 🀯 in 2B, 7B and 72B - Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B - MiniMaxAI released Minimax-VL-01, where decoder is based on MiniMax-Text-01 456B MoE model with long context - Dataset: Yale released a new benchmark called MMVU - Dataset: CAIS released Humanity's Last Exam (HLE) a new challenging MM benchmark LLMs πŸ“– - DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🀯 - Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B - NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!) Audio πŸ—£οΈ - Llasa is a new speech synthesis model based on Llama that comes in 1B,3B, and 8B - TangoFlux is a new audio generation model trained from scratch and aligned with CRPO Image/Video/3D Generation ⏯️ - Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux - tencent released Hunyuan3D-2, new 3D asset generation from images
reacted to merve's post with πŸ”₯ 2 days ago
Oof, what a week! πŸ₯΅ So many things have happened, let's recap! https://huggingface.co./collections/merve/jan-24-releases-6793d610774073328eac67a9 Multimodal πŸ’¬ - We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with it's retrieval models ColSmol for multimodal RAG πŸ’— - UI-TARS are new models by ByteDance to unlock agentic GUI control 🀯 in 2B, 7B and 72B - Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B - MiniMaxAI released Minimax-VL-01, where decoder is based on MiniMax-Text-01 456B MoE model with long context - Dataset: Yale released a new benchmark called MMVU - Dataset: CAIS released Humanity's Last Exam (HLE) a new challenging MM benchmark LLMs πŸ“– - DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🀯 - Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B - NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!) Audio πŸ—£οΈ - Llasa is a new speech synthesis model based on Llama that comes in 1B,3B, and 8B - TangoFlux is a new audio generation model trained from scratch and aligned with CRPO Image/Video/3D Generation ⏯️ - Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux - tencent released Hunyuan3D-2, new 3D asset generation from images
View all activity

Organizations

Hugging Face's profile picture Objective Function's profile picture Pied Piper's profile picture Society & Ethics's profile picture Organization's profile picture Text Generation Inference's profile picture testifly's profile picture HugGAN Community's profile picture Hugging Face Fellows's profile picture Gradio-Blocks-Party's profile picture HuggingFaceM4's profile picture Open-Source AI Meetup's profile picture Hugging Face OSS Metrics's profile picture Hugging Face Smol Cluster's profile picture huggingPartyParis's profile picture Unofficial Mistral Community's profile picture Journalists on Hugging Face's profile picture Major TOM's profile picture MLX Community's profile picture Miami AI Hub's profile picture Social Post Explorers's profile picture Paris AI Running Club's profile picture Hugging Face for Legal's profile picture Hugging Face Party @ PyTorch Conference's profile picture Nerdy Face's profile picture open/ acc's profile picture Bluesky Community's profile picture

Posts 37