Hugging Face

Enterprise
company
Verified
Activity Feed

AI & ML interests

The AI community building the future.

Recent Activity

nielsr  updated a dataset about 10 hours ago
huggingface/community-science-merged
lysandre  updated a dataset about 10 hours ago
huggingface/transformers-metadata
m-ric  updated a dataset 3 days ago
huggingface/documentation-images
View all activity

Articles

huggingface's activity

clem 
posted an update about 5 hours ago
view post
Post
504
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!
sayakpaul 
posted an update about 5 hours ago
view post
Post
375
We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the models supported, the knobs of optims our users can fire, fine-tuning, and more 🔥

5-6GBs for HunyuanVideo, sky is the limit 🌌 🤗
https://huggingface.co./blog/video_gen
AdinaY 
posted an update about 11 hours ago
view post
Post
786
🔥So many exciting releases coming from the Chinese community this month!
zh-ai-community/2025-january-6786b054f492fb223591269e

LLMs:
✨ Qwen2.5 -1M by Alibaba
Qwen/qwen25-1m-679325716327ec07860530ba
✨ InternLM3-8B-Instruct by Shanghai AI Lab
internlm/internlm3-8b-instruct
✨ MiniMax-Text-01 by MiniMax AI
MiniMaxAI/MiniMax-Text-01
✨ RWKV-7 by BlinkDL -- RNN + Transformer 👀
BlinkDL/rwkv-7-world
✨ DeepSeek-R1 by DeepSeek -- THE ONE 🙌
https://huggingface.co./deepseek-ai
✨ Baichuan-M1-14B by Baichuan - Medical 🩺
baichuan-inc/Baichuan-M1-14B-Base
✨ Qwen2.5-Math-PRM by Alibaba - Math 🔢
Qwen/Qwen2.5-Math-PRM-7B

Code:
✨ Tare by Bytedance
https://trae.ai

TTS:
✨ T2A-01-HD by MiniMax AI
https://hailuo.ai/audio
✨ LLaSA by HKUST Audio
HKUSTAudio/Llasa-3B

MLLM:
✨ Kimi k1.5 by Moonshot AI
https://kimi.ai
✨ MiniCPM-o-2_6 by OpenBMB
openbmb/MiniCPM-o-2_6
✨ Sa2VA-4B by ByteDance
ByteDance/Sa2VA-4B
✨ VideoLLaMA 3 by Alibaba DAMO
DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15
✨ LLaVA-Mini by Chinese Academy of Sciences
ICTNLP/llava-mini-llama-3.1-8b
✨Hunyuan-7B by Tencent
tencent/Hunyuan-7B-Instruct
✨ Hunyuan 3D 2.0 by Tencent
tencent/Hunyuan3D-2
✨MiniMax-VL-01 by MiniMax AI - A non transformer based VLM 👀
MiniMaxAI/MiniMax-VL-01

Agent:
✨ UI-TARS by Bytedance
bytedance-research/UI-TARS-7B-SFT
✨ GLM-PC by Zhipu AI
https://cogagent.aminer.cn

Dataset:
✨ Fineweb-Edu-Chinese by Opencsg
opencsg/Fineweb-Edu-Chinese-V2.1
✨ Multimodal_textbook by Alibaba
DAMO-NLP-SG/multimodal_textbook
✨ MME-Finance by Hithink AI
  • 2 replies
·
lewtun 
posted an update 2 days ago
view post
Post
4927
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
  • 1 reply
·
merve 
posted an update 3 days ago
view post
Post
3569
Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with it's retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, where decoder is based on MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE) a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B,3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- tencent released Hunyuan3D-2, new 3D asset generation from images
·
clem 
posted an update 3 days ago
dylanebert 
posted an update 3 days ago
view post
Post
415
⚙️ Convert .ply to .splat

i've created a simple space to convert .ply gaussian splat files to .splat format

dylanebert/ply-to-splat
merve 
posted an update 3 days ago
view post
Post
1796
smolagents can see 🔥
we just shipped vision support to smolagents 🤗 agentic computers FTW

you can now:
💻 let the agent get images dynamically (e.g. agentic web browser)
📑 pass images at the init of the agent (e.g. chatting with documents, filling forms automatically etc)
with few LoC change! 🤯
you can use transformers models locally (like Qwen2VL) OR plug-in your favorite multimodal inference provider (gpt-4o, antrophic & co) 🤠

read our blog http://hf.co/blog/smolagents-can-see
m-ric 
posted an update 3 days ago
view post
Post
2154
Today we make the biggest release in smolagents so far: 𝘄𝗲 𝗲𝗻𝗮𝗯𝗹𝗲 𝘃𝗶𝘀𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹𝘀, 𝘄𝗵𝗶𝗰𝗵 𝗮𝗹𝗹𝗼𝘄𝘀 𝘁𝗼 𝗯𝘂𝗶𝗹𝗱 𝗽𝗼𝘄𝗲𝗿𝗳𝘂𝗹 𝘄𝗲𝗯 𝗯𝗿𝗼𝘄𝘀𝗶𝗻𝗴 𝗮𝗴𝗲𝗻𝘁𝘀! 🥳

Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne !

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator who beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co./blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py
·
AdinaY 
posted an update 3 days ago
AdinaY 
posted an update 4 days ago
tomaarsen 
posted an update 4 days ago
view post
Post
1363
I just released Sentence Transformers v3.4.0, featuring a memory leak fix, compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.

🪆 Matryoshka & Cached loss compatibility
It is now possible to combine the powerful Cached... losses (which use in-batch negatives & a caching mechanism to allow for endless batch size & negatives) with the Matryoshka loss modifier which modifies a base loss such that it is trained not only on the maximum dimensionality (e.g. 1024 dimensions), but also on many lower dimensions (e.g. 768, 512, 256, 128, 64, 32).
After training, these models' embeddings can be truncated for faster retrieval, etc.

🎞️ Resolve memory leak when Model and Trainer are reinitialized
Due to a circular dependency between Trainer -> Model -> ModelCardData -> Trainer, deleting both the trainer & model still didn't free up the memory.
This led to a memory leak in scripts where you repeatedly do so.

➕ New Features
Many new small features, e.g. multi-GPU support for 'mine_hard_negatives', a 'margin' parameter to TripletEvaluator, and Matthews Correlation Coefficient in the BinaryClassificationEvaluator.

🐛 Bug Fixes
Also a bunch of fixes, for example that subsequent batches were not sorted when using the "no_duplicates" batch sampler. See the release notes for more details.

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.4.0

Big thanks to all community members who assisted in this release. 10 folks with their first contribution this time around!