David Berenstein's picture

David Berenstein

davidberenstein1957

AI & ML interests

Everything data

Recent Activity

Articles

Organizations

Hugging Face's profile picture SomosNLP's profile picture Tools's profile picture Webhooks Explorers (BETA)'s profile picture Argilla's profile picture Blog-explorers's profile picture Hugging Face TB Research's profile picture Argilla Explorers's profile picture distilabel-internal-testing's profile picture Data Is Better Together's profile picture Social Post Explorers's profile picture argilla-internal-testing's profile picture Dataset Viber's profile picture Argilla Warehouse's profile picture Dataset Tools's profile picture Uplimit's profile picture Data Is Better Together Contributor's profile picture FeeL (Feedback Loop)'s profile picture AI Blueprint's profile picture

davidberenstein1957's activity

posted an update about 9 hours ago
posted an update 6 days ago
replied to their post 8 days ago
view reply

@djuna and @xzuyn , thanks for the follow-up, and I agree with the approach. I have even tried the exact same approach. However, when generating using the code above, I don't get proper results with <|begin▁of▁sentence|><|User|>, but <|begin▁of▁sentence|> User: does work.

reacted to fdaudens's post with 🔥❤️ 8 days ago
view post
Post
7940
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M—nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski 's DeepSeek-R1-Distill-Qwen-32B-GGUF version — 1M downloads alone.
·
replied to their post 8 days ago
reacted to AdinaY's post with 🧠🔥 9 days ago
view post
Post
2577
🔥So many exciting releases coming from the Chinese community this month!
zh-ai-community/2025-january-6786b054f492fb223591269e

LLMs:
✨ Qwen2.5 -1M by Alibaba
Qwen/qwen25-1m-679325716327ec07860530ba
✨ InternLM3-8B-Instruct by Shanghai AI Lab
internlm/internlm3-8b-instruct
✨ MiniMax-Text-01 by MiniMax AI
MiniMaxAI/MiniMax-Text-01
✨ RWKV-7 by BlinkDL -- RNN + Transformer 👀
BlinkDL/rwkv-7-world
✨ DeepSeek-R1 by DeepSeek -- THE ONE 🙌
https://huggingface.co./deepseek-ai
✨ Baichuan-M1-14B by Baichuan - Medical 🩺
baichuan-inc/Baichuan-M1-14B-Base
✨ Qwen2.5-Math-PRM by Alibaba - Math 🔢
Qwen/Qwen2.5-Math-PRM-7B

Code:
✨ Tare by Bytedance
https://trae.ai

TTS:
✨ T2A-01-HD by MiniMax AI
https://hailuo.ai/audio
✨ LLaSA by HKUST Audio
HKUSTAudio/Llasa-3B

MLLM:
✨ Kimi k1.5 by Moonshot AI
https://kimi.ai
✨ MiniCPM-o-2_6 by OpenBMB
openbmb/MiniCPM-o-2_6
✨ Sa2VA-4B by ByteDance
ByteDance/Sa2VA-4B
✨ VideoLLaMA 3 by Alibaba DAMO
DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15
✨ LLaVA-Mini by Chinese Academy of Sciences
ICTNLP/llava-mini-llama-3.1-8b
✨Hunyuan-7B by Tencent
tencent/Hunyuan-7B-Instruct
✨ Hunyuan 3D 2.0 by Tencent
tencent/Hunyuan3D-2
✨MiniMax-VL-01 by MiniMax AI - A non transformer based VLM 👀
MiniMaxAI/MiniMax-VL-01

Agent:
✨ UI-TARS by Bytedance
bytedance-research/UI-TARS-7B-SFT
✨ GLM-PC by Zhipu AI
https://cogagent.aminer.cn

Dataset:
✨ Fineweb-Edu-Chinese by Opencsg
opencsg/Fineweb-Edu-Chinese-V2.1
✨ Multimodal_textbook by Alibaba
DAMO-NLP-SG/multimodal_textbook
✨ MME-Finance by Hithink AI
·
posted an update 9 days ago
posted an update 14 days ago
reacted to AdinaY's post with 🚀 14 days ago
view post
Post
2931
What happened yesterday in the Chinese AI community? 🚀

T2A-01-HD 👉 https://hailuo.ai/audio
MiniMax's Text-to-Audio model, now in Hailuo AI, offers 300+ voices in 17+ languages and instant emotional voice cloning.

Tare 👉 https://www.trae.ai/
A new coding tool by Bytedance for professional developers, supporting English & Chinese with free access to Claude 3.5 and GPT-4 for a limited time.

DeepSeek-R1 Series 👉 deepseek-ai/deepseek-r1-678e1e131c0169c0bc89728d
Open-source reasoning models with MIT license by DeepSeek.

Kimi K 1.5 👉 https://github.com/MoonshotAI/Kimi-k1.5 | https://kimi.ai/
An O1-level multi-modal model by MoonShot AI, utilizing reinforcement learning with long and short-chain-of-thought and supporting up to 128k tokens.

And today…

Hunyuan 3D-2.0 👉 tencent/Hunyuan3D-2
A SoTA 3D synthesis system for high-res textured assets by Tencent Hunyuan , with open weights and code!

Stay tuned for more updates 👉 https://huggingface.co./zh-ai-community
reacted to fdaudens's post with ❤️ 15 days ago
view post
Post
1812
Reminder: Don’t. Use. ChatGPT. As. A. Calculator. Seriously. 🤖

Loved listening to @sasha on Hard Fork—it really made me think.

A few takeaways that hit home:
- Individual culpability only gets you so far. The real priority: demanding accountability and transparency from companies.
- Evaluate if generative AI is the right tool for certain tasks (like search) before using it.

Curious about the full conversation? https://www.nytimes.com/2025/01/17/podcasts/hardfork-tiktok-rednote-environment.html. Give it a listen—it’s worth it! 🌍
  • 1 reply
·
reacted to fdaudens's post with 👍 15 days ago
view post
Post
1388
🔍 From instruction-following to creative storytelling, dive into 2024's most impactful AI datasets! These gems are shaping everything from scientific research to video understanding.

Check it out: huggingface/open-source-ai-year-in-review-2024
reacted to meg's post with 🔥 15 days ago
posted an update 18 days ago
reacted to burtenshaw's post with 🚀 19 days ago
reacted to ariG23498's post with 🚀 19 days ago
reacted to Tonic's post with 🔥 19 days ago
view post
Post
1827
🙋🏻‍♂️ Hey there folks ,

Facebook AI just released JASCO models that make music stems .

you can try it out here : Tonic/audiocraft

hope you like it
reacted to mlabonne's post with 🤗 19 days ago
view post
Post
3903
🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co./blog/mlabonne/llm-course