louisbrulenaudet (Louis Brulé Naudet)

reacted to clem's post with 🔥 17 days ago

Post

2815

1 reply

·

reacted to davanstrien's post with 👍 19 days ago

Post

1358

Made some significant updates to my 🤗 semantic datasets search app. If you love falling into a wiki black hole, you might like this...

https://huggingface.co./spaces/librarian-bots/huggingface-datasets-semantic-search

reacted to m-ric's post with 👀 19 days ago

Post

2884

𝗚𝗿𝗲𝗮𝘁 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗮𝗹𝗲𝗿𝘁: you can now share agents to the Hub! 🥳🥳

And any agent pushed to the Hub gets a cool Space interface to directly chat with it.

This was a real technical challenge: for instance, serializing tools to export them meant that you needed to get all the source code for a tool, verify that it was standalone (not relying on external variables), and gather all the packages required to make it run.
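To make the "standalone" check concrete, here is a minimal sketch using only Python's standard `ast` module. It is not the smolagents implementation; the helper names `external_names` and `imported_modules` are hypothetical, and the analysis is deliberately simple (it ignores the order of assignments and nested scopes).

```python
import ast
import builtins


def external_names(source: str) -> set[str]:
    """Return names a function reads but never defines, imports, or receives."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected the source of a single function definition")

    # The function's own name and its parameters count as defined.
    defined = {func.name}
    defined |= {n.arg for n in ast.walk(func.args) if isinstance(n, ast.arg)}

    loaded = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)      # name is read
            else:
                defined.add(node.id)     # name is assigned or deleted
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])

    return loaded - defined - set(dir(builtins))


def imported_modules(source: str) -> set[str]:
    """Return the top-level modules imported anywhere in the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules |= {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


standalone = "def convert(x: float) -> float:\n    import math\n    return math.floor(x * 2)\n"
leaky = "def convert(x):\n    return x * RATE\n"
```

A tool whose source yields an empty set from `external_names` carries everything it needs, and `imported_modules` gives a first guess at the packages to install alongside it.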

Go try it out! 👉 https://github.com/huggingface/smolagents

2 replies

·

reacted to merve's post with 👍 19 days ago

Post

4690

Your weekly recap of open AI is here, and it's packed with models! merve/feb-14-releases-67af876b404cc27c6d837767

👀 Multimodal
> OpenGVLab released InternVideo 2.5 Chat models, new video LMs with long context
> AIDC released Ovis2 model family along with Ovis dataset, new vision LMs in different sizes (1B, 2B, 4B, 8B, 16B, 34B), with video and OCR support
> ColQwenStella-2b is a multilingual visual retrieval model that is SOTA for its size
> Hoags-2B-Exp is a new multilingual vision LM with contextual reasoning, long context video understanding

💬 LLMs
A lot of math models!
> Open-R1 team released OpenR1-Math-220k large scale math reasoning dataset, along with Qwen2.5-220K-Math fine-tuned on the dataset, OpenR1-Qwen-7B
> Nomic AI released a new Nomic Embed multilingual retrieval model, a MoE with 500M params and 305M active params, outperforming other models
> DeepScaleR-1.5B-Preview is a new DeepSeek-R1-Distill fine-tune using distributed RL on math
> LIMO is a new fine-tune of Qwen2.5-32B-Instruct on Math

🗣️ Audio
> Zonos-v0.1 is a new family of text-to-speech models, which contains the model itself and embeddings

🖼️ Vision and Image Generation
> We have ported DepthPro of Apple to transformers for your convenience!
> illustrious-xl-v1.0 is a new illustration generation model

3 replies

·

reacted to fffiloni's post with 🔥 19 days ago

Post

5200

I was thinking I need to step up my game on training Flux LoRA models. Time to have some fun! ☀️

Expect a new drop per week on aesthetics that caught my attention. Here are 3 of them that worked really well!

fffiloni/cute-comic-800
fffiloni/carbo-800
fffiloni/oniric-750

reacted to clem's post with 🔥 19 days ago

Post

3477

We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.

Just getting started of course but early users seem to like it & always happy to be able to partner with cool startups in the ecosystem.

Have you been using any integration and how can we make it better?

https://huggingface.co./blog/inference-providers

reacted to m-ric's post with 🚀 19 days ago

Post

2999

Less is More for Reasoning (LIMO): a 32B model fine-tuned with 817 examples can beat o1-preview on math reasoning! 🤯

Do we really need o1's huge RL procedure to see reasoning emerge? It seems not.
Researchers from Shanghai Jiao Tong University just demonstrated that carefully selected examples can boost math performance in large language models using SFT, with no huge datasets or RL procedures needed.

Their procedure allows Qwen2.5-32B-Instruct to jump from 6.5% to 57% on AIME and from 59% to 95% on MATH, while using only 1% of the data in previous approaches.

⚡ The Less-is-More Reasoning Hypothesis:
‣ Minimal but precise examples that showcase optimal reasoning patterns matter more than sheer quantity
‣ Pre-training knowledge plus sufficient computational resources at inference levels up math skills

➡️ Core techniques:
‣ High-quality reasoning chains with self-verification steps
‣ 817 handpicked problems that encourage deeper reasoning
‣ Enough inference-time computation to allow extended reasoning

💪 Efficiency gains:
‣ Only 817 examples instead of 100k+
‣ 40.5% absolute improvement across 10 diverse benchmarks, outperforming models trained on 100x more data

This really challenges the notion that SFT leads to memorization rather than generalization! And opens up reasoning to GPU-poor researchers 🚀

Read the full paper here 👉 LIMO: Less is More for Reasoning (2502.03387)

reacted to fdaudens's post with ❤️ 19 days ago

Post

5802

🎯 Perplexity drops their FIRST open-weight model on Hugging Face: A decensored DeepSeek-R1 with full reasoning capabilities. Tested on 1000+ examples for unbiased responses.

Check it out: perplexity-ai/r1-1776
Blog post: https://perplexity.ai/hub/blog/open-sourcing-r1-1776

1 reply

·

posted an update 21 days ago

Post

3103

I am pleased to introduce my first project built upon Hugging Face’s smolagents framework, integrated with Alpaca for financial market analysis automation 🦙🤗

The project implements technical indicators such as the Relative Strength Index (RSI) and Bollinger Bands to provide momentum and volatility analysis. Market data is retrieved through the Alpaca API, enabling access to historical price information across various timeframes.
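As a rough illustration of the two indicators named above, here is a minimal pure-Python sketch. It is not the code from the linked project: the function names, the default periods (14 and 20), and the 2-standard-deviation band width are common conventions assumed here, and the RSI uses plain averages over the window rather than Wilder's smoothing.

```python
from statistics import mean, pstdev


def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes (0 to 100)."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Price change between each consecutive pair in the window.
    deltas = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(d for d in deltas if d > 0) / period
    avg_loss = sum(-d for d in deltas if d < 0) / period
    if avg_loss == 0:
        return 100.0  # no down moves in the window
    relative_strength = avg_gain / avg_loss
    return 100 - 100 / (1 + relative_strength)


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) bands from the last `window` closes."""
    recent = closes[-window:]
    if len(recent) < window:
        raise ValueError("need at least `window` closing prices")
    middle = mean(recent)                 # simple moving average
    spread = num_std * pstdev(recent)     # band half-width
    return middle - spread, middle, middle + spread
```

RSI captures momentum (the balance of recent gains against losses), while the band width tracks volatility, so the two are often read together.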

AI-powered insights are generated using Hugging Face’s inference API, facilitating the analysis of market trends through natural language processing with DuckDuckGo search integration for real-time sentiment analysis based on financial news 🦆

Link to the GitHub project: https://github.com/louisbrulenaudet/agentic-market-tool

reacted to ImranzamanML's post with 😎 26 days ago

Post

3187

Hugging Face just launched the AI Agents Course – a free journey from beginner to expert in AI agents!

- Learn AI Agent fundamentals, use cases and frameworks
- Use top libraries like LangChain & LlamaIndex
- Compete in challenges & earn a certificate
- Hands-on projects & real-world applications

https://huggingface.co./learn/agents-course/unit0/introduction

You can join for a live Q&A on Feb 12 at 5PM CET to learn more about the course here

https://www.youtube.com/live/PopqUt3MGyQ

reacted to m-ric's post with 🚀 about 2 months ago

Post

2547

𝗪𝗲'𝘃𝗲 𝗷𝘂𝘀𝘁 𝗿𝗲𝗹𝗲𝗮𝘀𝗲𝗱 𝘀𝗺𝗼𝗹𝗮𝗴𝗲𝗻𝘁𝘀 𝘃𝟭.𝟯.𝟬 🚀, and it comes with a major feature: you can now log agent runs using OpenTelemetry to inspect them afterwards! 📊

IMO, this interactive format makes big multi-step runs much easier to inspect than endless console logs.

The setup is very easy, in a few lines of code.

Find a tutorial here 👉 https://huggingface.co./docs/smolagents/tutorials/inspect_runs

5 replies

·

reacted to MonsterMMORPG's post with 🔥 about 2 months ago

Post

4443

It is now possible to generate 16 Megapixel (4096x4096) raw images with SANA 4K model using under 8GB VRAM, 4 Megapixel (2048x2048) images using under 6GB VRAM, and 1 Megapixel (1024x1024) images using under 4GB VRAM thanks to new optimizations

13 January 2025 Update

Installers : https://www.patreon.com/posts/from-nvidia-labs-116474081

New 4K Tutorial Video : https://youtu.be/GjENQfHF4W8

The app now uses the Diffusers pipeline, which brings huge VRAM optimizations

You need to reinstall

The models will be downloaded into your Hugging Face cache folder the first time you generate something

How to Get Installation Logs and How to Change Hugging Face Cache Folder :
https://www.patreon.com/posts/108419878

Please make a fresh install

When you enable all 4 optimizations, VRAM usage is as listed below

Make sure shared VRAM is enabled, because the initial loading of the model needs more VRAM

Enable VAE Tiling + Enable VAE Slicing + Enable Model CPU Offload + Enable Sequential CPU Offload

1K (1024x1024) : 4 GB GPUs
2K (2048x2048) : 6 GB GPUs
4K (4096x4096) : 8 GB GPUs

It may still work on your GPU in any case, so test it

Just Enable VAE Tiling + Enable Model CPU Offload works great in many cases

All below attached images are generated via SANA 4K model, they are RAW and their resolution is 5376x3072

Official repo page : https://github.com/NVlabs/Sana

2 replies

·

reacted to anakin87's post with ❤️ 3 months ago

Post

1661

Tulu 3 SFT Mixture by AllenAI is a massive, good, multilingual dataset for fine-tuning Language Models.

Unfortunately, it was missing the "language" column.

I added it using the good old fastText.

Check out the dataset here 👉 anakin87/tulu-3-sft-mixture-with-language

1 reply

·

reacted to Jaward's post with 🧠 3 months ago

Post

2442

Implements the compute-efficient DeepPCR algorithm, which parallelizes sequential operations and thus speeds up inference and training of neural networks. DeepPCR can significantly reduce the time complexity of operations such as denoising in latent diffusion space from O(L) to O(log₂ L).
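To give some intuition for how L sequential steps can collapse to about log₂ L rounds, here is a small sketch. It is not DeepPCR itself, which applies Parallel Cyclic Reduction to a system of equations; instead it uses a recursive-doubling scan on a linear recurrence x_t = a_t · x_{t-1} + b_t, a simpler case of the same idea. All names are illustrative assumptions.

```python
def sequential(a, b, x0):
    """Reference: L dependent steps, one after another."""
    xs, x = [], x0
    for a_t, b_t in zip(a, b):
        x = a_t * x + b_t
        xs.append(x)
    return xs


def doubling_scan(a, b, x0):
    """Same result in ceil(log2(L)) rounds of independent updates."""
    # Each step is the affine map x -> a_t * x + b_t, stored as (a_t, b_t).
    maps = list(zip(a, b))
    n = len(maps)
    rounds, shift = 0, 1
    while shift < n:
        new_maps = list(maps)
        # Every i in this loop is independent, so on parallel hardware
        # the whole round could run at once.
        for i in range(shift, n):
            a_late, b_late = maps[i]
            a_early, b_early = maps[i - shift]
            # Compose: apply the earlier block first, then the later one.
            new_maps[i] = (a_late * a_early, a_late * b_early + b_late)
        maps = new_maps
        shift *= 2
        rounds += 1
    # maps[t] now sends x0 directly to x_t.
    return [a_t * x0 + b_t for a_t, b_t in maps], rounds


a = [2, 1, 3, 1, 2, 1, 1, 2]
b = [1, 0, 2, 1, 0, 3, 1, 1]
values, rounds = doubling_scan(a, b, x0=1)
```

With 8 steps the scan needs 3 rounds. The total work goes up, but the depth of dependent operations drops from L to about log₂ L, which is what matters when many cores are available.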

Code: https://github.com/Jaykef/ai-algorithms/blob/main/deep_pcr.ipynb

reacted to prithivMLmods's post with 🔥 3 months ago

Post

3325

HF Posts Receipts 🏆🚀

[ HF POSTS RECEIPT ] : prithivMLmods/HF-POSTS-RECEIPT

🥠The one thing that needs to be remembered is the 'username'.

🥠And yeah, thank you, @maxiw , for creating the awesome dataset and sharing them here! 🙌

🥠[ Dataset ] : maxiw/hf-posts

.
.
.
@prithivMLmods

reacted to clem's post with 🚀 3 months ago

Post

1998

I've been in Brazil for 10 days now 🇧🇷🇧🇷🇧🇷

I've been surprised by the gap between the massive number of people interested in AI (chatgpt adoption is crazy here) and the relatively low number of real AI builders - aka people and companies building their own AI models, datasets and apps.

Lots of effort is needed across the world for everyone to participate in, control and benefit from this foundational technology, starting with open-source & multi-lingual AI, more access to GPUs & AI builder training for all!

posted an update 4 months ago

Post

2038

I’ve published a new dataset to simplify model merging 🤗

This dataset facilitates the search for compatible architectures for model merging with @arcee_ai’s mergekit, streamlining the automation of high-performance merge searches 📖

Dataset : louisbrulenaudet/mergekit-configs

1 reply

·

reacted to m-ric's post with 🔥 4 months ago

Post

3186

𝗤𝘄𝗲𝗻𝟮.𝟱-𝗖𝗼𝗱𝗲𝗿-𝟯𝟮𝗕: 𝗻𝗲𝘄 𝗯𝗲𝘀𝘁-𝗶𝗻-𝗰𝗹𝗮𝘀𝘀 𝗼𝗽𝗲𝗻 𝗰𝗼𝗱𝗶𝗻𝗴 𝗺𝗼𝗱𝗲𝗹, 𝗯𝗲𝗮𝘁𝘀 𝗚𝗣𝗧-𝟰𝗼 𝗼𝗻 𝗺𝗼𝘀𝘁 𝗰𝗼𝗱𝗶𝗻𝗴 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀!💥

💪 It's the first time an open-source coding model of this size class clearly matches GPT-4o's coding capabilities!

✨ Completes the previous two Qwen 2.5 Coder releases with 4 new sizes: 0.5B, 3B, 14B, 32B
📚 Supports long context up to 128K (for the 14B and 32B models)
✅ Drop-in replacement for GPT-4o as a coding assistant on Cursor or for Artifacts!
🤗 Models available right now on the Hub, under Apache 2.0 license!

They have set up a crazy Artifacts demo, you should go have a look!
👉 Qwen/Qwen2.5-Coder-Artifacts

reacted to m-ric's post with 👀 4 months ago

Post

2388

A non-Instruct LLM assistant is mostly useless. 🧐

Since it's mostly a model trained to complete text, when you ask it a question like "What to do during a stopover in Paris?", it can just go on and on adding more details to your question instead of answering, which would be valid to complete text from its training corpus, but not to answer questions.

➡️ So the post-training stage includes an important instruction tuning step where you teach your model how to be useful: answer questions, be concise, be polite... RLHF is a well-known technique for this.

For people interested in understanding how this step works, the folks at Adaptive ML have made a great guide!

Read it here 👉 https://www.adaptive-ml.com/post/from-zero-to-ppo

reacted to prithivMLmods's post with 🤝 4 months ago

Post

5921

New Style, New Mix, New Drop 🧤

🧨Flux LoRA DLC: prithivMLmods/FLUX-LoRA-DLC

🎆Glowing-Body: prithivMLmods/Glowing-Body-Flux-LoRA
🎆Electric-Blue: prithivMLmods/Electric-Blue-Flux-LoRA
🎆Intense-Red: prithivMLmods/Intense-Red-Flux-LoRA
🎆Clouds-Illusion: prithivMLmods/Clouds-Illusion-Flux-LoRA
🎆Digital-Yellow: prithivMLmods/Digital-Yellow-Flux-LoRA

🧨Flux LoRA Collection: prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

.
.
.
@prithivMLmods
