165 118 347

Alvaro Bartolome

alvarobartt

https://alvarobartt.me

AI & ML interests

☁️ cloud machine learning @huggingface and open source passionate

Recent Activity

upvoted a paper 5 days ago

Spectrum: Targeted Training on Signal to Noise Ratio

liked a model 6 days ago

Qwen/Qwen-VL-Chat

updated a model 7 days ago

merve/paligemma_vqav2

View all activity

Articles

💨 Introducing Notus: a DPO fine-tune of Zephyr with a focus on high-quality data

Dec 1, 2023

• 1

🤗 LLM suggestions in Argilla with HuggingFace Inference Endpoints

Sep 20, 2023

• 1

Organizations

alvarobartt's activity

reacted to AdinaY's post with 🚀🔥 26 days ago

Post

1586

🌊 The wave of reasoning models from the Chinese community has arrived!

🚀 Marco-o1 by AIDC, Alibaba
👉 AIDC-AI/Marco-o1

✨ QwQ by Qwen, Alibaba
👉 Qwen/qwq-674762b79b75eac01735070a

🌟 Skywork-o1 by Kunlun Tech
👉 Skywork/skywork-o1-open-67453df58e12f6c3934738d0

🔥 Xkev/Llama-3.2V-11B-cot by PKU Yuan group
👉 Xkev/Llama-3.2V-11B-cot

💡 DeepSeek-R1-Lite-Preview by DeepSeek AI
👉 https://chat.deepseek.com/

🔍 InternThinker Preview by Shanghai AI Lab
👉 https://sso.openxlab.org.cn/login?redirect=https://internlm-chat.intern-ai.org.cn/&clientId=ebmrvod6yo0nlzaek1yp

📘 k0-math by Moonshot AI
🚀 https://kimi.moonshot.cn/ ( coming soon! )

Who's next? 👀
zh-ai-community/reasoning-models-67409fb3aa1ed78f10087cd7

reacted to andito's post with 🔥👀 27 days ago

Post

1784

SmolVLM speeding locally on a laptop thanks to mlx-vlm and
@Gradio ! Try it with two lines:
pip install git+https://github.com/andimarafioti/mlx-vlm.git@stream-generate-fix
python -m mlx_vlm.chat_ui --model mlx-community/SmolVLM-Instruct-8bit

Gotta love the MLX community! Big thanks to @pcuenq and @prince_canuma !

reacted to danielhanchen's post with 🔥 about 1 month ago

Post

1347

Vision finetuning is in 🦥Unsloth! You can now finetune Llama 3.2, Qwen2 VL, Pixtral and all Llava variants up to 2x faster and with up to 70% less VRAM usage! Colab to finetune Llama 3.2: https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlk?usp=sharing

1 reply

reacted to clem's post with 🔥 2 months ago

Post

4436

This is no Woodstock AI but will be fun nonetheless haha. I’ll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.

1,000 spots available first-come first serve with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM

4 replies

posted an update 4 months ago

Post

2848

🤗 Serving Meta Llama 3.1 405B on Google Cloud is now possible via the Hugging Face Deep Learning Containers (DLCs) for Text Generation Inference (TGI)

In this post, we showcase how to deploy https://huggingface.co./meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 on an A3 instance with 8 x H100 GPUs on Vertex AI

Thanks to the Hugging Face DLCs for TGI and Google Cloud Vertex AI, deploying a high-performance text generation container for serving Large Language Models (LLMs) has never been easier. And we’re not going to stop here – stay tuned as we enable more experiences to build AI with open models on Google Cloud!

Read the full post at https://huggingface.co./blog/llama31-on-vertex-ai

reacted to merve's post with 🔥 5 months ago

Post

4973

Idefics3-Llama is out! 💥💥
Model: HuggingFaceM4/Idefics3-8B-Llama3
Demo: HuggingFaceM4/idefics3

It's a multimodal model based on Llama 3.1 that accepts an arbitrary number of interleaved images with text with a huge context window (10k tokens!) ✨

Supported by Hugging Face transformers 🤗

2 replies

reacted to dvilasuero's post with 🤗🚀🔥 7 months ago

Post

8073

Today is a huge day in Argilla’s history. We couldn’t be more excited to share this with the community: we’re joining Hugging Face!

We’re embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.

Over the past year, we’ve been collaborating with Hugging Face on countless projects: launching partner of Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr’s learnings, the Data is Better Together initiative with hundreds of community contributors, or releasing argilla/OpenHermesPreferences, one of the largest open preference tuning datasets

After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we’re now the same team.

To those of you who’ve been following us, this won’t be a huge surprise, but it will be a big deal in the coming months. This acquisition means we’ll double down on empowering the community to build and collaborate on high quality datasets, we’ll bring full support for multimodal datasets, and we’ll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.

As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.

Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.

Would love to answer any questions you have so feel free to add them below!

28 replies

reacted to tomaarsen's post with 🔥❤️ 7 months ago

Post

1984

‼️Sentence Transformers v3.0 is out! You can now train and finetune embedding models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also release 50+ datasets to train on.

1️⃣ Training Refactor
Embedding models can now be trained using an extensive trainer with a lot of powerful features:
- MultiGPU Training (Data Parallelism (DP) and Distributed Data Parallelism (DDP))
- bf16 training support; loss logging
- Evaluation datasets + evaluation loss
- Improved callback support + an excellent Weights & Biases integration
- Gradient checkpointing, gradient accumulation
- Improved model card generation
- Resuming from a training checkpoint without performance loss
- Hyperparameter Optimization
and much more!
Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co./blog/train-sentence-transformers

2️⃣ Similarity Score
Not sure how to compare embeddings? Don't worry, you can now use model.similarity(embeddings1, embeddings2) and you'll get your similarity scores immediately. Model authors can specify their desired similarity score, so you don't have to worry about it anymore!

3️⃣ Additional Kwargs
Sentence Transformers relies on various Transformers instances (AutoModel, AutoTokenizer, AutoConfig), but it was hard to provide valuable keyword arguments to these (like 'torch_dtype=torch.bfloat16' to load a model a lower precision for 2x inference speedup). This is now easy!

4️⃣ Hyperparameter Optimization
Sentence Transformers now ships with HPO, allowing you to effectively choose your hyperparameters for your data and task.

5️⃣ Dataset Release
To help you out with finetuning models, I've released 50+ ready-to-go datasets that can be used with training or finetuning embedding models: sentence-transformers/embedding-model-datasets-6644d7a3673a511914aa7552

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.0.0

reacted to merve's post with 🚀 7 months ago

Post

1758

New open Vision Language Model by @Google : PaliGemma 💙🤍

📝 Comes in 3B, pretrained, mix and fine-tuned models in 224, 448 and 896 resolution
🧩 Combination of Gemma 2B LLM and SigLIP image encoder
🤗 Supported in transformers

PaliGemma can do..
🧩 Image segmentation and detection! 🤯
📑 Detailed document understanding and reasoning
🙋 Visual question answering, captioning and any other VLM task!

Read our blog 🔖 hf.co/blog/paligemma
Try the demo 🪀 hf.co/spaces/google/paligemma
Check out the Spaces and the models all in the collection 📚 google/paligemma-release-6643a9ffbf57de2ae0448dda
Collection of fine-tuned PaliGemma models google/paligemma-ft-models-6643b03efb769dad650d2dda

13 replies

posted an update 8 months ago

Post

3098

🔥 Prometheus 2 was recently released by Kaist AI as an alternative and closely mirroring both human and GPT-4 evaluation, and surpassing the former Prometheus!

prometheus-eval/prometheus-7b-v2.0
prometheus-eval/prometheus-8x7b-v2.0

🌬️Fine-tuned on top of mistralai/Mistral-7B-Instruct-v0.2 and mistralai/Mixtral-8x7B-Instruct-v0.1
🗂️The datasets used for fine-tuning have been publicly released i.e. prometheus-eval/Feedback-Collection and prometheus-eval/Preference-Collection
🤝🏻Unified LM evaluator for absolute (a single prompt-completion pair) and relative (two completions for a given prompt) due to model merging
❌No longer needs a mandatory reference / golden answer, but can still be provided optionally
🔝Surpasses the former version of Prometheus, and has a high correlation with human, GPT-4, and Claude 3 Opus scores when evaluating LMs
📝Apache 2.0 license

Long-story short, an amazing job from Kaist AI bridging the gap with LLM evaluators other than proprietary and bigger models!

This week at Argilla, we decided to add a new task to use Prometheus 2 as an LLM evaluator using distilabel, so we implemented PrometheusEval.

😱 Using PrometheusEval running their 7B variant with vLLM in a single L40 on top of HuggingFaceH4/instruction-dataset, we got the 327 existing prompt-completion pairs evaluated and pushed to the Hub in less than 2 minutes!

Find the generated dataset and the code at distilabel-internal-testing/instruction-dataset-prometheus

1 reply

reacted to davanstrien's post with 🤗 8 months ago

Post

1671

As part of the Data is Better Together MPEP project, we are now at the point where some translation efforts have successfully translated 500 highly ranked prompts into a new target language (amazing work from @Rijgersberg et al!)

Our next step is to use these translated prompts to evaluate the performance of LLMs for non English languages.

Does LLM, as a judge, work outside of English?

Ideally, it would be compelling to leverage LLMs to judge models for non-English since this significantly lowers the barrier to evaluating models (although it doesn't remove this barrier altogether).

What we want to know is:
- does auto/LLM eval work in general for a particular language
- which model(s) works best as a judge
- do LLMs' judgments of non-English models match human preferences?

We're starting to think about how to approach this. If you have any ideas of possible approaches feel free to comment or join the discussion here: https://github.com/huggingface/data-is-better-together/issues/61

Other ideas...

Could an approach like Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models (2404.18796) with the SOA models for a particular language work? i.e., choose 4 of the best open LLMs for Arabic and use those at the pool of raters rather than relying on one powerful judge LLM?

reacted to davanstrien's post with 🤗 8 months ago

Post

1538

Can you create domain-specific synthetic datasets in under 20 minutes?

@burtenshaw recently launched the Domain Specific Dataset Project as part of Data is Better Together. As part of this, Ben created a Space that you can use to define some key perspectives and concepts from a domain. This seed dataset can then be used to generate a synthetic dataset for a particular domain.

In less than 30 minutes this afternoon, I created a domain-specific dataset focused on data-centric machine learning using these tools: davanstrien/data-centric-ml-sft.

You can create your own domain specific datasets using this approach. Find the steps to follow here: https://github.com/huggingface/data-is-better-together/blob/main/domain-specific-datasets/README.md

1 reply

reacted to tomaarsen's post with 🔥 8 months ago

Post

3172

🚀 Sentence Transformers v2.7.0 is out! Featuring a new loss function, easier Matryoshka model inference & evaluation, CrossEncoder improvements & Intel Gaudi2 Accelerator support. Details:

1️⃣ A new loss function: CachedGISTEmbedLoss
This loss function is a combination of CachedMultipleNegativesRankingLoss and the GISTEmbedLoss, both of which are already excellent. The caching mechanism allows for much higher batch sizes with constant memory usage, which boosts training performance. The GIST part introduces a guide model to guide the in-batch negative sample selection. This prevents false negatives, resulting in a stronger training signal.

2️⃣ Automatic Matryoshka model truncation
Matryoshka models produce embeddings that are still useful after truncation. However, this truncation always had to be done manually, until now! We've added a truncate_dim option to the Sentence Transformer constructor. This also allows truncation when using HuggingFaceEmbeddings from LlamaIndex or LangChain.

3️⃣ Additionally, you can now specify truncate_dim in evaluators to get the performance after truncation. (Hint: it's surprisingly good, even for models not trained with MatryoshkaLoss, and it can speed up e.g. clustering, retrieval, etc.)

4️⃣ CrossEncoder improvements
The CrossEncoder now supports 'push_to_hub' to upload trained reranker models to Hugging Face. Additionally, CrossEncoders now support trust_remote_code to load models with custom modelling code.

5️⃣ Inference on Intel Gaudi2
If you have an Intel Gaudi2 Accelerator, Sentence Transformers now uses it automatically for even faster inference. No changes are necessary to your code, the device is automatically detected!

Check out the release notes for all of the details: https://github.com/UKPLab/sentence-transformers/releases/tag/v2.7.0

I'm very excited for the upcoming releases: I'm making great progress with a notable v3 refactor that should heavily improve the training process for embedding models!

2 replies

reacted to sayakpaul's post with 🤗🔥 8 months ago

Post

2901

We're introducing experimental support for device_map in Diffusers 🤗

If you have multiple GPUs you want to use to distribute the pipeline models, you can do so. Additionally, this becomes more useful when you have multiple low-VRAM GPUs.

Documentation:
https://huggingface.co./docs/diffusers/main/en/training/distributed_inference#device-placement

🚨 Currently, only "balanced" device mapping strategy is supported.

Alvaro Bartolome

AI & ML interests

Recent Activity

Articles

🤗 Serve any model with Inference Endpoints + Custom Handlers

Introducing HUGS - Scale your AI with Open Models

Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

🧑‍⚖️ "Replacing Judges with Juries" using distilabel

Deploying 🤗 Hub models in Vertex AI

🏷️ Build AI Feedback (AIF) datasets for LLM alignment with ⚗️ distilabel

💨 Introducing Notus: a DPO fine-tune of Zephyr with a focus on high-quality data

🤗 LLM suggestions in Argilla with HuggingFace Inference Endpoints

Organizations

alvarobartt's activity