Natalia Elvira

nataliaElv

AI & ML interests

Data curation, high-quality data, multilinguality, NLP & computational linguistics

Organizations

Hugging Face, SomosNLP, Argilla, Blog-explorers, Argilla Explorers, Data Is Better Together, HuggingFaceFW-Dev, Hugging Face Discord Community, argilla-internal-testing, Argilla Warehouse, Dataset Tools, Coordination Nationale pour l'IA, Data Is Better Together Contributor, Bluesky Community

nataliaElv's activity

posted an update 8 days ago
If you are still wondering how the FineWeb2 annotations are done, how to follow the guidelines or how Argilla works, this is your video!

I go through a few samples of the FineWeb2 dataset and classify them based on their educational content. Check it out!

https://www.youtube.com/watch?v=_-ORB4WAVGU
posted an update 14 days ago
How do your annotations for FineWeb2 compare to your teammates'?

I started contributing some annotations to the FineWeb2 collaborative annotation sprint and I wanted to know if my labelling trends were similar to those of my teammates.

I did some analysis and I wasn't surprised to see that I'm being a bit harsher in my evaluations than my teammates 😂


Do you want to see how your annotations compare to others?
👉 Go to this Gradio space: nataliaElv/fineweb2_compare_my_annotations
✍️ Enter the dataset that you've contributed to and your Hugging Face username.

How were your results?
- Contribute some annotations: data-is-better-together/fineweb-c
- Join your language channel in Rocket.Chat: HuggingFaceFW/discussion
posted an update 22 days ago
We're so close to reaching 100 languages! Can you help us cover the remaining 200? Check if we're still looking for language leads for your language: nataliaElv/language-leads-dashboard
posted an update 28 days ago
Would you like to get a high-quality dataset to pre-train LLMs in your language? 🌏

At Hugging Face we're preparing a collaborative annotation effort to build an open-source multilingual dataset as part of the Data is Better Together initiative.

Follow the link below, check if your language is listed and sign up to be a Language Lead!

https://forms.gle/s9nGajBh6Pb9G72J6

Language tags

#1 opened 29 days ago by nataliaElv
New activity in nataliaElv/argilla-progress 29 days ago

Update app.py

#1 opened 30 days ago by davidberenstein1957
posted an update about 1 month ago
You can now add your Bluesky handle to your Hugging Face profile! 🦋
Have you noticed?
New activity in huggingface-course/documentation-images about 1 month ago

More Argilla screenshots

#4 opened about 1 month ago by nataliaElv

argilla-chapter-images

#3 opened about 1 month ago by nataliaElv