Open Catalogue of European Datasets
Organization Card
Follow the guidelines below to push your datasets to the Open Catalogue of European Datasets.
1️⃣ Learn how to push a dataset to the Hub 👉 https://github.com/bigscience-workshop/data-preparation/tree/main/sourcing/Gathering%20Identified%20Datasets%20and%20Collections
2️⃣ Use the following tools & code to pre-process your datasets 👉 https://github.com/bigscience-workshop/data-preparation/tree/main/preprocessing/training/01a_catalogue_cleaning_and_filtering
3️⃣ Push your processed dataset here using the following naming convention: european-catalogue-data-processed-language-source
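
The naming convention in step 3 can be sketched as a small helper. This is a minimal illustration, not the catalogue's official tooling. It assumes that "language" and "source" in the convention are placeholders to fill in, and that separators inside a component should become underscores so the hyphen-delimited fields stay unambiguous. The names `repo_name`, `push_processed`, and the `org` argument are hypothetical. The push step assumes `dataset` is a `datasets.Dataset` (or any object exposing `push_to_hub(repo_id)`) and that you are already authenticated with the Hub.

```python
import re

# Fixed prefix taken from the naming convention in step 3.
PREFIX = "european-catalogue-data-processed"


def _slug(part: str) -> str:
    """Lowercase a name component and replace non-alphanumeric runs with '_'.

    Assumption: hyphens are reserved for separating the fields of the
    repository name, so separators inside a component become underscores.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", part.strip().lower()).strip("_")
    if not slug:
        raise ValueError(f"empty name component: {part!r}")
    return slug


def repo_name(language: str, source: str) -> str:
    """Build a repository name following the convention in step 3."""
    return f"{PREFIX}-{_slug(language)}-{_slug(source)}"


def push_processed(dataset, org: str, language: str, source: str) -> str:
    """Push a processed dataset under the organization and return its repo id.

    `org` is a hypothetical placeholder for the organization's Hub namespace.
    """
    repo_id = f"{org}/{repo_name(language, source)}"
    dataset.push_to_hub(repo_id)  # exists on datasets.Dataset
    return repo_id
```

For example, `repo_name("fr", "Wikipedia")` yields a name with `fr` and `wikipedia` as the last two fields. Keeping the name builder separate from the upload makes it possible to check names before anything is pushed.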