Spanish GPT-2 trained on large_spanish_corpus
This is a Spanish GPT-2 model trained from scratch with Flax on the large_spanish_corpus, also known as BETO's corpus. It was developed as part of the Flax/JAX Community Week, organised by HuggingFace, with TPU usage sponsored by Google.
Dataset
The dataset is about 20 GB. 95% of the data was used for training and the remaining 5% for validation.
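A 95/5 split like the one described above can be sketched in plain Python. This is only an illustration, not the team's actual preprocessing script: the helper name `split_indices`, the seed, and the example count are all assumptions made for the example.

```python
import random


def split_indices(n_examples, val_fraction=0.05, seed=0):
    """Shuffle example indices and split them into train and validation sets.

    Hypothetical helper: the source does not say how the split was made.
    """
    indices = list(range(n_examples))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_examples * val_fraction))
    # The first n_val shuffled indices become validation, the rest training.
    return indices[n_val:], indices[:n_val]


# Toy example count; the real corpus is far larger.
train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))
```

Splitting by index rather than copying the text keeps memory use low, which matters for a corpus of this size.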
Metrics (on evaluation dataset)
- Loss: 2.413
- Perplexity: 11.36
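Perplexity is conventionally the exponential of the mean cross-entropy loss. Under that relation a loss of 2.413 gives roughly 11.17 rather than 11.36. One possible explanation, not confirmed by the source, is that perplexity was averaged per batch instead of being derived from the overall mean loss. The sketch below compares the two conventions on made-up batch losses; the function names and the values in `batch_losses` are hypothetical.

```python
import math


def perplexity_from_mean_loss(losses):
    """exp(mean loss): the usual definition of perplexity."""
    return math.exp(sum(losses) / len(losses))


def mean_of_batch_perplexities(losses):
    """Mean of exp(loss) per batch: never smaller than exp(mean loss)."""
    return sum(math.exp(loss) for loss in losses) / len(losses)


# Hypothetical per-batch evaluation losses, for illustration only.
batch_losses = [2.1, 2.4, 2.7]

ppl_from_mean = perplexity_from_mean_loss(batch_losses)
ppl_batch_avg = mean_of_batch_perplexities(batch_losses)

# Perplexity implied by the reported loss under the usual definition.
ppl_reported_loss = math.exp(2.413)

print(f"exp(mean loss):            {ppl_from_mean:.2f}")
print(f"mean of batch perplexity:  {ppl_batch_avg:.2f}")
print(f"exp(2.413):                {ppl_reported_loss:.2f}")
```

Because the exponential is convex, the batch-averaged figure is always at least as large as the one derived from the mean loss, so the two conventions are not interchangeable when comparing models.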
Team members
- Manuel Romero (mrm8488)
- María Grandury (mariagrandury)
- Pablo González de Prado (Pablogps)
- Daniel Vera (daveni)
- Sri Lakshmi (srisweet)
- José Posada (jdposa)
- Santiago Hincapie (shpotes)
- Jorge (jorgealro)
Useful links