OPT-125M fine-tuned on Portuguese

This model is OPT-125M fine-tuned on a reduced subset of the Portuguese portion of mC4, containing approximately 300M tokens.

Hyper-parameters
  • learning_rate = 5e-5
  • batch_size = 32
  • warmup = 500
  • seq_length = 512
  • num_train_epochs = 2.0
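As a rough sanity check on these settings, the sketch below estimates how many optimizer steps the run involves. It is not taken from the original training script. It assumes that the corpus is exactly 300M tokens, that every sequence is packed to the full 512 tokens, that `batch_size` is the effective batch per optimizer step, and that `warmup` is measured in steps.

```python
import math

# Values taken from the hyper-parameter list above.
batch_size = 32
seq_length = 512
num_train_epochs = 2
warmup_steps = 500  # assumption: "warmup" is counted in optimizer steps

# Assumption: "approximately 300M tokens" treated as exactly 300M.
corpus_tokens = 300_000_000

# Assumption: every sequence is packed to the full seq_length.
tokens_per_step = batch_size * seq_length
steps_per_epoch = math.ceil(corpus_tokens / tokens_per_step)
total_steps = steps_per_epoch * num_train_epochs
warmup_fraction = warmup_steps / total_steps

print(f"tokens per step: {tokens_per_step}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
print(f"warmup fraction: {warmup_fraction:.2%}")
```

Under these assumptions the warmup covers only a small share of the run (between 1% and 2% of all steps), so the learning rate sits at or near its peak of 5e-5 for most of training.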

On an A100 with 40GB of RAM, training took around 3 hours.

Perplexity: 9.4
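To put this number in context, perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below assumes that standard definition with natural logarithms; the card does not state which evaluation set or script produced the figure, and the `perplexity` helper and the toy loss values are purely illustrative.

```python
import math

def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

reported_ppl = 9.4

# Mean cross-entropy loss implied by the reported perplexity.
implied_loss = math.log(reported_ppl)

# Toy per-token losses whose mean equals the implied loss (illustrative only).
toy_nlls = [implied_loss - 0.5, implied_loss, implied_loss + 0.5]

print(f"implied mean loss: {implied_loss:.2f} nats/token")
print(f"perplexity of toy losses: {perplexity(toy_nlls):.1f}")
```

A perplexity of 9.4 therefore corresponds to a mean loss of roughly 2.24 nats per token.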

Sample Use

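from transformers import pipeline

# Load the fine-tuned checkpoint; do_sample=True means each call returns different text.
generator = pipeline('text-generation', model='Mirelle/opt-125M-pt-br-finetuned', max_length=100, do_sample=True)
# The pipeline returns a list of dicts; print the generated continuation of the prompt.
print(generator("Em uma bela manhã de")[0]["generated_text"])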