
OPT-125M fine-tuned for Portuguese

This model was obtained by fine-tuning OPT-125M on a reduced corpus of the Portuguese portion of mC4, containing approximately 300M tokens.
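
The exact reduced corpus is not published. As a minimal sketch, the Portuguese split of mC4 can be streamed with the datasets library roughly as follows; the dataset id and field names are assumptions about the setup, not the author's actual script, and availability may depend on your datasets version.

from datasets import load_dataset

# Stream the Portuguese split of mC4; a reduced ~300M-token corpus
# would then be sampled from this stream.
dataset = load_dataset('mc4', 'pt', split='train', streaming=True)
sample = next(iter(dataset))
print(sample['text'][:200])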

Hyper-parameters
  • learning_rate = 5e-5
  • batch_size = 32
  • warmup = 500
  • seq_length = 512
  • num_train_epochs = 2.0
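
The original training script is not provided. A rough sketch of how these hyper-parameters would map onto transformers' TrainingArguments is shown below; output_dir and the tokenization step are illustrative assumptions.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='opt-125M-pt-br-finetuned',  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    warmup_steps=500,
    num_train_epochs=2.0,
)
# seq_length = 512 would be applied at tokenization time, e.g.:
# tokenizer(examples['text'], truncation=True, max_length=512)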

On a single A100 GPU with 40 GB of memory, training took around 3 hours.

Perplexity: 9.4
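
The evaluation corpus behind this figure is not stated. A minimal sketch of how perplexity can be computed for a causal LM with transformers, with the held-out text left as a placeholder:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('Mirelle/opt-125M-pt-br-finetuned')
model = AutoModelForCausalLM.from_pretrained('Mirelle/opt-125M-pt-br-finetuned')
model.eval()

text = "..."  # placeholder: held-out Portuguese text
inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=512)
with torch.no_grad():
    # The model shifts labels internally, so the loss is the average
    # next-token cross-entropy; exp(loss) gives perplexity.
    loss = model(**inputs, labels=inputs['input_ids']).loss
print(torch.exp(loss).item())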

Sample Use

from transformers import pipeline

# Load the fine-tuned model as a text-generation pipeline
generator = pipeline('text-generation', model='Mirelle/opt-125M-pt-br-finetuned', max_length=100, do_sample=True)

# Generate a continuation for a Portuguese prompt ("On a beautiful morning of")
print(generator("Em uma bela manhã de"))
