---
language:
- pt
metrics:
- perplexity
pipeline_tag: text-generation
---

# Model Card for monilouise/opt125M_portuguese

A Portuguese language model fine-tuned from [facebook/opt-125m](https://huggingface.co/facebook/opt-125m).

## Model Details

### Model Description

- **Developed by:** Monique Monteiro
- **Shared by:** Monique Monteiro
- **Model type:** OPT
- **Language(s) (NLP):** Portuguese
- **License:** [More Information Needed]
- **Finetuned from model:** facebook/opt-125m

Use the code below to get started with the model.

```python
from transformers import pipeline

# Load the model and sample a continuation of "Era uma vez" ("Once upon a time")
generator = pipeline('text-generation', model='monilouise/opt125M_portuguese')
output = generator("Era uma vez", max_length=50, do_sample=True)
```

## Training Details

### Training Data

The model was trained on `gs://unicamp-dl/ia025a_2022s1/aula9/sample-1gb.txt`.

### Training Procedure

The model was trained for 3 epochs with a learning rate of 5e-5 and a linear scheduler.

#### Preprocessing

All text was tokenized and split into chunks of 1024 tokens.

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision

#### Speeds, Sizes, Times

Training time: 17 hours

## Evaluation

The model was evaluated on a 5% validation split.

#### Metrics

Perplexity = 7.94 on the validation split.

## Model Card Authors

moniquelouise@gmail.com

## Model Card Contact

moniquelouise@gmail.com
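
## Appendix: Preprocessing and Perplexity Sketch

The card states that text was broken into chunks of 1024 tokens and that the model reached a perplexity of 7.94. The following is a minimal sketch of how those two steps are commonly done, not the author's actual training code. The helper names `chunk_token_ids` and `perplexity_from_loss` are hypothetical, dropping the trailing partial chunk is an assumption, and perplexity is assumed to be the exponential of the mean per-token cross-entropy (natural log), which is the usual definition for causal language models.

```python
import math
from typing import List


def chunk_token_ids(token_ids: List[int], block_size: int = 1024) -> List[List[int]]:
    """Split a flat list of token ids into consecutive blocks of block_size.

    Assumption: a trailing remainder shorter than block_size is dropped.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Toy usage with made-up token ids (not real tokenizer output)
toy_ids = list(range(2500))
blocks = chunk_token_ids(toy_ids)
print(len(blocks), len(blocks[0]))  # 2 blocks of 1024 tokens; 452 ids dropped

# A mean validation loss of ln(7.94) corresponds to the reported perplexity
print(round(perplexity_from_loss(math.log(7.94)), 2))
```

Under this definition, the reported perplexity of 7.94 would correspond to a mean validation loss of roughly 2.07 nats per token.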