nicholasKluge committed

Commit f8594f5 • Parent: 93ba795

Update README.md

README.md CHANGED
@@ -37,23 +37,23 @@ inference:
     length_penalty: 0.3
     early_stopping: true
 co2_eq_emissions:
-  emissions:
+  emissions: 2.53
   source: CodeCarbon
   training_type: fine-tuning
   geographical_location: United States of America
   hardware_used: NVIDIA A100-SXM4-40GB
 ---
-# TeenyTinyLlama-
+# TeenyTinyLlama-460m-Chat
 
 TeenyTinyLlama is a pair of small foundational models trained in Brazilian Portuguese.
 
-This repository contains a version of [TeenyTinyLlama-
+This repository contains a version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) (`TeenyTinyLlama-460m-Chat`) fine-tuned on the [Instruct-Aira Dataset version 2.0](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset-v2).
 
 ## Details
 
 - **Number of Epochs:** 3
 - **Batch size:** 4
-- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e3, learning_rate =
+- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e3, learning_rate = 1e-5, epsilon = 1e-8)
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
 - **Carbon emissions** stats are logged in this [file](emissions.csv).
 
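A minimal sketch of the optimizer setup reported in the Details list above (AdamW with warmup_steps = 1e3, learning_rate = 1e-5, epsilon = 1e-8); the linear warmup-and-decay schedule and the total step count are assumptions, not values stated in the card:

```python
import torch
from transformers import AutoModelForCausalLM, get_linear_schedule_with_warmup

# Base checkpoint being fine-tuned into the chat model.
model = AutoModelForCausalLM.from_pretrained("nicholasKluge/TeenyTinyLlama-460m")

# Hyperparameters reported in the card.
learning_rate = 1e-5
epsilon = 1e-8
warmup_steps = 1_000
# Assumption: total_steps depends on dataset size, batch size (4), and epochs (3).
total_steps = 10_000

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=learning_rate,
    eps=epsilon,
)

# Assumption: a linear schedule; the card only reports the warmup step count.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_steps,
    num_training_steps=total_steps,
)
```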
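The `co2_eq_emissions` metadata and the linked `emissions.csv` indicate the run was tracked with CodeCarbon. A sketch of that kind of tracking with the `codecarbon` package, using a placeholder training loop:

```python
from codecarbon import EmissionsTracker

def train():
    ...  # placeholder for the actual fine-tuning loop

# By default the tracker writes an emissions.csv in the working directory,
# similar to the file linked from the card.
tracker = EmissionsTracker()
tracker.start()
try:
    train()
finally:
    emissions = tracker.stop()  # CodeCarbon reports CO2-equivalent emissions in kg

print(f"Estimated emissions: {emissions} kg CO2eq")
```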
@@ -71,8 +71,8 @@ import torch
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
-tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-
-model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-
+tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
+model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
 
 model.eval()
 model.to(device)
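The hunk above only covers loading the checkpoint and moving it to the device. A minimal generation sketch under that setup; the use of the tokenizer's chat template, the Portuguese prompt, and the sampling settings are assumptions rather than values taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat').to(device)
model.eval()

# Assumption: the chat checkpoint ships a chat template; otherwise tokenize the
# question directly with tokenizer(question, return_tensors="pt").
messages = [{"role": "user", "content": "Qual é a capital do Brasil?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.3,       # assumption: illustrative sampling settings
        repetition_penalty=1.2,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```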
@@ -133,7 +133,7 @@ The model will output something like:
 
 @misc{nicholas22llama,
   doi = {10.5281/zenodo.6989727},
-  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-
+  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m},
   author = {Nicholas Kluge Corrêa},
   title = {TeenyTinyLlama},
   year = {2023},
@@ -149,4 +149,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 
 ## License
 
-TeenyTinyLlama-
+TeenyTinyLlama-460m-Chat is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.