nicholasKluge committed

Commit f8594f5 • Parent: 93ba795

Update README.md

README.md CHANGED
@@ -37,23 +37,23 @@ inference:
     length_penalty: 0.3
     early_stopping: true
 co2_eq_emissions:
-  emissions:
+  emissions: 2.53
   source: CodeCarbon
   training_type: fine-tuning
   geographical_location: United States of America
   hardware_used: NVIDIA A100-SXM4-40GB
 ---
-# TeenyTinyLlama-
+# TeenyTinyLlama-460m-Chat
 
 TeenyTinyLlama is a pair of small foundational models trained in Brazilian Portuguese.
 
-This repository contains a version of [TeenyTinyLlama-
+This repository contains a version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) (`TeenyTinyLlama-460m-Chat`) fine-tuned on the [Instruct-Aira Dataset version 2.0](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset-v2).
 
 ## Details
 
 - **Number of Epochs:** 3
 - **Batch size:** 4
-- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e3, learning_rate =
+- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e3, learning_rate = 1e-5, epsilon = 1e-8)
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
 - **Carbon emissions** stats are logged in this [file](emissions.csv).
 
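A minimal sketch of the optimizer setup reported in the Details list above (AdamW with warmup_steps = 1e3, learning_rate = 1e-5, epsilon = 1e-8); the linear warmup-and-decay schedule and the total step count are assumptions, not values stated in the card:

```python
import torch
from transformers import AutoModelForCausalLM, get_linear_schedule_with_warmup

# Base checkpoint being fine-tuned into the chat model.
model = AutoModelForCausalLM.from_pretrained("nicholasKluge/TeenyTinyLlama-460m")

# Hyperparameters reported in the card.
learning_rate = 1e-5
epsilon = 1e-8
warmup_steps = 1_000
# Assumption: total_steps depends on dataset size, batch size (4), and epochs (3).
total_steps = 10_000

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=learning_rate,
    eps=epsilon,
)

# Assumption: a linear schedule; the card only reports the warmup step count.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_steps,
    num_training_steps=total_steps,
)
```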
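The `co2_eq_emissions` metadata and the linked `emissions.csv` indicate the run was tracked with CodeCarbon. A sketch of that kind of tracking with the `codecarbon` package, using a placeholder training loop:

```python
from codecarbon import EmissionsTracker

def train():
    ...  # placeholder for the actual fine-tuning loop

# By default the tracker writes an emissions.csv in the working directory,
# similar to the file linked from the card.
tracker = EmissionsTracker()
tracker.start()
try:
    train()
finally:
    emissions = tracker.stop()  # CodeCarbon reports CO2-equivalent emissions in kg

print(f"Estimated emissions: {emissions} kg CO2eq")
```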
@@ -71,8 +71,8 @@ import torch
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
-tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-
-model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-
+tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
+model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
 
 model.eval()
 model.to(device)
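The hunk above only covers loading the checkpoint and moving it to the device. A minimal generation sketch under that setup; the use of the tokenizer's chat template, the Portuguese prompt, and the sampling settings are assumptions rather than values taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat')
model = AutoModelForCausalLM.from_pretrained('nicholasKluge/TeenyTinyLlama-460m-Chat').to(device)
model.eval()

# Assumption: the chat checkpoint ships a chat template; otherwise tokenize the
# question directly with tokenizer(question, return_tensors="pt").
messages = [{"role": "user", "content": "Qual é a capital do Brasil?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.3,       # assumption: illustrative sampling settings
        repetition_penalty=1.2,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```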
@@ -133,7 +133,7 @@ The model will output something like:
 
 @misc{nicholas22llama,
   doi = {10.5281/zenodo.6989727},
-  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-
+  url = {https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m},
   author = {Nicholas Kluge Corrêa},
   title = {TeenyTinyLlama},
   year = {2023},
@@ -149,4 +149,4 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
 
 ## License
 
-TeenyTinyLlama-
+TeenyTinyLlama-460m-Chat is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.