milmor committed on
Commit
72c9d8f
1 Parent(s): c94d119

Update README.md

  1. README.md +31 -3
README.md CHANGED
@@ -6,12 +6,40 @@ tags:
  ---

  # t5-small-spanish-nahuatl
-
  ## Model description
- This model is a T5 Transformer ([t5-small](https://huggingface.co/t5-small)) that was fine-tuned in spanish and nahuatl.


  ## Evaluation results
  - Validation loss: 1.56

- > Created by [Emilio Morales](https://huggingface.co/milmor)
 
  ---

  # t5-small-spanish-nahuatl

  ## Model description
+ This model is a T5 Transformer ([t5-small](https://huggingface.co/t5-small)) fine-tuned on 29,007 Spanish and Nahuatl sentences: 12,890 samples collected from the web and 16,117 samples from the Axolotl dataset.
+
+
+ ## Usage
+ ```python
+ from transformers import AutoModelForSeq2SeqLM
+ from transformers import AutoTokenizer
+
+ model = AutoModelForSeq2SeqLM.from_pretrained('hackathon-pln-es/t5-small-spanish-nahuatl')
+ tokenizer = AutoTokenizer.from_pretrained('hackathon-pln-es/t5-small-spanish-nahuatl')
+
+ model.eval()
+ sentence = 'muchas flores son blancas'
+ # Prepend the task prefix before tokenizing.
+ input_ids = tokenizer('translate Spanish to Nahuatl: ' + sentence, return_tensors='pt').input_ids
+ outputs = model.generate(input_ids)
+ # Decode the generated token ids back to text.
+ outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
+ # outputs = miak xochitl istak
+ ```
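
The usage example translates a single sentence. For several sentences at once, a minimal sketch might look like the following. The helper names `add_task_prefix` and `translate_batch` and the `max_new_tokens=64` limit are illustrative assumptions and not part of the model card; `model` and `tokenizer` are the objects loaded above.

```python
# Task prefix copied from the usage example in the README.
TASK_PREFIX = 'translate Spanish to Nahuatl: '


def add_task_prefix(sentences):
    """Prepend the task prefix to every Spanish sentence (hypothetical helper)."""
    return [TASK_PREFIX + sentence.strip() for sentence in sentences]


def translate_batch(sentences, model, tokenizer, max_new_tokens=64):
    """Translate a list of Spanish sentences with one generate() call.

    `model` and `tokenizer` are the objects returned by from_pretrained().
    max_new_tokens=64 is an arbitrary illustrative limit, not a documented value.
    """
    # padding=True pads the shorter prompts so the batch forms one tensor.
    batch = tokenizer(add_task_prefix(sentences), return_tensors='pt', padding=True)
    outputs = model.generate(
        input_ids=batch['input_ids'],
        attention_mask=batch['attention_mask'],  # ignore the padded positions
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate_batch(['muchas flores son blancas'], model, tokenizer)` would return a list with one decoded string per input sentence.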

  ## Evaluation results
+ The model was evaluated on 400 validation sentences.
  - Validation loss: 1.56
+ - BLEU: 0.13
+
+ _Note: Because the Axolotl corpus contains multiple misalignments, the true BLEU and validation loss are slightly better than the figures above._
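
For readers unfamiliar with the metric, the sketch below shows how a corpus-level BLEU score on a 0 to 1 scale can be computed from model outputs and reference translations. It is not the script behind the reported score: whitespace tokenization, a single reference per sentence, and the absence of smoothing are simplifying assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-1 scale, one reference per hypothesis.

    Assumes whitespace tokenization and no smoothing (illustrative choices).
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Counter intersection clips each count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any order with no match gives BLEU = 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Toy tokens only (not real corpus data): one of five tokens differs.
score = corpus_bleu(['a b c d e'], ['a b c d f'])
```

Because no smoothing is applied, short sentences with no matching 4-gram score 0, so established tools that handle tokenization and smoothing are usually preferable for reporting results.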
+
+
+ ## References
+ - Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. Exploring the limits of transfer learning with a unified Text-to-Text transformer.
+
+ - Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016). Axolotl: a Web Accessible Parallel Corpus for Spanish-Nahuatl. In LREC.
+

+ > Created by [Emilio Morales](https://huggingface.co/milmor).