luisarmando committed
Commit f6a2543
1 Parent(s): 08f8865

Update README.md

Files changed (1):
  README.md +11 -3
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
  - PyTorch
  - Safetensors
  widget:
- - text: 'translate spanish to nahuatl: Quiero agua. or translate nahuatl to spanish: Nimitstlazohkamate.'
+ - text: 'translate nahuatl to spanish: Nimitstlazohkamate'
  ---
 
  # mt5-large-spanish-nahuatl
@@ -19,6 +19,14 @@ Nahuatl is the most widely spoken indigenous language in Mexico, yet training a
  ## Model description
  This model is an MT5 Transformer ([mt5-large](https://huggingface.co/google/mt5-large)) fine-tuned on Spanish and Nahuatl sentences collected from diverse places online. The dataset is normalized using 'inali' normalization from [py-elotl](https://github.com/ElotlMX/py-elotl).
 
+ ## Inference API use
+ You can translate in either direction, Spanish to Nahuatl or Nahuatl to Spanish; just prefix the input accordingly:
+
+ translate spanish to nahuatl: Quiero agua
+
+ or
+
+ translate nahuatl to spanish: Nimitstlazohkamate
 
  ## Usage
  ```python
@@ -71,12 +79,12 @@ Also, additional 30,000 samples were collected from the web to enhance the data.
  The method uses a single training stage with mt5, which was chosen because it can handle different vocabularies and prefixes.
 
  ### Training
- The model is trained till convergence, adding the prefixes "translate spanish to nahuatl: + word" and "translate nahuatl to spanish: + word".
+ The model is trained bidirectionally until convergence, adding the prefixes "translate spanish to nahuatl: " and "translate nahuatl to spanish: " to every input sentence.
+ This is meant as an improvement over [previous models](https://huggingface.co/hackathon-pln-es/t5-small-spanish-nahuatl).
 
  ### Training setup
  The model uses the same dataset for 77,500 steps with a batch size of 4 and a learning rate of 1e-4.
-
  ## Evaluation results
  The models are evaluated on 2 different datasets:
  1. First on the test sentences similar to the evaluation ones.
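Below is a minimal, illustrative sketch of the prefix-based inference that the new "Inference API use" section describes. The repo id is a placeholder (the diff does not spell out the model's full Hub path), and the generation settings are assumptions, not the card's own code.

```python
# Illustrative only: prefixed translation as described in the new
# "Inference API use" section. The repo id below is a placeholder.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "mt5-large-spanish-nahuatl"  # placeholder; substitute the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def translate(text: str, direction: str = "spanish to nahuatl") -> str:
    # The task prefix added during training selects the translation direction.
    inputs = tokenizer(f"translate {direction}: {text}", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)  # max_new_tokens is an assumption
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translate("Quiero agua"))                               # Spanish -> Nahuatl
print(translate("Nimitstlazohkamate", "nahuatl to spanish"))  # Nahuatl -> Spanish
```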
 
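And a hedged sketch of the training recipe the second half of the diff states: bidirectional prefixing ("### Training") with the hyperparameters from "### Training setup" (77,500 steps, batch size 4, learning rate 1e-4). The example-building helper and the output directory are illustrative assumptions, not the authors' script.

```python
# Sketch under stated assumptions; not the authors' training code.
from transformers import Seq2SeqTrainingArguments

def bidirectional_examples(spanish: str, nahuatl: str) -> list[dict]:
    # Each aligned pair is used in both directions, with the task prefix
    # prepended to the source sentence, as the "### Training" note describes.
    return [
        {"input": f"translate spanish to nahuatl: {spanish}", "target": nahuatl},
        {"input": f"translate nahuatl to spanish: {nahuatl}", "target": spanish},
    ]

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-spanish-nahuatl",  # placeholder
    max_steps=77_500,                        # from "### Training setup"
    per_device_train_batch_size=4,           # from "### Training setup"
    learning_rate=1e-4,                      # from "### Training setup"
)
```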