Update README.md
README.md CHANGED

@@ -41,7 +41,7 @@ This version of the Google T5-Base model has been fine-tuned on a bilingual data
 ### Model Sources
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** https://github.com/leks-forever/
+- **Repository:** https://github.com/leks-forever/mt5-tuning
 <!-- - **Paper [optional]:** [More Information Needed] -->
 <!-- - **Demo [optional]:** [More Information Needed] -->
 
@@ -82,11 +82,6 @@ print(translation)
 
 The model was fine-tuned on the [bible-lezghian-russian](https://huggingface.co/datasets/leks-forever/bible-lezghian-russian) dataset, which contains 13,800 parallel sentences in Russian and Lezgian. The dataset was split into three parts: 90% for training, 5% for validation, and 5% for testing.
 
-### Preprocessing
-
-The preprocessing step included tokenization with a custom-trained SentencePiece NLLB-based tokenizer on the Russian-Lezgian corpus.
-
-
 
 #### Training Hyperparameters
 
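The 90%/5%/5% split of the 13,800 parallel sentences described in the README text above can be sketched as follows. This is a minimal illustration only: the shuffling, the seed, and the `(russian, lezgian)` pair representation are assumptions, not the repository's actual preprocessing code.

```python
import random

def split_corpus(pairs, train_frac=0.90, val_frac=0.05, seed=42):
    """Shuffle parallel sentence pairs and cut them into train/validation/test.

    Hypothetical sketch: fractions match the model card, everything else
    (shuffle + fixed seed) is an assumption.
    """
    rng = random.Random(seed)
    pairs = pairs[:]          # copy so the caller's list is untouched
    rng.shuffle(pairs)
    n = len(pairs)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test

# With the 13,800 pairs reported in the card (placeholder pair contents):
corpus = [(f"ru_{i}", f"lez_{i}") for i in range(13_800)]
train, val, test = split_corpus(corpus)
print(len(train), len(val), len(test))  # 12420 690 690
```

Slicing a single shuffled list guarantees the three parts are disjoint, which is the property the 90/5/5 split relies on.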