Sasha committed
Commit 7c389a6
1 Parent(s): 695e7dd
added README
README.md
ADDED
@@ -0,0 +1,17 @@
+---
+language:
+- uk
+tags:
+- t5
+---
+
+The aim is to compress the mT5-base model so that it retains only the Ukrainian language.
+
+This reproduces, for a different language, the result described in [this](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) Medium article.
+
+Results:
+- 582M params -> 244M params
+- 250K tokens -> 30K tokens
+- 2.2GB model size -> 0.95GB model size
+
+The vocabulary consists of 20K Ukrainian tokens and around 10K further tokens: English tokens, the most frequently used tokens, and the special tokens the T5 model uses.
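The parameter counts in the README are consistent with the savings coming almost entirely from the vocabulary-sized matrices. The sketch below is a toy illustration of that idea, not the author's script: it keeps the special tokens plus the most frequent token ids, slices the embedding rows to match, and then redoes the parameter arithmetic. The function names are hypothetical, and the dimensions (768-wide embeddings, 250,112 embedding rows, separate input and output matrices) are assumptions about mT5-base that the README does not state.

```python
from collections import Counter

# Assumed mT5-base dimensions; the README does not state them.
D_MODEL = 768
OLD_VOCAB_ROWS = 250_112


def select_kept_ids(token_counts, special_ids, keep_n):
    """Keep all special ids plus the most frequent remaining ids, up to keep_n in total."""
    kept = set(special_ids)
    for token_id, _ in token_counts.most_common():
        if len(kept) >= keep_n:
            break
        kept.add(token_id)
    return sorted(kept)


def prune_rows(matrix, kept_ids):
    """Return only the rows for kept ids; new id i corresponds to old id kept_ids[i]."""
    return [matrix[i] for i in kept_ids]


def estimate_params(total_before, old_rows, new_rows, d_model=D_MODEL, matrices=2):
    """Swap the vocabulary-sized matrices (input embeddings and output head) for smaller ones."""
    return total_before + (new_rows - old_rows) * d_model * matrices


# Toy example: a 6-token vocabulary where ids 0 and 1 are special tokens.
counts = Counter({2: 50, 3: 5, 4: 40, 5: 1})
kept_ids = select_kept_ids(counts, special_ids=[0, 1], keep_n=4)
toy_embeddings = [[float(i)] * 2 for i in range(6)]
pruned = prune_rows(toy_embeddings, kept_ids)

# README figures, rounded: 582M params, 250K -> 30K tokens.
estimated = estimate_params(582_000_000, OLD_VOCAB_ROWS, 30_000)
print(kept_ids, len(pruned), f"{estimated / 1e6:.0f}M")
```

Under these assumptions the estimate comes out at roughly 244M parameters, which matches the figure reported in the README. In a real model the same row selection would be applied to both the input embedding matrix and the output projection, and the tokenizer would need to be rebuilt so that its ids follow the new ordering.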