Sasha committed on
Commit
7c389a6
1 Parent(s): 695e7dd

added README

Files changed (1)
  1. README.md +17 -0
README.md ADDED
@@ -0,0 +1,17 @@
+ ---
+ language:
+ - uk
+ tags:
+ - t5
+ ---
+
+ The aim is to compress the mT5-base model so that it supports only the Ukrainian language.
+
+ This reproduces the result from [this](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) Medium article, but for a different language.
+
+ Results:
+ - 582M params -> 244M params
+ - 250K tokens -> 30K tokens
+ - 2.2 GB model size -> 0.95 GB model size
+
+ The vocabulary consists of 20K Ukrainian tokens plus around 10K other tokens: English tokens, the most frequently used tokens, and the special tokens that the T5 model uses.
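
The drop from 582M to 244M parameters reported in the README is consistent with shrinking only the vocabulary-dependent matrices. Below is a minimal sketch of that arithmetic in Python. It assumes (the README does not state this) that mT5-base has a hidden size of 768 and keeps separate, untied input-embedding and output-projection matrices, and it uses the rounded vocabulary sizes from the results list. The names `D_MODEL`, `vocab_params` and `estimate_pruned_params` are illustrative and do not come from the commit.

```python
# Back-of-the-envelope check of the parameter counts in the README.
# Assumptions (not stated in the README): hidden size 768, and an input
# embedding matrix plus a separate (untied) output projection, each of
# shape [vocab_size, d_model].

D_MODEL = 768              # assumed hidden size of mT5-base
OLD_VOCAB = 250_000        # "250K tokens" (rounded, from the README)
NEW_VOCAB = 30_000         # "30K tokens" (rounded, from the README)
OLD_PARAMS = 582_000_000   # "582M params" (rounded, from the README)


def vocab_params(vocab_size: int, d_model: int = D_MODEL) -> int:
    """Parameters held by the input embedding and the untied output projection."""
    return 2 * vocab_size * d_model


def estimate_pruned_params(old_params: int, old_vocab: int, new_vocab: int) -> int:
    """Swap the old vocabulary-dependent parameters for the new, smaller ones."""
    return old_params - vocab_params(old_vocab) + vocab_params(new_vocab)


estimate = estimate_pruned_params(OLD_PARAMS, OLD_VOCAB, NEW_VOCAB)
print(f"vocabulary parameters before: {vocab_params(OLD_VOCAB) / 1e6:.0f}M")
print(f"vocabulary parameters after:  {vocab_params(NEW_VOCAB) / 1e6:.0f}M")
print(f"estimated total after pruning: {estimate / 1e6:.0f}M")
```

Under these assumptions the estimate comes out close to the 244M figure in the README, which suggests that the Transformer layers are left untouched and only the rows tied to removed tokens are dropped.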