---
language:
- uk
tags:
- t5
---

The aim is to compress the mT5-base model so that it retains only the Ukrainian language.

This reproduces, for a different language, the result from [this](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) Medium article.

Results: 
- 582M params -> 244M params
- 250K tokens -> 30K tokens
- 2.2 GB model size -> 0.95 GB model size
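
The parameter drop above is roughly what shrinking the embedding matrices alone would predict. The following is a back-of-the-envelope sketch, not the author's script. It assumes a hidden size of 768 and separate (untied) input and output embedding matrices, as in the public mT5-base configuration; neither value is stated in this card. The function names are hypothetical.

```python
# Rough estimate of how vocabulary pruning changes the parameter count.
# Assumption (not from this card): hidden size 768, untied input/output embeddings.
D_MODEL = 768


def embedding_params(vocab_size, d_model=D_MODEL, tied=False):
    """Parameters held in the token embedding matrix (and the output head if untied)."""
    matrices = 1 if tied else 2
    return matrices * vocab_size * d_model


def estimate_pruned_params(total_params, old_vocab, new_vocab):
    """Swap the old embedding cost for the new one, leaving all other weights unchanged."""
    return total_params - embedding_params(old_vocab) + embedding_params(new_vocab)


# Rounded figures from the results list: 582M params, 250K -> 30K tokens.
estimate = estimate_pruned_params(582_000_000, 250_000, 30_000)
print(f"{estimate / 1e6:.0f}M")  # prints "244M"
```

Under these assumptions the estimate lands on the reported 244M, which suggests that almost all of the savings come from the vocabulary-dependent matrices rather than from the transformer layers.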

The vocabulary consists of 20K Ukrainian tokens plus around 10K further tokens: English tokens, the most frequently used tokens, and the special tokens that the T5 model uses.
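
As an illustration of how such a vocabulary could be assembled, here is a minimal toy sketch. It assumes tokens are chosen by frequency in tokenized Ukrainian and English text, in the spirit of the linked article; the exact procedure used for this model is not described here. All names (`select_kept_ids`, `prune_rows`, the toy data) are hypothetical, and a plain list of lists stands in for the real embedding matrix.

```python
from collections import Counter


def select_kept_ids(uk_token_ids, en_token_ids, special_ids, n_uk, n_en):
    """Return the sorted old-vocabulary ids to keep.

    Keeps every special id, the n_uk most frequent ids in the Ukrainian
    stream, and the n_en most frequent ids in the English stream.
    """
    kept = set(special_ids)
    for ids, budget in ((uk_token_ids, n_uk), (en_token_ids, n_en)):
        counts = Counter(ids)
        kept.update(tok for tok, _ in counts.most_common(budget))
    return sorted(kept)


def prune_rows(embedding, kept_ids):
    """Keep only the embedding rows that belong to the retained token ids."""
    return [embedding[i] for i in kept_ids]


# Toy data: a 10-token vocabulary with 2-dimensional embeddings.
embedding = [[float(i), float(i) + 0.5] for i in range(10)]
uk_stream = [5, 5, 5, 7, 7, 9, 3]  # token ids seen in Ukrainian text
en_stream = [4, 4, 6, 5]           # token ids seen in English text

kept_ids = select_kept_ids(uk_stream, en_stream, special_ids=[0, 1], n_uk=2, n_en=1)
small_embedding = prune_rows(embedding, kept_ids)

# Map old ids to their new positions so the tokenizer can be rebuilt consistently.
old_to_new = {old: new for new, old in enumerate(kept_ids)}
print(kept_ids)  # [0, 1, 4, 5, 7]
```

In a real model the same row selection would have to be applied to both the input embeddings and the output head, and the tokenizer's vocabulary would need to be rebuilt with the `old_to_new` mapping.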