---
language:
- uk
tags:
- t5
---

The aim is to compress the mT5-base model so that it retains only the Ukrainian language.

This reproduces, for a different language, the result described in [this](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) Medium article.

Results:

- 582M params -> 244M params
- 250K tokens -> 30K tokens
- 2.2GB model size -> 0.95GB model size
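
The parameter reduction above can be roughly accounted for by the embedding tables alone. The sketch below is a back-of-the-envelope check, not a measurement of this model: it assumes the standard mT5-base configuration (hidden size 768, vocabulary of 250,112 entries, untied input embedding and output projection), none of which is stated in this card, and it uses the rounded figures from the list above.

```python
# Assumed mT5-base configuration values (not stated in this card).
D_MODEL = 768
OLD_VOCAB = 250_112

# Rounded figures taken from the results list above.
NEW_VOCAB = 30_000
TOTAL_PARAMS = 582_000_000


def embedding_params(vocab_size, d_model=D_MODEL):
    # Assumption: input embeddings and the output projection are separate
    # matrices of shape (vocab_size, d_model), so both are counted.
    return 2 * vocab_size * d_model


old_emb = embedding_params(OLD_VOCAB)
new_emb = embedding_params(NEW_VOCAB)

# Everything outside the two vocabulary-sized matrices is left unchanged.
estimated = TOTAL_PARAMS - old_emb + new_emb

print(f"embedding params: {old_emb / 1e6:.0f}M -> {new_emb / 1e6:.0f}M")
print(f"estimated total:  {estimated / 1e6:.0f}M")
```

Under these assumptions the estimate lands at about 244M parameters, which is consistent with the figure reported above and suggests that nearly all of the savings come from the smaller vocabulary.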

The vocabulary consists of 20K Ukrainian tokens plus around 10K English tokens, other frequently used tokens, and the special tokens that the T5 model uses.
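
One way such a vocabulary could be assembled is sketched below. This is a minimal illustration in plain Python, not the procedure actually used for this model: the helper names (`select_kept_ids`, `trim_rows`), the token counts, and the toy embedding matrix are all hypothetical. It assumes token frequencies have already been counted on a Ukrainian corpus and on a secondary corpus.

```python
from collections import Counter


def select_kept_ids(uk_counts, extra_counts, special_ids, n_uk, n_extra):
    """Pick token ids to keep: special tokens first, then the most frequent
    Ukrainian tokens, then the most frequent tokens from the extra corpus."""
    kept = list(dict.fromkeys(special_ids))  # preserve order, drop duplicates
    seen = set(kept)
    for counts, limit in ((uk_counts, n_uk), (extra_counts, n_extra)):
        added = 0
        for tok_id, _ in counts.most_common():
            if added >= limit:
                break
            if tok_id not in seen:
                kept.append(tok_id)
                seen.add(tok_id)
                added += 1
    return kept


def trim_rows(matrix, kept_ids):
    """Keep only the rows for kept_ids; row i of the result is new token id i."""
    return [matrix[i] for i in kept_ids]


# Toy data standing in for real corpus statistics (hypothetical values).
uk_counts = Counter({5: 90, 7: 50, 3: 10, 9: 1})
extra_counts = Counter({7: 80, 4: 30, 6: 5})
special_ids = [0, 1, 2]

kept_ids = select_kept_ids(uk_counts, extra_counts, special_ids, n_uk=2, n_extra=1)
old_to_new = {old: new for new, old in enumerate(kept_ids)}

# A 10-token "embedding matrix" with 4 dimensions; row i is filled with i.
embeddings = [[float(i)] * 4 for i in range(10)]
small = trim_rows(embeddings, kept_ids)

print(kept_ids)    # ids retained, in their new order
print(len(small))  # number of rows after trimming
```

In a real model the same row selection would have to be applied to both the input embedding and the output projection, and the tokenizer vocabulary would need to be rebuilt so that its ids match `old_to_new`.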