---
language:
- uk
tags:
- t5
---
The aim is to compress the mT5-base model so that it retains only the Ukrainian language.
This reproduces, for a different language, the result described in [this](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) Medium article.
Results:
- 582M params -> 244M params
- 250K tokens -> 30K tokens
- 2.2 GB model size -> 0.95 GB model size
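
The parameter drop above is consistent with a back-of-the-envelope check. The sketch below assumes that mT5-base has a hidden size of 768, an original vocabulary of 250,112 entries, and separate (untied) input and output embedding matrices. Those figures are taken from the public mT5-base configuration, not from this card, so treat them as assumptions.

```python
# Rough estimate of how many parameters disappear when the vocabulary shrinks.
# D_MODEL, OLD_VOCAB and N_EMBEDDING_MATRICES are assumptions about mT5-base,
# not values stated in this model card.

D_MODEL = 768              # assumed hidden size of mT5-base
OLD_VOCAB = 250_112        # assumed original vocabulary size
NEW_VOCAB = 30_000         # "30K tokens" from the results above
N_EMBEDDING_MATRICES = 2   # assumed: input embeddings + output projection, untied
TOTAL_BEFORE = 582_000_000  # "582M params" from the results above


def embedding_params(vocab_size: int) -> int:
    """Number of parameters stored in the vocabulary-sized matrices."""
    return vocab_size * D_MODEL * N_EMBEDDING_MATRICES


saved = embedding_params(OLD_VOCAB) - embedding_params(NEW_VOCAB)
estimated_after = TOTAL_BEFORE - saved

print(f"parameters removed:  {saved / 1e6:.0f}M")
print(f"estimated remaining: {estimated_after / 1e6:.0f}M")
```

Under these assumptions almost all of the saving comes from the two vocabulary-sized matrices; the encoder and decoder layers are left unchanged.
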
The vocabulary consists of 20K Ukrainian tokens plus around 10K others: English tokens, the most frequently used tokens, and the special tokens that the T5 model uses.
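
A minimal sketch of how such a reduced vocabulary could be chosen and applied, assuming tokens are ranked by frequency in a tokenized corpus. The helper names `select_vocab` and `shrink_embeddings` are hypothetical, the toy data is made up for illustration, and this is not necessarily the exact procedure used for this model.

```python
from collections import Counter


def select_vocab(token_id_streams, special_ids, top_k):
    """Keep the special tokens plus the top_k most frequent other token ids.

    Hypothetical helper: returns the kept ids in their new order.
    """
    counts = Counter()
    for ids in token_id_streams:
        counts.update(ids)

    kept = list(dict.fromkeys(special_ids))  # specials first, duplicates dropped
    kept_set = set(kept)
    limit = len(kept) + top_k
    for token_id, _ in counts.most_common():
        if len(kept) >= limit:
            break
        if token_id not in kept_set:
            kept.append(token_id)
            kept_set.add(token_id)
    return kept


def shrink_embeddings(matrix, kept_ids):
    """Copy only the rows of an embedding matrix that belong to kept ids."""
    return [matrix[i] for i in kept_ids]


# Toy example: a 10-token vocabulary with 2-dimensional embeddings.
corpus = [[5, 7, 7, 9], [7, 9, 3]]          # already-tokenized sentences
special_ids = [0, 1]                        # e.g. padding and end-of-sequence
embeddings = [[i, i * 10] for i in range(10)]

kept_ids = select_vocab(corpus, special_ids, top_k=2)
old_to_new = {old: new for new, old in enumerate(kept_ids)}
small_embeddings = shrink_embeddings(embeddings, kept_ids)

print(kept_ids)
print(old_to_new)
```

In a real model the same row selection would be applied to both the input embedding matrix and the output projection, and the tokenizer would be rebuilt so that its ids match `old_to_new`.
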