uk-mt5-base / README.md
Sasha
metadata
language:
  - uk
tags:
  - t5

The aim is to compress the mT5-base model by pruning its vocabulary so that it mainly covers the Ukrainian language.

This reproduces the result described in this Medium article, applied to a different language.

Results:

  • 582M params -> 244M params
  • 250K tokens -> 30K tokens
  • 2.2 GB model size -> 0.95 GB model size
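
The parameter reduction in the list above can be roughly checked with a back-of-the-envelope calculation. This is only a sketch: it assumes the usual mT5-base configuration (hidden size 768, vocabulary of 250,112 entries, separate input and output embedding matrices), none of which is stated in this README, and it treats "30K" as exactly 30,000.

```python
# Figure taken from the results list in this README.
ORIGINAL_PARAMS = 582_000_000

# Assumed mT5-base configuration values (not stated in this README).
D_MODEL = 768
ORIGINAL_VOCAB = 250_112
PRUNED_VOCAB = 30_000  # "30K" taken as a round number


def embedding_params(vocab_size, d_model=D_MODEL, untied=True):
    """Parameters in the input embedding plus, if untied, the output projection."""
    matrices = 2 if untied else 1
    return matrices * vocab_size * d_model


# Only the vocabulary-dependent matrices shrink; the transformer layers are unchanged.
removed = embedding_params(ORIGINAL_VOCAB) - embedding_params(PRUNED_VOCAB)
estimated_params = ORIGINAL_PARAMS - removed

print(f"{estimated_params / 1e6:.1f}M")  # prints 243.9M
```

Under these assumptions the estimate lands close to the reported 244M, which suggests that almost all of the saving comes from the embedding matrices rather than from the transformer layers.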

The vocabulary consists of 20K Ukrainian tokens plus around 10K further tokens: English tokens, the most frequently used tokens, and the special tokens that the T5 model uses.
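
One way such a vocabulary could be selected is by counting how often each token id occurs in a tokenized corpus and keeping the most frequent ones. The sketch below is a simplified illustration, not the procedure actually used for this model: the function name `select_vocab`, its parameters, and the idea of passing pre-tokenized id lists are all hypothetical, and it folds the "most used" group into the two per-language counts.

```python
from collections import Counter


def select_vocab(uk_token_ids, en_token_ids, special_ids, n_uk=20_000, n_en=10_000):
    """Pick token ids to keep and build an old-id -> new-id mapping.

    uk_token_ids / en_token_ids: flat lists of token ids from tokenized corpora.
    special_ids: ids that must always be kept (e.g. padding, end-of-sequence).
    """
    kept = list(dict.fromkeys(special_ids))  # special tokens first, duplicates dropped
    seen = set(kept)

    for ids, limit in ((uk_token_ids, n_uk), (en_token_ids, n_en)):
        added = 0
        # most_common() yields ids from most to least frequent.
        for token_id, _count in Counter(ids).most_common():
            if added >= limit:
                break
            if token_id not in seen:
                kept.append(token_id)
                seen.add(token_id)
                added += 1

    old_to_new = {old: new for new, old in enumerate(kept)}
    return kept, old_to_new


# Toy example with tiny limits, just to show the shape of the result.
kept, old_to_new = select_vocab(
    uk_token_ids=[5, 5, 7, 9, 9, 9],
    en_token_ids=[7, 7, 3],
    special_ids=[0, 1],
    n_uk=2,
    n_en=1,
)
print(kept)  # [0, 1, 9, 5, 7]
```

In a full pipeline, the kept ids would presumably then be used to select the matching rows of the model's embedding matrices and to rebuild the tokenizer with the smaller vocabulary.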