# Vocabulary Trimmed [lmqg/mt5-small-esquad-qa](https://huggingface.co/lmqg/mt5-small-esquad-qa): `vocabtrimmer/mt5-small-esquad-qa-trimmed-es-90000`
This model is a trimmed version of [lmqg/mt5-small-esquad-qa](https://huggingface.co/lmqg/mt5-small-esquad-qa) produced by [`vocabtrimmer`](https://github.com/asahi417/lm-vocab-trimmer), a tool that trims the vocabulary of language models to reduce model size.
The following table summarizes the trimming process. Compression rates give the trimmed model's parameter count as a percentage of the original model's.

|                            | lmqg/mt5-small-esquad-qa   | vocabtrimmer/mt5-small-esquad-qa-trimmed-es-90000   |
|:---------------------------|:---------------------------|:----------------------------------------------------|
| parameter_size_full        | 300,165,504                | 136,224,128                                         |
| parameter_size_embedding   | 256,103,424                | 92,162,048                                          |
| vocab_size                 | 250,101                    | 90,002                                              |
| compression_rate_full      | 100.0                      | 45.38                                               |
| compression_rate_embedding | 100.0                      | 35.99                                               |
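
The compression rates in the table can be reproduced from the parameter counts. The short sketch below is only a sanity check on the published figures; the variable and function names are illustrative and are not part of `vocabtrimmer`.

```python
# Figures copied from the summary table above.
full_original = 300_165_504
full_trimmed = 136_224_128
emb_original = 256_103_424
emb_trimmed = 92_162_048
vocab_original = 250_101
vocab_trimmed = 90_002


def compression_rate(trimmed: int, original: int) -> float:
    """Trimmed size as a percentage of the original size, to two decimals."""
    return round(100 * trimmed / original, 2)


rate_full = compression_rate(full_trimmed, full_original)
rate_embedding = compression_rate(emb_trimmed, emb_original)

# Parameters outside the embedding matrices should be unchanged by trimming.
non_embedding_original = full_original - emb_original
non_embedding_trimmed = full_trimmed - emb_trimmed

# Embedding parameters per vocabulary entry, before and after trimming.
per_token_original = emb_original // vocab_original
per_token_trimmed = emb_trimmed // vocab_trimmed

print(rate_full, rate_embedding)
print(non_embedding_original, non_embedding_trimmed)
print(per_token_original, per_token_trimmed)
```

The non-embedding parameter count is identical for both models, so the entire size reduction comes from the embedding matrices, and each vocabulary entry accounts for the same number of embedding parameters before and after trimming.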


The following table shows the parameters used to trim the vocabulary.

| language   | dataset                     | dataset_column   | dataset_name   | dataset_split   |   target_vocab_size |   min_frequency |
|:-----------|:----------------------------|:-----------------|:---------------|:----------------|--------------------:|----------------:|
| es         | vocabtrimmer/mc4_validation | text             | es             | validation      |               90000 |               2 |
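
To illustrate how `target_vocab_size` and `min_frequency` could interact, here is a minimal sketch of frequency-based vocabulary selection. It is an assumption about the general technique, not `vocabtrimmer`'s actual implementation: the function `select_vocab`, the `always_keep` parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def select_vocab(
    tokenized_texts: Iterable[Sequence[str]],
    target_vocab_size: int,
    min_frequency: int,
    always_keep: Sequence[str] = (),
) -> List[str]:
    """Keep the most frequent tokens, up to target_vocab_size.

    Tokens seen fewer than min_frequency times are dropped. Tokens in
    always_keep (hypothetical, e.g. special tokens) are retained on top
    of the target and do not count toward it.
    """
    counts = Counter(tok for text in tokenized_texts for tok in text)
    kept = list(always_keep)
    selected = 0
    # most_common() yields tokens from most to least frequent.
    for token, freq in counts.most_common():
        if selected >= target_vocab_size or freq < min_frequency:
            break
        if token in always_keep:
            continue
        kept.append(token)
        selected += 1
    return kept


# Toy corpus standing in for tokenized text from the dataset column.
corpus = [
    ["el", "gato", "come"],
    ["el", "perro", "come"],
    ["el", "gato", "duerme"],
]

small = select_vocab(corpus, target_vocab_size=2, min_frequency=2)
large = select_vocab(corpus, target_vocab_size=10, min_frequency=2)
print(small)
print(large)
```

In this toy run, `min_frequency=2` removes tokens that occur only once even when the target size would leave room for them, while a small `target_vocab_size` cuts the list off before the frequency threshold is reached.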