Fill-Mask
Transformers
PyTorch
Safetensors
Italian
xlm-roberta
Inference Endpoints
osiria committed on
Commit
2ea1dbd
1 Parent(s): ce3dc48

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -32,7 +32,7 @@ widget:
 
 <h3>Introduction</h3>
 
-This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>16M parameters</b> and <b>66MB</b> size make it
+This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>17M parameters</b> and <b>67MB</b> size make it
 <b>85% lighter</b> than a typical mono-lingual BERT model. It is ideal when memory consumption and execution speed are critical while maintaining high-quality results.
 
 
@@ -47,7 +47,7 @@ To compensate for the deletion of cased tokens, which now forces the model to ex
 the model has been further pre-trained on the italian split of the [Wikipedia](https://huggingface.co/datasets/wikipedia) dataset, using the <b>whole word masking [3]</b> technique to make it more robust
 to the new uncased representations.
 
-The resulting model has 16M parameters, a vocabulary of 14.610 tokens, and a size of 66MB, which makes it <b>85% lighter</b> than a typical mono-lingual BERT model and
+The resulting model has 17M parameters, a vocabulary of 14.610 tokens, and a size of 67MB, which makes it <b>85% lighter</b> than a typical mono-lingual BERT model and
 75% lighter than a standard mono-lingual DistilBERT model.
 
 
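Since the card describes a fill-mask model (see the tags above), a minimal usage sketch is shown below. It assumes the standard transformers fill-mask pipeline and the XLM-RoBERTa-style `<mask>` token; the repository ID is a placeholder, not taken from this diff, and the example sentence is purely illustrative.

```python
# Minimal sketch: load the model through the standard fill-mask pipeline.
# The repository ID is a placeholder -- replace it with this model's actual
# Hugging Face ID before running.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="<namespace>/<model-id>")  # hypothetical ID

# The model is XLM-RoBERTa-based, so the mask token is "<mask>".
for prediction in fill_mask("La capitale dell'Italia è <mask>."):
    print(prediction["token_str"], round(prediction["score"], 4))
```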