# distilbert-base-uncased trained for Semantic Textual Similarity in Spanish

This is a test model that was fine-tuned on the Spanish dataset from [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt) in order to understand and benchmark STS models.

Evaluating `distilbert-base-uncased` on the Spanish test dataset before training results in:

```
Cosine-Similarity :	Pearson: 0.2980	Spearman: 0.4008
```

The fine-tuned version, trained with the training script's defaults on the Spanish training dataset, results in:

```
Cosine-Similarity :	Pearson: 0.7451	Spearman: 0.7364
```

## Resources

Check the modified training script [training_stsb_m_mt.py]

Check [sts_eval](https://github.com/eduardofv/sts_eval) for a comparison with TensorFlow and Sentence-Transformers models

Check the [development environment](https://github.com/eduardofv/ai-denv)
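
For readers unfamiliar with how the Cosine-Similarity scores reported earlier are read: each sentence pair is embedded, the cosine similarity of the two embeddings is taken as the predicted score, and the predictions are correlated with the gold similarity labels using Pearson (linear) and Spearman (rank) correlation. The following is a minimal, dependency-free sketch of that calculation. It is not the evaluation code used for this model, and the toy vectors and gold scores are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: (embedding_a, embedding_b) per sentence pair,
# plus made-up gold similarity labels on a 0-5 scale.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0]),
]
toy_gold = [5.0, 3.0, 0.0]

predicted = [cosine_similarity(a, b) for a, b in toy_pairs]
print("Pearson: ", pearson(predicted, toy_gold))
print("Spearman:", spearman(predicted, toy_gold))
```

Spearman depends only on the ordering of the predictions, so it can be high even when the cosine values are not linearly related to the labels; this is one reason the two numbers can differ noticeably, as they do for the untrained model.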