danielheinz/e5-base-sts-en-de

Feature Extraction

text-embeddings-inference


INFO: The model is being continuously updated.

This model is a multilingual-e5-base model fine-tuned for semantic textual similarity (STS).
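
As a minimal usage sketch, the snippet below scores one sentence pair by the cosine similarity of its embeddings. It assumes the repository loads through the sentence-transformers library, which this card does not state explicitly. The helper names `cosine_similarity` and `score_pair` are illustrative only. The base E5 models expect a "query: " or "passage: " prefix on each input; whether this fine-tuned model still needs one is not stated here, so the sketch passes raw sentences.

```python
import math

MODEL_ID = "danielheinz/e5-base-sts-en-de"


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_pair(sentence_a, sentence_b, model_id=MODEL_ID):
    """Embed two sentences and return their cosine similarity.

    Assumption: the model can be loaded with SentenceTransformer.
    """
    # Imported lazily so cosine_similarity works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    emb_a, emb_b = model.encode([sentence_a, sentence_b])
    return cosine_similarity(emb_a.tolist(), emb_b.tolist())
```

Calling `score_pair("Ein Mann spielt Gitarre.", "A man is playing a guitar.")` downloads the model on first use and returns a single similarity score; higher values indicate closer meaning.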

Model Training

The model has been fine-tuned on the German subsets of the following datasets:

The training procedure can be divided into two stages:

training on paraphrase datasets with the Multiple Negatives Ranking Loss
training on semantic textual similarity datasets using the Cosine Similarity Loss
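
To make the two objectives concrete, here is a dependency-free sketch of what each loss computes on already-embedded sentences. It is an illustration, not the training code used for this model: the function names are hypothetical, and the `scale=20.0` default mirrors the sentence-transformers default rather than a setting confirmed by this card.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Stage 1 objective on paraphrase pairs.

    positives[i] is the paraphrase of anchors[i]; every other positive in
    the batch acts as a negative. The loss is the softmax cross-entropy
    over the scaled cosine similarities, averaged over the batch.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def cosine_similarity_loss(pairs, gold_scores):
    """Stage 2 objective on STS pairs.

    Mean squared error between the cosine similarity of each embedded
    pair and its gold similarity score (assumed normalized to [0, 1]).
    """
    errors = [(cosine(u, v) - gold) ** 2 for (u, v), gold in zip(pairs, gold_scores)]
    return sum(errors) / len(errors)
```

The first loss only needs pairs known to be paraphrases, so it suits large paraphrase corpora; the second needs graded similarity labels, which is what STS datasets provide.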

Results

The model achieves the following scores (the self-reported metric is spearmanr):

0.920 on stsb's validation subset
0.904 on stsb's test subset
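
The self-reported metric for this model is spearmanr, so scores like these are presumably Spearman rank correlations between predicted similarities and gold labels. The sketch below shows how such a score could be computed from two lists of numbers; the function names are illustrative, and the exact evaluation script behind the reported figures is not given here.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(predicted, gold):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))
```

Here `predicted` would hold the model's cosine similarity for each sentence pair and `gold` the human-annotated score. Because only ranks matter, the two lists do not need to share a scale.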

Downloads last month: 14,463

Safetensors

Model size: 278M params

Tensor type: F32


Evaluation results

spearmanr on stsb_multi_mt (self-reported): 0.904
