pipeline_tag: sentence-similarity
language: fr
datasets:
- stsb_multi_mt
tags:
- Text
- Sentence Similarity
- Sentence-Embedding
- camembert-base
license: apache-2.0
model-index:
- name: CrossEncoder-camembert-large by Van Tuan DANG
  results:
  - task:
      name: Sentence-Embedding
      type: Text Similarity
    dataset:
      name: Text Similarity fr
      type: stsb_multi_mt
      args: fr
    metrics:
    - name: Test Pearson correlation coefficient
      type: Pearson_correlation_coefficient
      value: 90.34
Model
Cross-Encoder Model for sentence-similarity
This model is an improvement over dangvantuan/CrossEncoder-camembert-large, offering greater robustness and better performance.
Training Data
This model was trained on the STS benchmark dataset combined with Augmented SBERT. Training benefits from pair sampling strategies that use two models: CrossEncoder-camembert-large and dangvantuan/sentence-camembert-large. The model predicts a score between 0 and 1 for the semantic similarity of two sentences.
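The exact sampling recipe is not spelled out here, so the following is only a minimal sketch of the general Augmented SBERT idea under stated assumptions: a bi-encoder proposes candidate pairs by nearest-neighbour search, and a cross-encoder assigns them "silver" similarity labels. The function names `sample_pairs` and `label_silver`, the `top_k` parameter, and the callables `embed` and `score_pairs` are hypothetical and introduced for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sample_pairs(
    sentences: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    top_k: int = 2,
) -> List[Pair]:
    """Pair each sentence with its top_k nearest neighbours (assumed sampling strategy)."""
    vectors = [embed(s) for s in sentences]
    seen = set()
    pairs: List[Pair] = []
    for i, vec in enumerate(vectors):
        neighbours = sorted(
            (j for j in range(len(sentences)) if j != i),
            key=lambda j: cosine(vec, vectors[j]),
            reverse=True,
        )
        for j in neighbours[:top_k]:
            key = (min(i, j), max(i, j))  # treat (a, b) and (b, a) as the same pair
            if key not in seen:
                seen.add(key)
                pairs.append((sentences[key[0]], sentences[key[1]]))
    return pairs


def label_silver(
    pairs: Sequence[Pair],
    score_pairs: Callable[[Sequence[Pair]], Sequence[float]],
) -> List[Tuple[str, str, float]]:
    """Attach a cross-encoder score to every sampled pair."""
    scores = score_pairs(pairs)
    return [(a, b, float(s)) for (a, b), s in zip(pairs, scores)]
```

With real models, `embed` could wrap `SentenceTransformer(...).encode` and `score_pairs` could be the `predict` method of a `CrossEncoder`; the resulting silver pairs would then be merged with the gold STS pairs for training.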
Usage (Sentence-Transformers)
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
Then you can use the model like this:
from sentence_transformers import CrossEncoder
model = CrossEncoder('Lajavaness/CrossEncoder-camembert-large', max_length=512)
scores = model.predict([
    ("Un avion est en train de décoller.", "Un homme joue d'une grande flûte."),
    ("Un homme étale du fromage râpé sur une pizza.", "Une personne jette un chat au plafond"),
])
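The scores returned by `predict` are aligned with the input pairs, so they can be zipped back onto the sentences and sorted. The helper below is a small illustrative sketch; `rank_pairs` is a hypothetical name, not part of sentence-transformers.

```python
from typing import List, Sequence, Tuple


def rank_pairs(
    pairs: Sequence[Tuple[str, str]],
    scores: Sequence[float],
) -> List[Tuple[float, str, str]]:
    """Return (score, sentence1, sentence2) tuples sorted from most to least similar."""
    if len(pairs) != len(scores):
        raise ValueError("pairs and scores must have the same length")
    scored = [(float(s), a, b) for (a, b), s in zip(pairs, scores)]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

Typical use would be `rank_pairs(pairs, model.predict(pairs))`, where `pairs` is the list passed to the model.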
Evaluation
The model can be evaluated on the French dev and test splits of stsb_multi_mt as follows:
from sentence_transformers.readers import InputExample
from sentence_transformers.cross_encoder.evaluation import CECorrelationEvaluator
from datasets import load_dataset
def convert_dataset(dataset):
    dataset_samples = []
    for df in dataset:
        score = float(df['similarity_score']) / 5.0  # Normalize score to range 0 ... 1
        inp_example = InputExample(texts=[df['sentence1'], df['sentence2']], label=score)
        dataset_samples.append(inp_example)
    return dataset_samples
# Loading the dataset for evaluation
df_dev = load_dataset("stsb_multi_mt", name="fr", split="dev")
df_test = load_dataset("stsb_multi_mt", name="fr", split="test")
# Convert the dataset for evaluation
# For Dev set:
dev_samples = convert_dataset(df_dev)
val_evaluator = CECorrelationEvaluator.from_input_examples(dev_samples, name='sts-dev')
val_evaluator(model, output_path="./")
# For Test set (Pearson and Spearman correlations are also reported on several other benchmark datasets):
test_samples = convert_dataset(df_test)
test_evaluator = CECorrelationEvaluator.from_input_examples(test_samples, name='sts-test')
test_evaluator(model, output_path="./")
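The evaluator reports Pearson and Spearman correlations between the gold labels and the predicted scores. As a rough, dependency-free sketch of what those two numbers measure (not the library's own implementation; the function names here are illustrative), they can be computed like this:

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))
```

Pearson rewards predictions that are linearly related to the gold scores, while Spearman only looks at whether the ordering of pairs is preserved.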
Test Result: The performance is measured using Pearson and Spearman correlation:
- On dev
Model | Pearson correlation | Spearman correlation | #params |
---|---|---|---|
Lajavaness/CrossEncoder-camembert-large | 90.34 | 90.15 | 336M |
dangvantuan/CrossEncoder-camembert-large | 90.11 | 90.01 | 336M |
- On test:
Pearson score
Model | STS-B | STS12-fr | STS13-fr | STS14-fr | STS15-fr | STS16-fr | SICK-fr |
---|---|---|---|---|---|---|---|
Lajavaness/CrossEncoder-camembert-large | 0.8863 | 0.9076 | 0.8824 | 0.9022 | 0.9223 | 0.8231 | 0.8461 |
dangvantuan/CrossEncoder-camembert-large | 0.8816 | 0.9012 | 0.8836 | 0.8986 | 0.9204 | 0.8201 | 0.8423 |
Spearman score
Model | STS-B | STS12-fr | STS13-fr | STS14-fr | STS15-fr | STS16-fr | SICK-fr |
---|---|---|---|---|---|---|---|
Lajavaness/CrossEncoder-camembert-large | 0.8803 | 0.8487 | 0.8788 | 0.8910 | 0.9216 | 0.8250 | 0.8078 |
dangvantuan/CrossEncoder-camembert-large | 0.8757 | 0.8424 | 0.8801 | 0.8862 | 0.9199 | 0.8216 | 0.8038 |
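To compare the two models across benchmarks rather than one column at a time, the Pearson test scores can be summarized with a mean and per-benchmark differences. This is a small illustrative sketch; the numbers are copied from the Pearson table, and the variable and function names are hypothetical.

```python
# Pearson test scores copied from the table, in benchmark order.
BENCHMARKS = ["STS-B", "STS12-fr", "STS13-fr", "STS14-fr", "STS15-fr", "STS16-fr", "SICK-fr"]
NEW = "Lajavaness/CrossEncoder-camembert-large"
OLD = "dangvantuan/CrossEncoder-camembert-large"
PEARSON = {
    NEW: [0.8863, 0.9076, 0.8824, 0.9022, 0.9223, 0.8231, 0.8461],
    OLD: [0.8816, 0.9012, 0.8836, 0.8986, 0.9204, 0.8201, 0.8423],
}


def mean(values):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(values) / len(values)


def per_benchmark_delta(new_scores, old_scores):
    """Score difference (new minus old) for each benchmark, rounded to 4 decimals."""
    return {
        name: round(new - old, 4)
        for name, new, old in zip(BENCHMARKS, new_scores, old_scores)
    }


means = {name: mean(scores) for name, scores in PEARSON.items()}
deltas = per_benchmark_delta(PEARSON[NEW], PEARSON[OLD])

for name, value in means.items():
    print(f"{name}: mean Pearson = {value:.4f}")
for benchmark, delta in deltas.items():
    print(f"{benchmark}: {delta:+.4f}")
```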