metadata
pipeline_tag: sentence-similarity
language: fr
datasets:
- stsb_multi_mt
tags:
- Text
- Sentence Similarity
- Sentence-Embedding
- camembert-base
license: apache-2.0
model-index:
- name: sentence-flaubert-base by Van Tuan DANG
  results:
  - task:
      name: Sentence-Embedding
      type: Text Similarity
    dataset:
      name: Text Similarity fr
      type: stsb_multi_mt
      args: fr
    metrics:
    - name: Test Pearson correlation coefficient
      type: Pearson_correlation_coefficient
      value: xx.xx
Pre-trained sentence embedding models are the state of the art for French sentence embeddings.
This model is fine-tuned from the pre-trained flaubert/flaubert_base_uncased using Siamese BERT networks with the sentence-transformers library, combined with Augmented SBERT, on the stsb dataset.
Usage
The model can be used directly (without a language model) as follows:
from sentence_transformers import SentenceTransformer

# Load the fine-tuned model by its Hub identifier.
model = SentenceTransformer("Lajavaness/sentence-flaubert-base")

# French example sentences to embed.
sentences = [
    "Un avion est en train de décoller.",
    "Un homme joue d'une grande flûte.",
    "Un homme étale du fromage râpé sur une pizza.",
    "Une personne jette un chat au plafond.",
    "Une personne est en train de plier un morceau de papier.",
]

# Encode the sentences; the result holds one embedding vector per sentence.
embeddings = model.encode(sentences)
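
Since the model targets sentence similarity, the usual next step is to compare two embeddings with cosine similarity. The following is a minimal sketch in pure Python, not the author's evaluation code: the helper name `cosine_similarity` and the toy vectors are illustrative assumptions, standing in for rows of the `embeddings` array produced above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins (assumption): real vectors would be rows of `embeddings`.
toy_embeddings = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

# Identical vectors score close to 1; orthogonal vectors score 0.
sim_same = cosine_similarity(toy_embeddings[0], toy_embeddings[1])
sim_orthogonal = cosine_similarity(toy_embeddings[0], toy_embeddings[2])
```

With real embeddings, the same helper could be called as `cosine_similarity(embeddings[0], embeddings[1])`. The sentence-transformers library also provides `util.cos_sim`, which computes the full pairwise similarity matrix and is likely the more convenient choice for larger batches.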