--- pipeline_tag: sentence-similarity language: fr datasets: - stsb_multi_mt tags: - Text - Sentence Similarity - Sentence-Embedding - camembert-base license: apache-2.0 model-index: - name: sentence-flaubert-base by Van Tuan DANG results: - task: name: Sentence-Embedding type: Text Similarity dataset: name: Text Similarity fr type: stsb_multi_mt args: fr metrics: - name: Test Pearson correlation coefficient type: Pearson_correlation_coefficient value: xx.xx --- ## Pre-trained sentence embedding models are the state-of-the-art of Sentence Embeddings for French. Model is Fine-tuned using pre-trained [flaubert/flaubert_base_uncased](https://huggingface.co/flaubert/flaubert_base_uncased) and [Siamese BERT-Networks with 'sentences-transformers'](https://www.sbert.net/) combine with Augmented SBERT on dataset [stsb](https://huggingface.co/datasets/stsb_multi_mt/viewer/fr/train) ## Usage The model can be used directly (without a language model) as follows: ```python from sentence_transformers import SentenceTransformer model = SentenceTransformer("Lajavaness/sentence-flaubert-base") sentences = ["Un avion est en train de décoller.", "Un homme joue d'une grande flûte.", "Un homme étale du fromage râpé sur une pizza.", "Une personne jette un chat au plafond.", "Une personne est en train de plier un morceau de papier.", ] embeddings = model.encode(sentences) ```