ibaucells committed on
Commit
37a9a78
1 Parent(s): 50ab8b7

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
  This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
- It has been developed through further training of a multilingual fine-tuned model, [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) using NLI data. Specifically, it was trained on two Catalan NLI datasets: [TE-ca](https://huggingface.co/datasets/projecte-aina/teca) and the professional translation of XNLI into Catalan. The training employed the Multiple Negatives Ranking Loss with Hard Negatives, which leverages triplets composed of a premise, an entailed hypothesis, and a contradiction. It is important to note that, given this format, neutral hypotheses from the NLI datasets were not used for training. However, as a form of data augmentation, the model's training set was expanded by duplicating the triplets, wherein the order of the premise and entailed hypothesis was reversed, resulting in a total of 18,928 triplets.
+ It has been developed through further training of a multilingual fine-tuned model, [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) using NLI data. Specifically, it was trained on two Catalan NLI datasets: [TE-ca](https://huggingface.co/datasets/projecte-aina/teca) and the professional translation of XNLI into Catalan. The training employed the Multiple Negatives Ranking Loss with Hard Negatives, which leverages triplets composed of a premise, an entailed hypothesis, and a contradiction. It is important to note that, given this format, neutral hypotheses from the NLI datasets were not used for training. Additionally, as a form of data augmentation, the model's training set was expanded by duplicating the triplets, wherein the order of the premise and entailed hypothesis was reversed, resulting in a total of 18,928 triplets.
 
  ## Usage (Sentence-Transformers)
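
Note on the changed paragraph: the triplet construction and the swap-based augmentation it describes can be illustrated with a minimal sketch. This is not the authors' preprocessing code. The function names `build_triplets` and `augment_by_swapping`, the `(premise, hypothesis, label)` row layout, the pairing of every entailed hypothesis with every contradiction of the same premise, and the English sample sentences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_triplets(rows):
    """Turn NLI rows into (premise, entailed hypothesis, contradiction) triplets.

    Assumption: each row is a (premise, hypothesis, label) tuple with string labels.
    """
    by_premise = defaultdict(lambda: {"entailment": [], "contradiction": []})
    for premise, hypothesis, label in rows:
        # Neutral hypotheses do not fit the triplet format and are skipped.
        if label in ("entailment", "contradiction"):
            by_premise[premise][label].append(hypothesis)

    triplets = []
    for premise, hyps in by_premise.items():
        # Assumption: pair every entailed hypothesis with every contradiction.
        for positive in hyps["entailment"]:
            for negative in hyps["contradiction"]:
                triplets.append((premise, positive, negative))
    return triplets


def augment_by_swapping(triplets):
    """Duplicate each triplet with premise and entailed hypothesis swapped."""
    swapped = [(positive, anchor, negative) for anchor, positive, negative in triplets]
    return triplets + swapped


# Hypothetical sample rows (English for readability; the real data is Catalan).
rows = [
    ("A man is playing a guitar.", "A person is making music.", "entailment"),
    ("A man is playing a guitar.", "The man is on stage.", "neutral"),
    ("A man is playing a guitar.", "Nobody is playing an instrument.", "contradiction"),
]

triplets = build_triplets(rows)
augmented = augment_by_swapping(triplets)
```

The swap doubles the number of triplets while leaving the contradiction in place as the hard negative. In sentence-transformers, each triplet would typically be wrapped in an `InputExample(texts=[anchor, positive, negative])` and passed to `losses.MultipleNegativesRankingLoss`; whether the authors did exactly this is not stated in the README.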