flax-sentence-embeddings/all_datasets_v3_MiniLM-L12


asi committed on Jul 23, 2021

Commit 894997f • 1 Parent(s): 34e74e9

:books: add documentation

Files changed (1)

README.md +2 -3

@@ -50,9 +50,8 @@ card for more detailed information about the pre-training procedure.
 ## Fine-tuning
-The BERT model was pretrained on [BookCorpus](https://yknzhu.wixsite.com/mbweb), a dataset consisting of 11,038
-unpublished books and [English Wikipedia](https://en.wikipedia.org/wiki/English_Wikipedia) (excluding lists, tables and
-headers).
 ### Hyper parameters

 ## Fine-tuning
+We fine-tune the model using a contrastive objective. Formally, we compute the cosine similarity for every possible sentence pair in the batch.
+We then apply the cross-entropy loss, treating the true pairs as the target labels.
 ### Hyper parameters
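
Reviewer note: the contrastive objective described in the added lines can be sketched roughly as follows. This is a minimal pure-Python illustration, not the training code from this repository. The function names, the toy 2-D vectors, and the `scale` multiplier are all assumptions made for the example; the README text only states that cosine similarities are computed over all in-batch pairs and that cross-entropy is applied against the true pairs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over a batch where anchors[i] pairs with positives[i].

    `scale` is an assumed multiplier on the similarities; it is not
    specified in the README text.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        # Similarity of this anchor to every candidate sentence in the batch.
        logits = [scale * cosine_similarity(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the candidates.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the true pair (index i) as the target class.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D "embeddings": entry i of each list is meant to form a true pair.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # true pairs point the same way
swapped = [[0.1, 0.9], [0.9, 0.1]]   # true pairs point in different directions

loss_aligned = in_batch_contrastive_loss(anchors, aligned)
loss_swapped = in_batch_contrastive_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

In this sketch every other sentence in the batch acts as a negative for a given anchor, so the loss is small when each anchor is most similar to its own partner and large when it is more similar to someone else's.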