The model uses the original BERT wordpiece vocabulary and was trained using the average pooling strategy and a softmax loss.
monologg/biobert_v1.1_pubmed from HuggingFace's
Training time: ~6 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
Max. Seq. Length: 128
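The average pooling strategy mentioned above turns per-token embeddings into a single sentence vector by averaging the vectors of the real (non-padding) tokens. The following is a minimal sketch in plain Python, not the training code used for this model: the function name `mean_pool` and the toy two-dimensional vectors are illustrative assumptions.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens into one sentence vector.

    Hypothetical helper for illustration; the mask holds 1 for real tokens
    and 0 for padding, as in typical BERT-style tokenizer output.
    """
    if not token_embeddings:
        raise ValueError("at least one token embedding is required")
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")

    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value

    if kept == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / kept for total in totals]


# Toy input: three token vectors, the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

Masking matters because sequences are padded up to the maximum length, and averaging over padding positions would pull the sentence vector away from the actual content.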
Performance: The model was evaluated on the test portion of the STS dataset using Spearman rank correlation. The result was compared with that of a general BERT base model trained with the same procedure, to verify that the two perform similarly.
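As a rough illustration of this kind of evaluation, the sketch below scores sentence pairs by the cosine similarity of their embeddings and then computes the Spearman rank correlation against gold similarity labels. It is written in plain Python under stated assumptions: the helper names are hypothetical, and the embeddings and gold scores are made-up toy values, not results from this model.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when all values are equal")
    return cov / (sx * sy)


# Toy sentence-pair embeddings and gold similarity labels (made up).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 1.0], [1.0, 0.0]),
]
gold = [4.8, 0.4, 2.5]

predicted = [cosine_similarity(a, b) for a, b in pairs]
print(spearman(predicted, gold))
```

Spearman correlation only compares orderings, so it rewards a model for ranking sentence pairs in the same order as the human labels even when the raw cosine values are on a different scale.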
An example usage for similarity-based scientific paper retrieval is provided in the Covid Papers Browser repository.
N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks