BioBERT-NLI
This is BioBERT [1] fine-tuned on the SNLI and MultiNLI datasets using the sentence-transformers library to produce universal sentence embeddings [2].
The model uses the original BERT wordpiece vocabulary and was trained using the average pooling strategy and a softmax loss.
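For illustration, the snippet below is a minimal sketch of what that average pooling looks like at inference time with HuggingFace's transformers: it loads the checkpoint with AutoModel and mean-pools the token embeddings over non-padding positions. The Hub id gsarti/biobert-nli and the example sentences are assumptions, not part of the original setup.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed Hub id for this fine-tuned model; replace it if the repository name differs.
model_name = "gsarti/biobert-nli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

sentences = [
    "Coronaviruses are enveloped positive-sense RNA viruses.",
    "BioBERT was pre-trained on biomedical abstracts.",
]
encoded = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, hidden)

# Average pooling: mean of the token embeddings, ignoring padding positions.
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # e.g. torch.Size([2, 768])
```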
Base model: monologg/biobert_v1.1_pubmed from HuggingFace's AutoModel.
Training time: ~6 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
Parameters:
Parameter | Value |
---|---|
Batch size | 64 |
Training steps | 30000 |
Warmup steps | 1450 |
Lowercasing | False |
Max. Seq. Length | 128 |
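For reference, the hyperparameters above could be plugged into the sentence-transformers training API roughly as follows. This is a minimal sketch under stated assumptions, not the original training script: the two NLI examples are placeholders for the full SNLI and MultiNLI training sets, and the real run performs 30,000 steps.

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses, models

# Base checkpoint and pooling as described above: BioBERT weights + average pooling.
word_model = models.Transformer("monologg/biobert_v1.1_pubmed", max_seq_length=128)
pooling = models.Pooling(word_model.get_word_embedding_dimension(), pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[word_model, pooling])

# Placeholder pairs; the actual run uses the full SNLI and MultiNLI training sets
# with labels 0 = contradiction, 1 = entailment, 2 = neutral.
train_examples = [
    InputExample(texts=["A man inspects a uniform.", "The man is sleeping."], label=0),
    InputExample(texts=["Two kids are playing soccer.", "Some children are playing a sport."], label=1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

# Softmax classification loss over the three NLI labels, as in Sentence-BERT [3].
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# The actual run trains for 30,000 steps with 1,450 warmup steps (see the table above).
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=1450)
```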
Performance: The model was evaluated on the test portion of the STS benchmark using Spearman rank correlation, and compared with a general-domain BERT base model fine-tuned with the same procedure to verify that the two behave similarly.
Model | Score |
---|---|
biobert-nli (this) | 73.40 |
gsarti/scibert-nli | 74.50 |
bert-base-nli-mean-tokens [3] | 77.12 |
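A comparable evaluation can be sketched by correlating cosine similarities with the gold STS scores. The snippet below is an approximation under stated assumptions: it uses the stsb_multi_mt mirror of the STS benchmark test split and the hypothetical Hub id gsarti/biobert-nli, so the exact number may not match the table.

```python
from datasets import load_dataset
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

# Assumed Hub id for this model and an assumed Hub mirror of the STS benchmark test split.
model = SentenceTransformer("gsarti/biobert-nli")
sts_test = load_dataset("stsb_multi_mt", name="en", split="test")

emb1 = model.encode(sts_test["sentence1"], convert_to_tensor=True)
emb2 = model.encode(sts_test["sentence2"], convert_to_tensor=True)

# Cosine similarity of each sentence pair, compared to the gold scores via Spearman.
cosine_scores = util.cos_sim(emb1, emb2).diagonal().cpu().numpy()
spearman, _ = spearmanr(cosine_scores, sts_test["similarity_score"])
print(f"Spearman rank correlation: {spearman:.4f}")
```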
An example of similarity-based scientific paper retrieval with this model is provided in the Covid Papers Browser repository; a minimal standalone sketch is shown below.
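This sketch is independent of the Covid Papers Browser code: the abstracts, the query, and the Hub id gsarti/biobert-nli are assumptions used only to show the cosine-similarity ranking pattern.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed Hub id for this model; replace it if the repository name differs.
model = SentenceTransformer("gsarti/biobert-nli")

abstracts = [
    "We characterize the structure of the SARS-CoV-2 spike glycoprotein.",
    "A randomized controlled trial of an antiviral drug in hospitalized patients.",
    "Deep learning methods for protein structure prediction.",
]
query = "efficacy of antiviral treatments for COVID-19"

corpus_embeddings = model.encode(abstracts, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the abstracts by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {abstracts[hit['corpus_id']]}")
```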
References:
[1] J. Lee et al., BioBERT: a pre-trained biomedical language representation model for biomedical text mining
[2] A. Conneau et al., Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
[3] N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks