julien-c (HF staff) committed
Commit 153d341
1 Parent(s): 44b34b9

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/gsarti/biobert-nli/README.md

Files changed (1): README.md +37 -0
README.md ADDED
@@ -0,0 +1,37 @@
# BioBERT-NLI

This is the model [BioBERT](https://github.com/dmis-lab/biobert) [1] fine-tuned on the [SNLI](https://nlp.stanford.edu/projects/snli/) and the [MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) datasets using the [`sentence-transformers` library](https://github.com/UKPLab/sentence-transformers/) to produce universal sentence embeddings [2].

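A minimal usage sketch with the `sentence-transformers` library, assuming the checkpoint is loaded from the Hub under the `gsarti/biobert-nli` id shown in the file history above (the example sentences are illustrative):

```python
from sentence_transformers import SentenceTransformer

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model = SentenceTransformer("gsarti/biobert-nli")

# Encode a batch of sentences into fixed-size embeddings.
sentences = [
    "Aspirin inhibits platelet aggregation.",
    "Acetylsalicylic acid prevents blood clot formation.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768) for a BERT-base encoder
```
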
The model uses the original BERT wordpiece vocabulary and was trained using the **average pooling strategy** and a **softmax loss**.

**Base model**: `monologg/biobert_v1.1_pubmed` from HuggingFace's `AutoModel`.

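For reference, here is a sketch of the average pooling strategy applied on top of the base checkpoint with `transformers` directly: a masked mean over the token embeddings. This is an illustration of the pooling idea, not the original training code:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("monologg/biobert_v1.1_pubmed")
model = AutoModel.from_pretrained("monologg/biobert_v1.1_pubmed")

inputs = tokenizer(
    ["Aspirin inhibits platelet aggregation."],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = model(**inputs).last_hidden_state  # (batch, seq, hidden)

# Average pooling: mean over token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1)
print(sentence_embedding.shape)  # torch.Size([1, 768])
```
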
**Training time**: ~6 hours on an NVIDIA Tesla P100 GPU provided by Kaggle Notebooks.

**Parameters**:

| Parameter        | Value |
|------------------|-------|
| Batch size       | 64    |
| Training steps   | 30000 |
| Warmup steps     | 1450  |
| Lowercasing      | False |
| Max. seq. length | 128   |

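A sketch of how such a fine-tuning run can be set up with `sentence-transformers` and a softmax loss over the three NLI labels, using the hyperparameters from the table above. This is illustrative, not the original training script, and the toy examples stand in for the real SNLI/MultiNLI data:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Base encoder plus an average pooling head.
word_embedding_model = models.Transformer("monologg/biobert_v1.1_pubmed", max_seq_length=128)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# NLI examples: premise/hypothesis pairs with 3 labels
# (contradiction / entailment / neutral). Toy data shown here.
train_examples = [
    InputExample(texts=["A man is eating.", "A person eats."], label=1),
    InputExample(texts=["A man is eating.", "Nobody is eating."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1,
          steps_per_epoch=30000,  # training steps from the table above
          warmup_steps=1450)
```
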
**Performance**: The model was evaluated on the test portion of the [STS benchmark](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) using Spearman rank correlation, and compared to a general-domain BERT base model fine-tuned with the same procedure to verify that the two behave similarly.

| Model                           | Score |
|---------------------------------|-------|
| `biobert-nli` (this model)      | 73.40 |
| `gsarti/scibert-nli`            | 74.50 |
| `bert-base-nli-mean-tokens` [3] | 77.12 |

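A minimal sketch of the metric: Spearman rank correlation between the cosine similarities of the embedding pairs and the gold STS scores. The two hand-written pairs below stand in for the actual STS benchmark test data:

```python
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gsarti/biobert-nli")

# Hypothetical STS-style sentence pairs with gold scores in [0, 5].
pairs = [("A man plays guitar.", "A person plays an instrument.", 3.8),
         ("A dog runs outside.", "The stock market fell today.", 0.2)]

emb1 = model.encode([p[0] for p in pairs])
emb2 = model.encode([p[1] for p in pairs])
gold = [p[2] for p in pairs]

# Cosine similarity for each pair of embeddings.
cos = np.sum(emb1 * emb2, axis=1) / (
    np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1))

print(spearmanr(cos, gold).correlation)
```
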
An example usage for similarity-based scientific paper retrieval is provided in the [Covid Papers Browser](https://github.com/gsarti/covid-papers-browser) repository.

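The idea behind that application, in a nutshell: embed a query and a corpus of paper abstracts, then rank the corpus by cosine similarity to the query. A sketch with made-up corpus entries:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gsarti/biobert-nli")

corpus = ["Remdesivir shows antiviral activity against coronaviruses.",
          "Hand washing reduces transmission of respiratory infections.",
          "Transformer models improve machine translation quality."]
query = "Which drugs are effective against SARS-CoV-2?"

corpus_emb = model.encode(corpus)
query_emb = model.encode([query])[0]

# Rank corpus entries by cosine similarity to the query.
scores = corpus_emb @ query_emb / (
    np.linalg.norm(corpus_emb, axis=1) * np.linalg.norm(query_emb))
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```
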
**References:**

[1] J. Lee et al., [BioBERT: a pre-trained biomedical language representation model for biomedical text mining](https://academic.oup.com/bioinformatics/article/36/4/1234/5566506)

[2] A. Conneau et al., [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://www.aclweb.org/anthology/D17-1070/)

[3] N. Reimers and I. Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://www.aclweb.org/anthology/D19-1410/)