# CovidBERT-NLI

This is the model **CovidBERT**, trained by deepset on AllenAI's CORD-19 dataset of scientific articles about coronaviruses.

The model uses the original BERT wordpiece vocabulary and was subsequently fine-tuned on the SNLI and the MultiNLI datasets using the sentence-transformers library to produce universal sentence embeddings [1] using the average pooling strategy and a softmax loss.
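As a rough illustration of how the resulting encoder can be used with `sentence-transformers` to produce mean-pooled sentence embeddings (a minimal sketch; the `gsarti/covidbert-nli` model id and the example sentences are assumptions, adjust them to the checkpoint you are using):

```python
from sentence_transformers import SentenceTransformer, models

# Wrap the fine-tuned encoder and add mean pooling, mirroring the
# average-pooling strategy described above.
word_embedding_model = models.Transformer("gsarti/covidbert-nli", max_seq_length=128)  # assumed model id
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

sentences = [
    "Coronaviruses are enveloped RNA viruses.",
    "SARS-CoV-2 spreads primarily through respiratory droplets.",
]
embeddings = model.encode(sentences)  # NumPy array of shape (2, hidden_size)
print(embeddings.shape)
```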

Parameter details for the original training on CORD-19 are available on deepset's MLFlow.

**Base model**: `deepset/covid_bert_base` from HuggingFace's `AutoModel`.

**Training time**: ~6 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.

**Parameters**:

| Parameter        | Value |
|------------------|-------|
| Batch size       | 64    |
| Training steps   | 23000 |
| Warmup steps     | 1450  |
| Lowercasing      | True  |
| Max. Seq. Length | 128   |
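For reference, a minimal sketch of how such an NLI fine-tuning run can be set up with `sentence-transformers`, using the hyper-parameters listed above; the training examples below are illustrative toy data, not the original SNLI/MultiNLI loading script:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample

# Mean pooling over the CovidBERT base model, as described above.
word_embedding_model = models.Transformer(
    "deepset/covid_bert_base", max_seq_length=128, do_lower_case=True
)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy premise/hypothesis pairs; in practice these come from SNLI + MultiNLI
# (labels: 0 = contradiction, 1 = entailment, 2 = neutral).
train_examples = [
    InputExample(texts=["A virus infects a cell.", "A cell is infected."], label=1),
    InputExample(texts=["The patient recovered.", "The patient died."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# The original run used ~23000 training steps with 1450 warmup steps.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=1450,
)
```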

**Performance**: The model was evaluated on the test portion of the STS dataset using Spearman rank correlation and compared against similar models obtained with the same procedure.

| Model                           | Score |
|---------------------------------|-------|
| `covidbert-nli` (this)          | 67.52 |
| `gsarti/biobert-nli`            | 73.40 |
| `gsarti/scibert-nli`            | 74.50 |
| `bert-base-nli-mean-tokens` [2] | 77.12 |
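The evaluation procedure can be sketched as follows (assuming the model is published as `gsarti/covidbert-nli`; the sentence pairs and gold scores below are placeholders for the actual STS test split):

```python
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gsarti/covidbert-nli")  # assumed Hub id

# Placeholder pairs; in the actual evaluation these are the STS test
# sentences with their gold similarity scores.
sentences1 = ["A man is playing a guitar.", "A dog runs in the park.", "Two women are talking."]
sentences2 = ["A person plays an instrument.", "A cat sleeps indoors.", "Two people are having a conversation."]
gold_scores = [4.2, 0.5, 4.6]

emb1 = model.encode(sentences1, convert_to_numpy=True)
emb2 = model.encode(sentences2, convert_to_numpy=True)

# Cosine similarity per pair, then Spearman rank correlation with the gold scores.
cosine = np.sum(emb1 * emb2, axis=1) / (
    np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
)
print(spearmanr(cosine, gold_scores).correlation)
```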

An example usage for similarity-based scientific paper retrieval is provided in the Covid-19 Semantic Browser repository.
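A minimal retrieval sketch in the same spirit (the model id and abstracts are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("gsarti/covidbert-nli")  # assumed Hub id

# Illustrative corpus of abstracts; embed once and reuse for many queries.
abstracts = [
    "We study the transmission dynamics of SARS-CoV-2 in households.",
    "This paper surveys deep learning methods for protein folding.",
    "Clinical outcomes of patients treated with remdesivir are reported.",
]
corpus_embeddings = model.encode(abstracts, convert_to_tensor=True)

query = "How does the coronavirus spread between family members?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank abstracts by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {abstracts[hit['corpus_id']]}")
```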

References:

[1] A. Conneau et al., Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

[2] N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks