Model: gsarti/scibert-nli

Contributed by Gabriele Sarti (gsarti)

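How to use this model directly from the 🤗/transformers library:

    from transformers import AutoTokenizer, AutoModelWithLMHead

    # Download the tokenizer and the model weights by model identifier.
    tokenizer = AutoTokenizer.from_pretrained("gsarti/scibert-nli")
    model = AutoModelWithLMHead.from_pretrained("gsarti/scibert-nli")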

SciBERT-NLI

This is SciBERT [1] fine-tuned on the SNLI and MultiNLI datasets with the sentence-transformers library to produce universal sentence embeddings [2].

The model uses the original scivocab wordpiece vocabulary and was trained using the average pooling strategy and a softmax loss.
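
As a rough illustration of the average pooling strategy mentioned above, the sketch below averages per-token vectors into one sentence vector while skipping padding positions. The helper name `mean_pool`, the toy vectors, and the mask are illustrative assumptions and not part of the sentence-transformers API; a real pipeline would typically do this on tensors rather than Python lists.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the vectors of non-padding tokens into one sentence vector.

    Hypothetical helper for illustration only.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only the tokens whose mask entry is non-zero (real tokens).
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three token vectors, the last one being padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Masking matters because padded positions would otherwise pull the average toward the padding vector.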

Base model: allenai/scibert-scivocab-cased, loaded with HuggingFace's AutoModel.

Training time: ~4 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.

Parameters:

Parameter         Value
Batch size        64
Training steps    20000
Warmup steps      1450
Lowercasing       True
Max. Seq. Length  128
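
The parameters above give the warmup and total step counts but not the learning-rate schedule itself. Assuming a linear warmup followed by a linear decay to zero (a common choice, but an assumption here, as is the helper name `lr_multiplier`), the factor applied to the base learning rate could be sketched as:

```python
def lr_multiplier(step: int, warmup_steps: int = 1450,
                  total_steps: int = 20000) -> float:
    """Return the factor applied to the base learning rate at a given step.

    Assumed schedule: linear warmup from 0 to 1, then linear decay to 0.
    """
    if not 0 <= warmup_steps < total_steps:
        raise ValueError("need 0 <= warmup_steps < total_steps")
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step count.
        return step / warmup_steps
    # Decay phase: shrink linearly, never dropping below zero.
    remaining = max(total_steps - step, 0)
    return remaining / (total_steps - warmup_steps)


for step in (0, 725, 1450, 10725, 20000):
    print(step, lr_multiplier(step))
```

The base learning rate is not stated in the card, so the sketch returns only the multiplier.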

Performance: The model was evaluated on the test portion of the STS dataset using Spearman rank correlation. Its score is compared with that of a general BERT base model trained with the same procedure, to check that the two perform similarly.

Model                          Score
scibert-nli (this)             74.50
bert-base-nli-mean-tokens [3]  77.12
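
For readers unfamiliar with the metric, here is a minimal sketch of Spearman rank correlation: the Pearson correlation of the two rank sequences, with tied values given their average rank. The gold and predicted scores are invented toy values, the function names are illustrative, and reading the reported scores as correlations multiplied by 100 is an assumption.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: human similarity ratings and model cosine similarities.
gold_scores = [0.5, 4.8, 2.0, 3.5]
predicted_similarities = [0.10, 0.95, 0.40, 0.30]
rho = spearman(gold_scores, predicted_similarities)
print(round(rho * 100, 2))
```

Because only the ordering matters, the metric is insensitive to the scale of the similarity scores, which suits cosine similarities compared against human ratings.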

An example usage for similarity-based scientific paper retrieval is provided in the Covid Papers Browser repository.
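
The general idea behind similarity-based retrieval can be sketched as ranking papers by the cosine similarity between their embeddings and a query embedding. This is not the code from that repository: the vectors below are toy stand-ins for sentence embeddings, and `cosine_similarity` and `rank_papers` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_papers(query: Sequence[float],
                papers: Dict[str, Sequence[float]],
                top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k (paper id, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in papers.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings; real ones would come from encoding abstracts or titles.
paper_vectors = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.0, 1.0, 0.0],
    "paper_c": [0.7, 0.7, 0.0],
}
query_vector = [1.0, 0.0, 0.0]

results = rank_papers(query_vector, paper_vectors, top_k=2)
for name, score in results:
    print(name, round(score, 3))
```

For a large collection, the paper embeddings would normally be computed once and stored, so that only the query needs encoding at search time.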

References:

[1] I. Beltagy et al., SciBERT: A Pretrained Language Model for Scientific Text

[2] A. Conneau et al., Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

[3] N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks