The model uses the original scivocab wordpiece vocabulary and was trained using the average pooling strategy and a softmax loss.
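To illustrate what an average pooling strategy does, here is a minimal sketch in plain Python. It assumes padding positions are excluded through an attention mask, as is common for mean pooling over transformer outputs. The function name `mean_pool` and the toy vectors are hypothetical and are not taken from the model's training code.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over the positions where attention_mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [s / count for s in summed]


# Toy input (hypothetical): three real tokens plus one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [3.0, 4.0]
```

The padding vector `[9.0, 9.0]` does not affect the result, so sentences of different lengths map to fixed-size vectors that can be compared directly.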
allenai/scibert-scivocab-cased from HuggingFace's
Training time: ~4 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
|Max. Seq. Length|128|
Performance: The model was evaluated on the test portion of the STS dataset using Spearman rank correlation and compared with a general BERT base model trained with the same procedure, to verify that the two perform similarly.
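As a rough illustration of this kind of evaluation, the sketch below computes the Spearman rank correlation between predicted similarity scores and gold similarity labels using only the standard library. The helper names and the five score pairs are hypothetical. They are not the actual evaluation script or results.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: gold similarity labels and model cosine similarities.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]

rho = spearman(predicted, gold)
print(round(rho, 3))  # 0.9
```

Because only the ordering of the scores matters, the metric is unaffected by the different scales of the cosine similarities and the gold labels.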
An example usage for similarity-based scientific paper retrieval is provided in the Covid Papers Browser repository.
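The general idea behind similarity-based retrieval can be sketched as follows: embed the query and each paper, then rank papers by cosine similarity to the query. This is a minimal illustration with hypothetical three-dimensional vectors and titles, not the code used in the Covid Papers Browser repository. In practice the vectors would be sentence embeddings produced by the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_papers(query_vec, paper_vecs, top_k=2):
    """Return the top_k paper titles ordered by similarity to the query."""
    scored = [(cosine(query_vec, vec), title) for title, vec in paper_vecs.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]


# Hypothetical embeddings standing in for real sentence vectors.
papers = {
    "paper A": [0.9, 0.1, 0.0],
    "paper B": [0.1, 0.9, 0.2],
    "paper C": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

top = rank_papers(query, papers)
print(top)  # ['paper A', 'paper C']
```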
I. Beltagy et al., SciBERT: A Pretrained Language Model for Scientific Text
N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks