fill-mask mask_token: [MASK]
API endpoint  


curl -X POST \
    -H "Authorization: Bearer YOUR_ORG_OR_USER_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '"json encoded string"' \
    https://api-inference.huggingface.co/models/NeuML/bert-small-cord19
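
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name `build_request` and the example sentence are assumptions, and the token is the same placeholder used in the curl call. The request is built but not sent, since sending requires a valid token and network access.

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/NeuML/bert-small-cord19"


def build_request(text, api_token):
    """Build (but do not send) a POST request mirroring the curl call."""
    # The body is a JSON-encoded string, like the curl -d argument.
    body = json.dumps(text).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical input sentence containing the model's mask token.
request = build_request(
    "The patient developed a severe [MASK].",
    "YOUR_ORG_OR_USER_API_TOKEN",
)

# To actually send it (needs a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```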


pytorch

tf

Contributed by

NeuML company
1 team member · 3 models

How to use this model directly from the 🤗/transformers library:

			
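# AutoModelWithLMHead is deprecated in recent transformers releases;
# AutoModelForMaskedLM is the equivalent class for BERT-style models.
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("NeuML/bert-small-cord19")
model = AutoModelForMaskedLM.from_pretrained("NeuML/bert-small-cord19")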
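
Because this is a fill-mask model, one way to try it is through the `fill-mask` pipeline in transformers. The sketch below is an illustration under assumptions: the helper names `format_predictions` and `predict_masked` and the example sentence are hypothetical, and the actual model call is left commented out because it downloads the weights on first use.

```python
MODEL_ID = "NeuML/bert-small-cord19"


def format_predictions(predictions, limit=3):
    """Turn fill-mask output dicts into readable 'token (score)' strings."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["token_str"]} ({p["score"]:.3f})' for p in ranked[:limit]]


def predict_masked(text, limit=3):
    """Run the fill-mask pipeline on text containing one [MASK] token."""
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID, tokenizer=MODEL_ID)
    return format_predictions(fill_mask(text), limit)


# Example call (hypothetical sentence; downloads the model on first use):
# print(predict_masked("The virus spreads through respiratory [MASK]."))
```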

BERT-Small fine-tuned on CORD-19 dataset

BERT L-6_H-512_A-8 model (6 layers, hidden size 512, 8 attention heads) fine-tuned on the CORD-19 dataset.

CORD-19 data subset

The training data for this model is stored as a Kaggle dataset. It is a subset of the full CORD-19 corpus, restricted to high-quality articles for which a study design was detected.

Building the model

python run_language_modeling.py \
    --model_type bert \
    --model_name_or_path google/bert_uncased_L-6_H-512_A-8 \
    --do_train \
    --mlm \
    --line_by_line \
    --block_size 512 \
    --train_data_file cord19.txt \
    --per_gpu_train_batch_size 4 \
    --learning_rate 3e-5 \
    --num_train_epochs 3.0 \
    --output_dir bert-small-cord19 \
    --save_steps 0 \
    --overwrite_output_dir
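
With `--line_by_line`, each line of the training file is handled as a separate sequence, so `cord19.txt` needs one text per line. How the original file was produced is not described here; the following is a hypothetical sketch of preparing such a file, with an assumed helper name `write_line_by_line` and made-up sample texts.

```python
import re
import tempfile
from pathlib import Path


def write_line_by_line(texts, path):
    """Write one whitespace-normalized text per line, skipping empty ones."""
    lines = []
    for text in texts:
        # Collapse internal newlines and runs of spaces so each text fits on one line.
        flat = re.sub(r"\s+", " ", text).strip()
        if flat:
            lines.append(flat)
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


# Made-up sample texts standing in for article passages.
samples = [
    "First passage,\nsplit across two lines.",
    "   ",
    "Second   passage.",
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "cord19.txt"
    count = write_line_by_line(samples, target)
    written = target.read_text(encoding="utf-8").splitlines()
```

Sequences longer than `--block_size` tokens are truncated by the training script, so very long passages may be worth splitting before they are written out.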