---
license: mit
---
# DistilBERT

This model is a [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) model fine-tuned on the
[NewsQA](https://huggingface.co/datasets/lucadiliello/newsqa) extractive question-answering dataset.
## Hyperparameters

```
batch_size = 16
n_epochs = 3
max_seq_len = 512
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = LinearWarmup
weight_decay = 0.01
embeds_dropout_prob = 0.1
```
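The `LinearWarmup` schedule above ramps the learning rate up from zero to `learning_rate` and then decays it linearly back to zero over training. A minimal sketch of that schedule in plain Python, assuming a hypothetical 10% warmup fraction (the warmup proportion is not stated in this card):

```python
def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_frac=0.1):
    """Linear warmup to base_lr, then linear decay to zero.

    warmup_frac is an assumed value for illustration; the actual
    warmup proportion used for this model is not documented here.
    """
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Ramp up: lr grows linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: lr falls linearly from base_lr to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

For example, with `total_steps=100` the rate is `0.0` at step 0, peaks at `2e-5` when warmup ends (step 10), and returns to `0.0` at step 100.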