---
license: mit
---

# DistilBERT

The DistilBERT model is a [distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) model fine-tuned on the [NewsQA](https://huggingface.co/datasets/lucadiliello/newsqa) dataset.

## Hyperparameters

```
batch_size = 16
n_epochs = 3
max_seq_len = 512
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = LinearWarmup
weight_decay = 0.01
embeds_dropout_prob = 0.1
```
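
## Usage

A minimal inference sketch using the `transformers` question-answering pipeline. The repository id below is a placeholder, not the card's actual Hub id; substitute the real id of this checkpoint.

```python
from transformers import pipeline

# Hypothetical repository id; replace with the actual Hub id of this model.
qa = pipeline("question-answering", model="your-username/distilbert-newsqa")

# Extractive QA: the model selects the answer span from the given context.
result = qa(
    question="Which dataset was the model fine-tuned on?",
    context="The DistilBERT model was fine-tuned on the NewsQA dataset for extractive question answering.",
)
print(result["answer"], result["score"])
```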