Matthijs0/Distilled-RoBERTa

Question Answering

---
license: mit
---
# Distilled-RoBERTa

Distilled-RoBERTa is a distilled [RoBERTa](https://huggingface.co/deepset/roberta-base-squad2-distilled) model that was trained on the SQuAD 2.0 training set and then fine-tuned on the [NewsQA](https://huggingface.co/datasets/lucadiliello/newsqa) dataset.
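A minimal usage sketch, assuming the checkpoint loads with the standard `transformers` question-answering pipeline. The repository id `Matthijs0/Distilled-RoBERTa` is taken from the page header, and the helper name `answer_question` is hypothetical; neither the loading path nor the helper is confirmed by this card.

```python
# Assumption: the repository id below is read from the model page header.
MODEL_ID = "Matthijs0/Distilled-RoBERTa"


def answer_question(question: str, context: str) -> dict:
    """Run extractive question answering over `context` (hypothetical helper).

    transformers is imported lazily so that importing this module does not
    require the library; the checkpoint is downloaded on the first call.
    """
    from transformers import pipeline

    qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)
    # The pipeline returns a dict with the keys: score, start, end, answer.
    return qa(question=question, context=context)
```

Calling `answer_question("Who wrote the report?", article_text)` with a news article in `article_text` should return the extracted span under the `answer` key together with its character offsets and a confidence score.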

## Hyperparameters
```
batch_size = 16
n_epochs = 3
max_seq_len = 512
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = LinearWarmup
weight_decay = 0.01
embeds_dropout_prob = 0.1
```
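The card names the schedule only as `LinearWarmup`, so the sketch below is one plausible reading rather than the exact training code: the learning rate rises linearly from 0 to `learning_rate = 2e-5`, then decays linearly back to 0. The warmup proportion of 0.1, the total step count, and the function name `linear_warmup_lr` are assumptions that do not appear in the card.

```python
def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = 2e-5,
                     warmup_proportion: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay.

    base_lr matches the card's learning_rate; warmup_proportion is an
    assumed value, since the card does not state one.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    decay_steps = max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / decay_steps)


# Hypothetical run length; the real value depends on the dataset size,
# batch_size = 16, and n_epochs = 3.
TOTAL_STEPS = 1000
schedule = [linear_warmup_lr(s, TOTAL_STEPS) for s in range(TOTAL_STEPS + 1)]
```

Under these assumptions the rate peaks at step 100 and reaches zero at the final step. In a PyTorch setup, `transformers.get_linear_schedule_with_warmup` implements the same shape.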