roberta-large-squad_epoch_3
Model description
This model is a fine-tuned version of DistilBERT for extractive question answering, trained on the SQuAD dataset.
Training procedure
The model was trained with the following hyperparameters:
- Learning Rate: 5e-05
- Batch Size: 8
- Epochs: 3
- Weight Decay: 0.01
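As a rough sketch, the hyperparameters above could be passed to the Hugging Face `Trainer` API as follows. The card does not state which training script was used, so the mapping onto `TrainingArguments` keywords is an assumption, as is reading "Batch Size" as a per-device value. `output_dir` is a hypothetical path.

```python
# Values copied from the hyperparameter list above. The keys are
# transformers.TrainingArguments keyword names (an assumed mapping).
HYPERPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumption: "Batch Size" is per device
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}


def build_training_args(output_dir="qa-finetune"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is hypothetical)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Any setting not listed in the card (scheduler, warmup, maximum sequence length) would fall back to the library defaults in this sketch, which may differ from the original run.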
Intended uses & limitations
This model is intended to be used for question answering tasks, particularly on SQuAD-like datasets. It performs best on factual questions where the answer can be found as a span of text within the given context.
Training Details
Training Data
The model was trained on the SQuAD dataset, which consists of questions posed by crowdworkers on a set of Wikipedia articles.
Uses
This model can be used for:
- Extracting answers from text passages given questions
- Question answering tasks
- Reading comprehension tasks
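A minimal usage sketch for the tasks listed above, assuming the checkpoint loads through the standard `transformers` question-answering pipeline under the repository id shown on this page. `load_qa_pipeline` and `extract_answer` are illustrative helper names, not part of any library.

```python
MODEL_ID = "clementlemon02/roberta-large-squad_epoch_3"


def load_qa_pipeline(model_id=MODEL_ID):
    """Create a question-answering pipeline (downloads weights on first use)."""
    from transformers import pipeline

    return pipeline("question-answering", model=model_id)


def extract_answer(qa, question, context):
    """Run a QA callable and return (answer_text, score)."""
    # For a single question, the pipeline returns a dict with the keys
    # "answer", "score", "start" and "end".
    result = qa(question=question, context=context)
    return result["answer"], result["score"]


# Example call (requires network access and the transformers package):
# qa = load_qa_pipeline()
# extract_answer(qa, "Where is the Eiffel Tower?",
#                "The Eiffel Tower is in Paris.")
```

Passing the pipeline in as an argument keeps `extract_answer` easy to exercise with a stand-in callable when the model is not available.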
Limitations
- The model can only extract answers that are directly present in the given context
- Performance may vary on out-of-domain texts
- The model may struggle with complex reasoning questions
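The first limitation follows from how extractive question-answering heads usually produce an answer: the model scores every token as a possible start and end, and the answer is the context span with the best combined score. The sketch below illustrates that common decoding step. It is not confirmed to be this model's exact post-processing, and the scores are made-up values.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end, score) of the best span with start <= end."""
    best = None
    for s, s_score in enumerate(start_scores):
        # Only consider spans of at most max_answer_len tokens.
        last = min(s + max_answer_len, len(end_scores))
        for e in range(s, last):
            score = s_score + end_scores[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris", "."]
start_scores = [0.1, 0.2, 0.0, 0.0, 0.3, 4.0, 0.0]  # made-up values
end_scores = [0.0, 0.1, 0.5, 0.0, 0.0, 3.5, 0.2]    # made-up values

s, e, _ = best_span(start_scores, end_scores)
answer = " ".join(tokens[s : e + 1])  # the answer is always a slice of the context
```

Because the output is a slice of the input tokens, an answer that needs rewording, counting, or combining facts from several places cannot be produced this way.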
Additional Information
- Model type: DistilBERT
- Language: English
- License: MIT
- Framework: PyTorch
Dataset used to train clementlemon02/roberta-large-squad_epoch_3
Evaluation results
- Exact Match on SQuAD (validation set, self-reported): N/A
- F1 on SQuAD (validation set, self-reported): N/A
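Since no scores are reported, the two metrics could be computed locally. The sketch below follows the normalization used by the SQuAD v1.1 evaluation script as I understand it (lowercase, strip punctuation, drop English articles, collapse whitespace). The function names are illustrative, and it scores one prediction against a single reference answer.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual convention is to take the maximum score over the references and then average over all questions.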