DistilBERT Fine-Tuned on IMDB for Masked Language Modeling
Model Description
This model is a fine-tuned version of `distilbert/distilbert-base-uncased` for the masked language modeling task. It was trained on the IMDb dataset.
Model Training Details
Training Dataset
- Dataset: IMDb dataset from the Hugging Face Hub
- Dataset Splits:
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- Training Data: training was performed on the concatenation of the train and unsupervised splits (75,000 reviews in total), as sketched below.
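As an illustration, the two splits can be combined with `datasets.concatenate_datasets`. A minimal sketch; the card does not show the actual preprocessing code, so the variable names are placeholders:

```python
from datasets import concatenate_datasets, load_dataset

imdb = load_dataset("imdb")

# Labels are irrelevant for masked language modeling, so the labeled
# train split and the unlabeled unsupervised split can be merged as-is.
train_text = concatenate_datasets([imdb["train"], imdb["unsupervised"]])
print(len(train_text))  # 75000 reviews
```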
Training Arguments
The following parameters were used during fine-tuning:
- Number of Training Epochs: 10
- Overwrite Output Directory: `True`
- Evaluation Strategy: `steps`
- Evaluation Steps: 500
- Checkpoint Save Strategy: `steps`
- Save Steps: 500
- Load Best Model at End: `True`
- Metric for Best Model: `eval_loss`
- Direction: lower `eval_loss` is better (`greater_is_better = False`)
- Learning Rate: 2e-5
- Weight Decay: 0.01
- Per-Device Batch Size (Training): 32
- Per-Device Batch Size (Evaluation): 32
- Warmup Steps: 1,000
- Mixed Precision Training: Enabled (`fp16 = True`)
- Logging Steps: 100
- Gradient Accumulation Steps: 2
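These values map directly onto `transformers.TrainingArguments`. A minimal sketch, assuming a placeholder output directory name; note that very recent transformers releases spell `evaluation_strategy` as `eval_strategy`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # placeholder name
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```

With a per-device batch size of 32 and 2 gradient accumulation steps, the effective batch size is 64 per device.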
Early Stopping
- The model was configured with early stopping to prevent overfitting.
- Training stopped after 5.87 epochs (21,000 steps), as there was no significant improvement in `eval_loss`.
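With the `Trainer` API, early stopping of this kind is typically implemented with `EarlyStoppingCallback`, which relies on `load_best_model_at_end=True` and the `eval_loss` metric configured above. A minimal sketch; the patience value, the masking probability, and the tokenized dataset variables are assumptions, not taken from the card:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Masks random tokens at collation time; 15% is the library default,
# the card does not state the probability actually used.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,             # as sketched above
    train_dataset=tokenized_train,  # placeholder: tokenized train + unsupervised text
    eval_dataset=tokenized_eval,    # placeholder: tokenized test split
    data_collator=data_collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience assumed
)
trainer.train()
```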
Evaluation Results
- Metric Used: `eval_loss`
- Final Perplexity: 8.34
- Best Checkpoint: model saved at the end of early stopping (step 21,000).
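Perplexity for a masked language model is conventionally reported as `exp(eval_loss)`, so a final perplexity of 8.34 corresponds to an `eval_loss` of roughly 2.12. A minimal sketch, reusing the hypothetical `trainer` from above:

```python
import math

eval_results = trainer.evaluate()
perplexity = math.exp(eval_results["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")  # the card reports 8.34
```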
Model Usage
The model can be used for masked language modeling with the Hugging Face `fill-mask` pipeline. Example:
```python
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

# Print the full sentence for each predicted fill.
for pred in predictions:
    print(f">>> {pred['sequence']}")
```
Output example:

```
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```
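Each prediction returned by the pipeline is a dict containing `sequence`, `score`, `token`, and `token_str`; the `fill-mask` pipeline returns the top five candidates by default, which is why five completions appear above.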