my_distilbert_model

This model is a fine-tuned version of distilbert-base-uncased on the IMDB Movie Review dataset.

Model description

This model is a sentiment analysis classifier trained on the IMDB Movie Review dataset. It is based on the DistilBERT architecture, which is a distilled version of the BERT model, optimized for speed and efficiency while retaining good performance.

Intended uses & limitations

This model can be used for sentiment analysis tasks, such as classifying the sentiment of movie reviews as positive or negative. It is limited to the English language and may not perform well on text from other languages or domains.
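A minimal usage sketch with the transformers pipeline API is shown below; note that the exact label names returned depend on this model's configuration and may differ from the example output.

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("sentiment-analysis", model="MariamKili/my_distilbert_model")

print(classifier("This movie was a wonderful surprise from start to finish."))
# e.g. [{'label': 'LABEL_1', 'score': 0.99}] — label names depend on the model config
```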

Training and evaluation data

The model was trained on the IMDB Movie Review dataset, which consists of 50,000 movie reviews labeled as positive or negative.
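For reference, the dataset can be loaded with the datasets library; the standard IMDB release is split evenly into 25,000 training and 25,000 test reviews.

```python
from datasets import load_dataset

# Standard IMDB split: 25,000 training and 25,000 test reviews,
# labeled 0 (negative) or 1 (positive).
imdb = load_dataset("imdb")
print(imdb["train"].num_rows, imdb["test"].num_rows)  # 25000 25000
print(imdb["train"][0]["label"], imdb["train"][0]["text"][:80])
```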

Training procedure

  • Model definition: define and compile BERT and DistilBERT sequence-classification models for sentiment classification.
  • Hyperparameter tuning: vary the learning rate, batch size, and number of epochs to find an optimal configuration.
  • TensorFlow datasets: convert the training and test data into TensorFlow datasets for efficient training.
  • Early stopping: monitor validation loss and stop training early to prevent overfitting.
  • Training: train the models on the prepared training dataset in batches (a minimal sketch of the full procedure follows this list).
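The sketch below assumes the standard IMDB split and distilbert-base-uncased as the base checkpoint; the maximum sequence length, batch size, epoch count, and early-stopping patience are assumptions, while the learning rate matches the value listed under Training hyperparameters.

```python
import tensorflow as tf
from datasets import load_dataset
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Tokenize the IMDB reviews (the max_length used here is an assumption).
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
imdb = load_dataset("imdb")
tokenized = imdb.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)

model = TFAutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Convert to tf.data pipelines for efficient batched training
# (the batch size is an assumption).
train_ds = model.prepare_tf_dataset(
    tokenized["train"], batch_size=16, shuffle=True, tokenizer=tokenizer
)
val_ds = model.prepare_tf_dataset(
    tokenized["test"], batch_size=16, shuffle=False, tokenizer=tokenizer
)

# Compile with the learning rate listed under "Training hyperparameters";
# the model's built-in loss is used when none is passed.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5))

# Early stopping on validation loss, as described above (patience is an assumption).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True
)
model.fit(train_ds, validation_data=val_ds, epochs=3, callbacks=[early_stop])
```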

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 2e-05, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
  • training_precision: float32
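For reference, the logged configuration corresponds to the following Keras optimizer construction (a sketch; the fields shown as None above are Adam's disabled defaults).

```python
import tensorflow as tf

# Adam optimizer matching the logged configuration; weight decay,
# gradient clipping, and EMA were left at their disabled defaults.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=2e-05,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-08,
    amsgrad=False,
    jit_compile=True,
)
```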

Training results

  • Accuracy: 91% of the reviews in the test set were classified correctly, matching the accuracy achieved by the full BERT model (91%).
  • Precision:
    • Negative Class: 89% of the predicted negative reviews were actually negative.
    • Positive Class: 93% of the predicted positive reviews were actually positive.
  • Recall:
    • Negative Class: 93% of the actual negative reviews were correctly classified.
    • Positive Class: 89% of the actual positive reviews were correctly classified.
  • F1-Score: The F1-score for both classes is around 0.91, indicating a good balance between precision and recall.
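As a sketch, per-class figures like these can be computed from test-set predictions with scikit-learn's classification_report; y_true and y_pred below are illustrative placeholders, not the model's actual outputs.

```python
from sklearn.metrics import classification_report

# Placeholder labels for illustration only: 0 = negative, 1 = positive.
# In practice y_true comes from the IMDB test split and y_pred from the model.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 1]

# Prints per-class precision, recall, and F1, plus overall accuracy.
print(classification_report(y_true, y_pred, target_names=["negative", "positive"]))
```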

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.1
  • Tokenizers 0.15.2