Purpose: The primary purpose of this fine-tuned model is to perform sentiment analysis on English movie reviews. It classifies text into positive or negative sentiments based on the content of the review. This model has been trained and evaluated on a subset of the IMDb reviews dataset, making it particularly well-suited for analyzing movie review sentiments.
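
For illustration, a minimal inference sketch using the transformers pipeline API is shown below. The model identifier is a placeholder for wherever this checkpoint is hosted, and the example reviews are invented; the label names depend on the checkpoint's id2label configuration.

```python
from transformers import pipeline

# Placeholder identifier: substitute the actual repository name of this checkpoint.
classifier = pipeline(
    "sentiment-analysis",
    model="your-username/distilbert-imdb-sentiment",
)

reviews = [
    "A beautifully shot film, but the script falls completely flat.",
    "One of the most moving performances I have seen in years.",
]
for review, prediction in zip(reviews, classifier(reviews)):
    # Each prediction is a dict such as {'label': 'NEGATIVE', 'score': 0.99};
    # the label names come from the checkpoint's id2label mapping.
    print(prediction["label"], round(prediction["score"], 4), review)
```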

Results

This section outlines the performance of the model on the sentiment analysis task using the IMDb movie reviews dataset, both before and after the fine-tuning process. The results highlight the effectiveness of fine-tuning in enhancing model accuracy and generalization.

Pre-Fine-Tuning Evaluation:

Before fine-tuning, the model was evaluated on the IMDb dataset to establish a baseline for its performance. The initial evaluation yielded the following metrics:

  • Loss: 0.6518
  • Evaluation Runtime: 42.6174 seconds
  • Samples per Second: 58.662

These results reflect the original distilbert-base-uncased checkpoint before any task-specific training: since the base model carries no sentiment classification head, the baseline loss is essentially chance-level, as expected from a randomly initialized head.
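
A baseline of this form can be reproduced with a sketch like the following; the evaluation slice (2,500 shuffled test reviews), batch size, and output directory are assumptions rather than the exact script behind this card.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# The base checkpoint has no sentiment head, so a fresh two-label
# classification head is initialized on top of it.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Assumed evaluation subset: 2,500 shuffled examples from the IMDb test split.
eval_ds = load_dataset("imdb", split="test").shuffle(seed=42).select(range(2500))
eval_ds = eval_ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="baseline-eval", per_device_eval_batch_size=8),
    eval_dataset=eval_ds,
)
# Returns a dict including eval_loss, eval_runtime, and eval_samples_per_second.
print(trainer.evaluate())
```

The reported numbers imply roughly 2,500 evaluation examples (about 42.6 s × 58.7 samples/s), which is why that subset size is assumed here.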

Post-Fine-Tuning Evaluation:

After fine-tuning, the model was re-evaluated on the same dataset to assess improvements from the training adjustments:

  • Loss: 6.091e-06
  • Evaluation Runtime: 39.3821 seconds
  • Samples per Second: 63.481

The near-zero evaluation loss shows that fine-tuning dramatically improved the model's ability to classify sentiment in these movie reviews. The small difference in evaluation runtime and throughput, however, reflects ordinary run-to-run variation rather than an efficiency gain: fine-tuning does not change the model's architecture or inference cost.

Discussion:

The dramatic decrease in evaluation loss post-fine-tuning highlights the effectiveness of adapting the DistilBERT model to a specific dataset and task. This adaptation has markedly improved the model's predictive accuracy, making it a valuable tool for applications involving sentiment analysis of English text, particularly movie reviews.

These results illustrate the potential of fine-tuning pre-trained models on specific subsets of data to enhance their applicability to specialized tasks.

Model description

This model is a fine-tuned version of distilbert-base-uncased, tailored specifically for sentiment analysis. DistilBERT, a distilled version of the more complex BERT model, offers a good balance between performance and resource efficiency, making it ideal for environments where computational resources are limited.

Intended uses & limitations

This model is intended for use in NLP applications where sentiment analysis of English movie reviews is required. It can be easily integrated into applications for analyzing customer feedback, conducting market research, or enhancing user experience by understanding sentiments expressed in text.

The current model is specifically tuned for sentiments in movie reviews and may not perform as well when used on texts from other domains. Additionally, the model's performance might vary depending on the nature of the text, such as informal language or idioms that were not prevalent in the training data.

Training and evaluation data

The model was fine-tuned using the IMDb movie reviews dataset available through HuggingFace's datasets library. This dataset comprises 50,000 highly polar movie reviews split evenly into training and test sets, providing rich text data for training sentiment analysis models. For the purpose of fine-tuning, only 10% of the training set was used to expedite the training process while maintaining a representative sample of the data.
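
One way such a 10% subset can be drawn with the datasets library is sketched below; the shuffling seed and the use of a matching slice of the test split for evaluation are assumptions.

```python
from datasets import load_dataset

# Full IMDb dataset: 25,000 training and 25,000 test reviews.
imdb = load_dataset("imdb")

# Take a 10% slice (2,500 examples) of each split. Shuffling first keeps the
# positive and negative classes mixed, since the raw splits store each label
# in one contiguous block.
small_train = imdb["train"].shuffle(seed=42).select(range(2500))
small_eval = imdb["test"].shuffle(seed=42).select(range(2500))
```

A 2,500-example training subset is consistent with the 313 optimization steps per epoch reported in the training results below (2,500 examples / batch size 8 ≈ 313 steps).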

Training procedure

The fine-tuning was performed on Google Colab, using the pre-trained DistilBERT model loaded from HuggingFace's transformers library. The model was fine-tuned for 3 epochs with a batch size of 8 and a learning rate of 5e-5. Inputs were tokenized with DistilBERT's default tokenizer so that pre-processing matched the base model's pre-training setup.
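
Continuing the subset sketch above, pre-processing with DistilBERT's default tokenizer might look as follows; leaving padding to the data collator (dynamic padding) is an assumption, since the card only states that the default tokenizer was used.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate to DistilBERT's 512-token limit; padding is handled later by the
    # data collator, so each batch is only padded to its longest sequence.
    return tokenizer(batch["text"], truncation=True)

# small_train / small_eval come from the dataset sketch in the previous section.
tokenized_train = small_train.map(tokenize, batched=True)
tokenized_eval = small_eval.map(tokenize, batched=True)
```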

Training hyperparameters

The following hyperparameters were used during training (a Trainer configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
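
A Trainer configuration mirroring these hyperparameters might look like the sketch below. The output directory and per-epoch evaluation strategy are assumptions, and the Adam settings listed above correspond to the Trainer's default AdamW optimizer, so they need no explicit configuration.

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir="distilbert-imdb-sentiment",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    evaluation_strategy="epoch",  # assumed; matches the per-epoch validation losses below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # from the tokenization sketch above
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,            # enables dynamic padding via the default collator
)
trainer.train()
```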

Training results

| Training Loss | Epoch | Step | Validation Loss |
|--------------:|------:|-----:|----------------:|
| 0.0003        | 1.0   | 313  | 0.0002          |
| 0.0000        | 2.0   | 626  | 0.0000          |
| 0.0000        | 3.0   | 939  | 0.0000          |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1