---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
metrics:
- f1
datasets:
- stanfordnlp/imdb
language:
- en
library_name: transformers
model-index:
- name: movie-review-classifier
  results:
  - task:
      type: text-classification
    dataset:
      type: stanfordnlp/imdb
      name: IMDB Movie Reviews
    metrics:
    - type: f1
      value: 0.9327
---

# movie-review-classifier

This model classifies the text of a movie review as either 1 (*i.e.,* thumbs-up) or 0 (*i.e.,* thumbs-down).

## Model description

This model is a version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) fine-tuned on the [IMDB movie-review dataset](https://huggingface.co/datasets/stanfordnlp/imdb). It achieves the following results on the evaluation set:

- Loss: 0.2743
- F1: 0.9327

## Intended uses & limitations

This model was trained as part of a data science bootcamp project. It is intended for use by students and hobbyists.

## Training and evaluation data

This model was trained on the [IMDB movie-review dataset](https://huggingface.co/datasets/stanfordnlp/imdb), a set of highly polarized (*i.e.,* clearly positive or negative) movie reviews. The dataset contains 25k labelled training samples, 25k labelled test samples, and 50k unlabelled samples.

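
The split sizes quoted above can be checked against the copy of the dataset on the Hub. The following is a minimal sketch, not the code used for training: it assumes the `datasets` library is installed and that the splits are named `train`, `test`, and `unsupervised`, as the dataset exposes them at the time of writing. The helper names are hypothetical.

```python
# Split sizes as stated in this card (25k train, 25k test, 50k unlabelled).
# The split names are an assumption based on the dataset's Hub layout.
EXPECTED_SPLIT_SIZES = {"train": 25_000, "test": 25_000, "unsupervised": 50_000}


def split_size_mismatches(actual_sizes, expected_sizes=EXPECTED_SPLIT_SIZES):
    """Return {split: (expected, actual)} for every split whose size differs.

    A split that is missing from `actual_sizes` is reported with actual=None.
    """
    return {
        name: (expected, actual_sizes.get(name))
        for name, expected in expected_sizes.items()
        if actual_sizes.get(name) != expected
    }


def load_imdb_split_sizes():
    """Download the dataset from the Hub and return {split_name: row_count}."""
    # Imported lazily so the pure helper above works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("stanfordnlp/imdb")
    return {name: split.num_rows for name, split in dataset.items()}
```

Calling `split_size_mismatches(load_imdb_split_sizes())` requires network access and should return an empty dict if the Hub copy matches the sizes listed in this card.
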
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- weight_decay: 0.1

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 0.2258        | 1.0   | 1563 | 0.2161          | 0.9122 |
| 0.1486        | 2.0   | 3126 | 0.2291          | 0.9306 |
| 0.0916        | 3.0   | 4689 | 0.2743          | 0.9327 |

### Framework versions

- Transformers 4.42.4
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
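
As a sanity check, the step counts in the training results table can be reproduced from the hyperparameters. The sketch below assumes the full 25k labelled training split was used, on a single device, without gradient accumulation, and with the final partial batch kept. None of these is stated in this card, so treat the match as consistent with those assumptions rather than as proof of them.

```python
import math

TRAIN_SAMPLES = 25_000  # labelled training samples, per this card
TRAIN_BATCH_SIZE = 16   # train_batch_size, per this card
NUM_EPOCHS = 3          # num_epochs, per this card


def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps in one epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


def cumulative_steps(num_samples, batch_size, num_epochs):
    """Total steps completed at the end of each epoch, starting from epoch 1."""
    per_epoch = steps_per_epoch(num_samples, batch_size)
    return [per_epoch * epoch for epoch in range(1, num_epochs + 1)]


print(cumulative_steps(TRAIN_SAMPLES, TRAIN_BATCH_SIZE, NUM_EPOCHS))
# [1563, 3126, 4689]
```

25,000 / 16 = 1562.5, which rounds up to 1563 steps per epoch, matching the Step column at epochs 1, 2, and 3.
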