Model Card for distilbert-base-uncased-finetuned-amazon-reviews

Model Details

Model Description

This model is a fine-tuned version of distilbert-base-uncased on the amazon_reviews_multi dataset. It predicts a review's star rating and reaches an exact-match accuracy of 56.96% on the dev set (see the Accuracy section).

  • Model type: Language model
  • Language(s) (NLP): en
  • License: apache-2.0
  • Parent Model: distilbert-base-uncased. For more details about DistilBERT, see that model's card.

Uses

You can use this model directly with a pipeline for text classification.

from transformers import pipeline

checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
classifier = pipeline("text-classification", model=checkpoint)
classifier(["Replace me by any text you'd like."])

To load the tokenizer and model directly in TensorFlow:

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

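text = "Replace me by any text you'd like."
# Tokenize to TensorFlow tensors, then run a forward pass.
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
# output.logits holds one raw score per class for each input text.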
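
The `output.logits` tensor holds one raw score per star class. A minimal sketch of turning such scores into a star rating follows. It assumes that class index i corresponds to i + 1 stars, which this card does not state, so check `model.config.id2label` before relying on it. The helper name `logits_to_stars` and the example logits are hypothetical.

```python
import math

def logits_to_stars(logits):
    """Map rows of raw class scores to (stars, probability) pairs.

    Assumption: class index i corresponds to i + 1 stars.
    """
    results = []
    for row in logits:
        # Numerically stable softmax: subtract the row maximum first.
        top = max(row)
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Pick the class with the highest probability.
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best + 1, probs[best]))
    return results

# Made-up logits for two reviews, not real model output.
example_logits = [[-1.2, -0.3, 0.4, 2.1, 1.0], [3.0, 0.5, -0.7, -1.5, -2.0]]
predictions = logits_to_stars(example_logits)
```

With the TensorFlow snippet above, `logits_to_stars(output.logits.numpy().tolist())` would give the same kind of result for real model output.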

Training Details

Training and Evaluation Data

The model was fine-tuned on the raw amazon_reviews_multi dataset. The training, dev, and test sets contain 200,000, 5,000, and 5,000 reviews respectively.

Fine-tuning hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
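
To make these settings concrete, here is a rough sketch of the step count and the linearly decaying learning rate they imply. It assumes that all 200,000 training reviews are used in every epoch, that there is no warmup, and that the rate decays to zero at the last step. The card states none of these, and the helper name `linear_lr` is hypothetical.

```python
import math

TRAIN_EXAMPLES = 200_000   # training-set size reported in this card
BATCH_SIZE = 16            # train_batch_size
NUM_EPOCHS = 5             # num_epochs
INITIAL_LR = 2e-5          # learning_rate

# Number of optimizer steps per epoch and over the whole run.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS

def linear_lr(step):
    """Learning rate at `step` under linear decay to zero (assumed, no warmup)."""
    remaining = max(total_steps - step, 0)
    return INITIAL_LR * remaining / total_steps

# Learning rate halfway through training under these assumptions.
halfway_lr = linear_lr(total_steps // 2)
```

Under these assumptions the run takes 12,500 steps per epoch and 62,500 steps in total, and the learning rate halfway through is about 1e-05.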

Accuracy

The fine-tuned model was evaluated on the dev and test sets of amazon_reviews_multi.

  • Accuracy (exact) is the percentage of reviews where the predicted number of stars exactly matches the number given by the human reviewer.
  • Accuracy (off-by-1) is the percentage of reviews where the predicted number of stars differs by at most 1 from the number given by the human reviewer.

Split      Accuracy (exact)   Accuracy (off-by-1)
Dev set    56.96%             85.50%
Test set   57.36%             85.58%
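
Both metrics are simple to compute from predicted and true star ratings. The following is a minimal sketch, not the evaluation script used for this card; the function names and the toy ratings are illustrative.

```python
def exact_accuracy(predicted, actual):
    """Fraction of reviews where the predicted stars equal the true stars."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

def off_by_one_accuracy(predicted, actual):
    """Fraction of reviews where the prediction is within one star of the truth."""
    close = sum(abs(p - a) <= 1 for p, a in zip(predicted, actual))
    return close / len(actual)

# Toy ratings for illustration only, not taken from the dataset.
true_stars = [5, 4, 1, 3, 2]
pred_stars = [5, 3, 3, 3, 1]

exact = exact_accuracy(pred_stars, true_stars)
off_by_one = off_by_one_accuracy(pred_stars, true_stars)
```

Off-by-1 accuracy is always at least as high as exact accuracy, because every exact match also counts as being within one star.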

Framework versions

  • Transformers 4.26.1
  • TensorFlow 2.11.0
  • Datasets 2.1.0
  • Tokenizers 0.13.2