Update README.md
README.md (changed)

@@ -10,16 +10,16 @@ model-index:
       type: text-classification
       name: Text Classification
     dataset:
+      type: amazon-reviews-multi
       name: amazon_reviews_multi
-      type: amazon_reviews_multi22
       split: test
     metrics:
     - type: accuracy
-      value: .
+      value: .80
       name: Accuracy

     - type: loss
-      value: 0.
+      value: 0.5
       name: loss

 tags:
@@ -38,7 +38,7 @@ pipeline_tag: text-classification
 - [Table of Contents](#table-of-contents)
 - [Model Details](#model-details)
 - [Uses](#uses)
-- [
+- [Fine-tuning hyperparameters](#training-details)
 - [Evaluation](#evaluation)
 - [Framework versions](#framework-versions)

@@ -61,66 +61,59 @@ This model reaches an accuracy of xxx on the dev set.

 # Uses

-
+You can use this model directly with a pipeline for text classification.

-## Direct Use
-
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-<!-- If the user enters content, print that. If not, but they enter a task in the list, use that. If neither, say "more info needed." -->
 ```
-from transformers import
+from transformers import pipeline

 checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
-
-
-
-
-
+classifier = pipeline("text-classification", model=checkpoint)
+classifier(["Replace me by any text you'd like."])
+```
+and in TensorFlow:
+```
+from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

+checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
+tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

+text = "Replace me by any text you'd like."
+encoded_input = tokenizer(text, return_tensors='tf')
+output = model(encoded_input)
 ```


-
 # Training Details

-## Training Data
-
-<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
-train data [amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)
-
-
-# Evaluation
-
-<!-- This section describes the evaluation protocols and provides the results. -->
-
-## Testing Data, Factors & Metrics
-
-### Testing Data
-
-<!-- This should link to a Data Card if possible. -->
-
-[amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)
-
+## Training and Evaluation Data

-
+Here is the raw dataset ([amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)) we used for finetuning the model.
+The dataset contains 200,000, 5,000, and 5,000 reviews in the training, dev, and test sets respectively.

-
+## Fine-tuning hyperparameters

-
-f1
-precision
+The following hyperparameters were used during training:

-
+- learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- num_epochs: 5

-<!-- These are the evaluation metrics being used, ideally with a description of why. -->

-
+### Training results

-
+| Epoch | Training Loss | Validation Loss | Accuracy |
+|:-----:|:-------------:|:---------------:|:--------:|
+| 1 | 123 | 123 | 123 |
+| 2 | 123 | 123 | 123 |
+| 3 | 231 | 123 | 123 |
+| 4 | 123 | 123 | 123 |
+| 5 | 123 | 123 | 123 |

-
+## Results


 # Framework versions
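As a side note on the "Fine-tuning hyperparameters" list added above: the values map directly onto the standard Hugging Face `TrainingArguments`/`Trainer` API. The sketch below is a hypothetical reconstruction of such a run, not code from this commit; the dataset config (`"en"`), the 5-class star-rating label mapping, the preprocessing, and the `output_dir` are all assumptions.

```
# Hypothetical fine-tuning sketch matching the hyperparameters listed in the diff.
# Assumptions (not from the commit): the "en" config of amazon_reviews_multi,
# 5 star-rating classes, and this particular preprocessing and output_dir.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("amazon_reviews_multi", "en")  # 200k train / 5k dev / 5k test
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def preprocess(batch):
    # Tokenize the review text and shift 1-5 star ratings to 0-4 class labels.
    enc = tokenizer(batch["review_body"], truncation=True)
    enc["labels"] = [stars - 1 for stars in batch["stars"]]
    return enc

tokenized = raw.map(preprocess, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=5
)

# Hyperparameters from the model card: lr 2e-05, batch size 16, seed 42,
# Adam betas (0.9, 0.999), epsilon 1e-08, 5 epochs.
args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-amazon-reviews",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    num_train_epochs=5,
    evaluation_strategy="epoch",  # evaluate once per epoch, as in the results table
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```

The `adam_beta*` and `adam_epsilon` values are already the `TrainingArguments` defaults; they are spelled out here only to mirror the list in the diff.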