amir7d0 committed on
Commit 8154c76
1 Parent(s): 6fc51b4

Update README.md

Files changed (1)
  1. README.md +38 -45
README.md CHANGED
@@ -10,16 +10,16 @@ model-index:
  type: text-classification
  name: Text Classification
  dataset:
+ type: amazon-reviews-multi
  name: amazon_reviews_multi
- type: amazon_reviews_multi22
  split: test
  metrics:
  - type: accuracy
- value: .85
+ value: .80
  name: Accuracy

  - type: loss
- value: 0.1
+ value: 0.5
  name: loss

  tags:

@@ -38,7 +38,7 @@ pipeline_tag: text-classification
  - [Table of Contents](#table-of-contents)
  - [Model Details](#model-details)
  - [Uses](#uses)
- - [Training Details](#training-details)
+ - [Fine-tuning hyperparameters](#training-details)
  - [Evaluation](#evaluation)
  - [Framework versions](#framework-versions)

@@ -61,66 +61,59 @@ This model reaches an accuracy of xxx on the dev set.

  # Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+ You can use this model directly with a pipeline for text classification.

- ## Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
- <!-- If the user enters content, print that. If not, but they enter a task in the list, use that. If neither, say "more info needed." -->
  ```
- from transformers import DistilBertTokenizer, TFDistilBertModel
+ from transformers import pipeline

  checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
- tokenizer = DistilBertTokenizer.from_pretrained(checkpoint)
- model = TFDistilBertModel.from_pretrained(checkpoint)
- text = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
- encoded_input = tokenizer(text, return_tensors="tf")
- output = model(encoded_input)
+ classifier = pipeline("text-classification", model=checkpoint)
+ classifier(["Replace me by any text you'd like."])
+ ```
+ and in TensorFlow:
+ ```
+ from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

+ checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+ model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

+ text = "Replace me by any text you'd like."
+ encoded_input = tokenizer(text, return_tensors='tf')
+ output = model(encoded_input)
  ```


-
  # Training Details

- ## Training Data
-
- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- train data [amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)
-
-
- # Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ## Testing Data, Factors & Metrics
-
- ### Testing Data
-
- <!-- This should link to a Data Card if possible. -->
-
- [amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)
-
+ ## Training and Evaluation Data

- ### Factors
+ Here is the raw dataset ([amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi)) we used for finetuning the model.
+ The dataset contains 200,000, 5,000, and 5,000 reviews in the training, dev, and test sets respectively.

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+ ## Fine-tuning hyperparameters

- acc
- f1
- precision
+ The following hyperparameters were used during training:

- ### Metrics
+ + learning_rate: 2e-05
+ + train_batch_size: 16
+ + eval_batch_size: 16
+ + seed: 42
+ + optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ + num_epochs: 5

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->

- metric1
+ ### Training results

- ## Results
+ | Epoch | Training Loss | Validation Loss | Accuracy |
+ |:-----:|:-------------:|:---------------:|:--------:|
+ | 1 | 123 | 123 | 123 |
+ | 2 | 123 | 123 | 123 |
+ | 3 | 231 | 123 | 123 |
+ | 4 | 123 | 123 | 123 |
+ | 5 | 123 | 123 | 123 |

- result1
+ ## Results


  # Framework versions
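The updated card trains and evaluates on [amazon_reviews_multi](https://huggingface.co/datasets/amazon_reviews_multi), with 200,000/5,000/5,000 reviews in the train/dev/test splits. A minimal sketch of pulling those splits with the `datasets` library and tokenizing them for this checkpoint; the `en` configuration, the `review_body`/`stars` field names, and the label shift are assumptions on top of what the card states:

```
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumption: the English configuration was used; the card does not name one.
raw = load_dataset("amazon_reviews_multi", "en")

checkpoint = "amir7d0/distilbert-base-uncased-finetuned-amazon-reviews"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def preprocess(batch):
    # Tokenize the review text; stars are 1-5, so shift them to 0-4 class ids.
    enc = tokenizer(batch["review_body"], truncation=True)
    enc["label"] = [s - 1 for s in batch["stars"]]
    return enc

tokenized = raw.map(preprocess, batched=True)
print({split: tokenized[split].num_rows for split in tokenized})  # 200000 / 5000 / 5000
```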
 
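The fine-tuning hyperparameters added in the diff (learning_rate 2e-05, train/eval batch size 16, seed 42, Adam with betas=(0.9,0.999) and epsilon=1e-08, 5 epochs) map onto a standard Keras loop. A rough sketch under those settings, reusing the `tokenized` splits from the snippet above; the base checkpoint, `num_labels=5`, and the data-collator details are assumptions rather than part of the commit:

```
import tensorflow as tf
from transformers import (AutoTokenizer, DataCollatorWithPadding,
                          TFAutoModelForSequenceClassification, set_seed)

set_seed(42)  # seed: 42

base = "distilbert-base-uncased"  # assumed starting checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = TFAutoModelForSequenceClassification.from_pretrained(base, num_labels=5)

# Adam with the listed learning rate, betas, and epsilon.
optimizer = tf.keras.optimizers.Adam(learning_rate=2e-5,
                                     beta_1=0.9, beta_2=0.999, epsilon=1e-8)
model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")
train_set = tokenized["train"].to_tf_dataset(
    columns=["input_ids", "attention_mask"], label_cols=["label"],
    shuffle=True, batch_size=16, collate_fn=collator)    # train_batch_size: 16
dev_set = tokenized["validation"].to_tf_dataset(
    columns=["input_ids", "attention_mask"], label_cols=["label"],
    shuffle=False, batch_size=16, collate_fn=collator)   # eval_batch_size: 16

model.fit(train_set, validation_data=dev_set, epochs=5)  # num_epochs: 5
```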
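The card's metadata now reports an accuracy of .80 on the test split. One way to sanity-check a number like that, continuing from the snippets above (the `evaluate` library and the prediction post-processing are assumptions, not part of the commit):

```
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

test_set = tokenized["test"].to_tf_dataset(
    columns=["input_ids", "attention_mask"], label_cols=["label"],
    shuffle=False, batch_size=16, collate_fn=collator)

logits = model.predict(test_set)["logits"]  # raw scores for the 5 star classes
preds = np.argmax(logits, axis=-1)

print(accuracy.compute(predictions=preds, references=tokenized["test"]["label"]))
```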