LiYuan committed
Commit a7729c8
1 Parent(s): cca7782
Files changed (1):
  1. README.md +11 -14
README.md CHANGED
@@ -23,17 +23,18 @@ It achieves the following results on the evaluation set:

## Model description

- DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a
- self-supervised fashion, using the BERT base model as a teacher. This means it was pretrained on the raw texts only,
- with no humans labelling them in any way (which is why it can use lots of publicly available data) with an automatic
- process to generate inputs and labels from those texts using the BERT base model. We replaced its head with our shopping relevance category to fine-tune it on 571,223 rows of training set while validate it on 142,806 rows of dev set. Finally, we evaluated our model performance on a held-out test set: 79,337 rows.
+ This is a bert-base-multilingual-uncased model fine-tuned for sentiment analysis on product reviews in six languages: English, Dutch, German, French, Spanish, and Italian. It predicts the sentiment of a review as a number of stars (between 1 and 5).
+
+ This model is intended for direct use as a sentiment analysis model for product reviews in any of the six languages above, or for further fine-tuning on related sentiment analysis tasks.
+
+ We replaced its classification head and fine-tuned it on our customer reviews: a 17,280-row training set, validated on a 4,320-row dev set. Finally, we evaluated model performance on a held-out test set of 2,400 rows.

## Intended uses & limitations

- DistilBERT is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)
- to make decisions, such as sequence classification, token classification, or question answering. This fine-tuned version of DistilBERT is used to predict the relevance between one query and one product description. It also can be used to rerank the relevance order of products given one query for the amazon platform or other e-commerce platforms.
+ BERT-base is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)
+ to make decisions, such as sequence classification, token classification, or question answering. This fine-tuned version of BERT-base predicts the star rating of a review from its text.

- The limitations are this trained model is focusing on queries and products on Amazon. If you apply this model to other domains, it may perform poorly.
+ The main limitation is that this model was trained on Amazon product reviews; applied to other domains, it may perform poorly.

## How to use

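The `How to use` section is unchanged by this commit and follows the `AutoModelForSequenceClassification` loading pattern visible in the next hunk header. A minimal sketch of scoring one review with the updated model; the repo id below is a placeholder, and the mapping of label indices 0-4 to 1-5 stars is an assumption, not something the card states:

```python
# Sketch only: the model id is a placeholder; the label-to-star mapping is assumed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "LiYuan/<sentiment-model>"  # placeholder, substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

review = "Produit de très bonne qualité, livraison rapide."  # a French review
inputs = tokenizer(review, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
stars = int(logits.argmax(dim=-1)) + 1  # assumes labels 0-4 correspond to 1-5 stars
print(f"Predicted rating: {stars} stars")
```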
@@ -49,12 +50,8 @@ model = AutoModelForSequenceClassification.from_pretrained("LiYuan/amazon-query-

## Training and evaluation data

- Download all the raw [dataset](https://www.aicrowd.com/challenges/esci-challenge-for-improving-product-search/dataset_files) from the Amazon KDD Cup website.
+ Download the raw [dataset](https://www.kaggle.com/datasets/cynthiarempel/amazon-us-customer-reviews-dataset) from Kaggle.

- 1. Concatenate the all product attributes from the product dataset
- 2. Join it with a training query dataset
- 3. Stratified Split the merged data into 571,223-row training, 142,806-row validation, 79,337-row test set
- 4. Train on the full training set


## Training procedure
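The new card drops the old step-by-step preparation list, but its 17,280 / 4,320 / 2,400 row counts imply a stratified 72/18/10 split of 24,000 reviews. A sketch of one way to produce such a split, assuming a CSV export of the Kaggle dataset with a hypothetical `star_rating` column; the card itself does not spell this out:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed CSV export of the Kaggle dataset; file and column names are hypothetical.
df = pd.read_csv("amazon_us_reviews_sample.csv")  # 24,000 rows in this sketch

# 17,280 train / 4,320 dev / 2,400 test, stratified by star rating
train_df, rest_df = train_test_split(
    df, train_size=17_280, stratify=df["star_rating"], random_state=42
)
dev_df, test_df = train_test_split(
    rest_df, train_size=4_320, stratify=rest_df["star_rating"], random_state=42
)
print(len(train_df), len(dev_df), len(test_df))  # 17280 4320 2400
```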
@@ -74,8 +71,8 @@ The following hyperparameters were used during training:

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
- | 0.8981 | 1.0 | 35702 | 0.8662 | 0.6371 |
- | 0.7837 | 2.0 | 71404 | 0.8244 | 0.6617 |
+ | 0.555400 | 1.0 | 1080 | 0.520294 | 0.800000 |
+ | 0.424300 | 2.0 | 1080 | 0.549649 | 0.798380 |


### Framework versions
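The card does not show how the Accuracy column in the table above is computed; a conventional `compute_metrics` function for a `transformers` `Trainer` run would look like the sketch below, offered as an assumption rather than the card's own code:

```python
import numpy as np
import evaluate  # Hugging Face evaluate library

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    """Accuracy over argmax predictions; Trainer passes (logits, labels)."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```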
 