LiYuan committed on
Commit
68bac3a
1 Parent(s): 217a50a
Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -8,6 +8,16 @@ This new Sentence-BERT model is modified on the BERT model by adding a pooling o
  
  ![1.png](1.png)
  
+ # Download and Use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("LiYuan/Amazon-Cup-Cross-Encoder-Regression")
+
+ model = AutoModelForSequenceClassification.from_pretrained("LiYuan/Amazon-Cup-Cross-Encoder-Regression")
+ ```
+
  As the figure above shows, a pooling layer is added on top of each BERT model to obtain the sentence embeddings $u$ and $v$. The cosine similarity between $u$ and $v$ is then computed and compared with the true score so that the embeddings become semantically meaningful, and the mean squared error loss, which is the objective function, is backpropagated through the BERT network to update the weights.
  
  In our Amazon case, the query is sentence A and the concatenated product attributes are sentence B. We used a stratified split to divide the merged set into **571,223** rows for training, **500** rows for validation, and **3,000** rows for testing. We limited the output score to the range 0 to 1. The following scores represent the degree of relevance between the query and the product attributes, following the Amazon KDD Cup website; however, they can be adjusted to improve model performance.
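
The objective described in the README text (cosine similarity between $u$ and $v$, compared with the true score under a mean squared error loss) can be illustrated with a minimal sketch. This is not the training code from the repository: the two-dimensional vectors, the labels, and the helper names `cosine_similarity` and `mse_loss` are assumptions made purely for illustration, and real embeddings would come from the pooled BERT outputs.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mse_loss(predictions, targets):
    """Mean squared error between predicted and true scores."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical pooled embeddings (u for the query, v for the product attributes).
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
    ([1.0, 1.0], [1.0, 0.0]),  # 45 degrees apart
]
# Hypothetical relevance labels in the range 0 to 1.
labels = [1.0, 0.0, 0.5]

scores = [cosine_similarity(u, v) for u, v in pairs]
loss = mse_loss(scores, labels)
print(scores, loss)
```

In actual training, this loss would be computed on batches of embeddings and its gradient backpropagated through the shared BERT weights; the sketch only shows how the scalar objective is formed.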