Commit 5621bf6 by wonrax (1 parent: 82cd347)

Update README.md

Files changed (1): README.md (+29 -1)
@@ -11,7 +11,35 @@ widget:
  - text: "Cái này giá ổn không nhỉ?"

  ---
+
+ A model fine-tuned for sentiment analysis based on [vinai/phobert-base](https://huggingface.co/vinai/phobert-base).
+
+ Labels:
+ - NEG: Negative
+ - POS: Positive
+ - NEU: Neutral
+
  Dataset: [30K e-commerce reviews](https://www.kaggle.com/datasets/linhlpv/vietnamese-sentiment-analyst)

- I'll add some more info soon
+ ## Usage
+ ```python
+ import torch
+ from transformers import RobertaForSequenceClassification, AutoTokenizer
+
+ model = RobertaForSequenceClassification.from_pretrained("wonrax/phobert-base-vietnamese-sentiment")
+
+ tokenizer = AutoTokenizer.from_pretrained("wonrax/phobert-base-vietnamese-sentiment", use_fast=False)
+
+ # Just like PhoBERT: INPUT TEXT MUST BE ALREADY WORD-SEGMENTED!
+ sentence = 'Đây là mô_hình rất hay , phù_hợp với điều_kiện và như cầu của nhiều người .'
+
+ input_ids = torch.tensor([tokenizer.encode(sentence)])

+ with torch.no_grad():
+     out = model(input_ids)
+     print(out.logits.softmax(dim=-1).tolist())
+     # Output:
+     # [[0.002, 0.988, 0.01]]
+     #     ^      ^      ^
+     #    NEG    POS    NEU
+ ```
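
The usage snippet in this diff prints raw probabilities and leaves the label lookup to the reader. The following is a minimal sketch, not part of the commit, of how that list could be turned into a label. It assumes the index order NEG, POS, NEU shown in the output comment above; `LABELS`, `softmax`, and `top_label` are hypothetical helper names, and the probabilities are hard-coded from the README so the sketch runs without downloading the model.

```python
import math

# Assumed index order, copied from the README's output annotation.
LABELS = ["NEG", "POS", "NEU"]


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(probs, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Probabilities taken from the README's example output.
probs = [0.002, 0.988, 0.01]
label, score = top_label(probs)
print(label, score)  # POS 0.988
```

With the real model, `out.logits.softmax(dim=-1).tolist()[0]` from the snippet above would take the place of `probs`. Checking the order against the model's own config before relying on it is advisable, since the hard-coded list here is only an assumption.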