---
language: 
  - tw
tags:
- albert
- classification
license: afl-3.0
metrics:
- Accuracy
---

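# Traditional Chinese Sentiment Classification: Negative (0), Positive (1)

Fine-tuned from the ckiplab/albert pretrained model. The training set has only 80,000 examples; the model is intended as an example for a course.

# Usage example: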

    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained("clhuang/albert-sentiment")
    model = AutoModelForSequenceClassification.from_pretrained("clhuang/albert-sentiment")
    
    ## Prediction
    target_names=['Negative','Positive']
    max_length = 200 # maximum input length; if this exceeds the length the model was trained with, the model's own maximum applies
    def get_sentiment_proba(text):
        # Tokenize the text, truncating it to at most max_length tokens
        inputs = tokenizer(text, padding=True, truncation=True, max_length=max_length, return_tensors="pt")
        # Run the model on the tokenized input
        outputs = model(**inputs)
        # outputs[0] holds the logits; softmax over dimension 1 (the classes) gives probabilities
        probs = outputs[0].softmax(1)
    
        response = {'Negative': round(float(probs[0, 0]), 2), 'Positive': round(float(probs[0, 1]), 2)}
        # To return only the predicted class index instead, use: return probs.argmax()
        return response
    
    print(get_sentiment_proba('我喜歡這本書'))      # "I like this book"
    print(get_sentiment_proba('不喜歡這款產品'))    # "I don't like this product"
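
The commented-out `probs.argmax()` line suggests returning a single label rather than both probabilities. Below is a minimal sketch of that step in plain Python, so it runs without downloading the model. The logit values and the helper names `softmax` and `label_from_logits` are illustrative assumptions, not part of this model card or of the `transformers` API.

```python
import math

# Same label order as in the usage example: index 0 = Negative, index 1 = Positive
target_names = ['Negative', 'Positive']

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_from_logits(logits):
    """Hypothetical helper: return (label, probability) for the top class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return target_names[best], round(probs[best], 2)

# Made-up logits for illustration only; real values come from model(**inputs)
print(label_from_logits([-1.2, 2.3]))
```

With the real model, the same idea would presumably apply to `outputs[0][0].tolist()` from the usage example.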