Commit 61f41b6 (1 parent: e8b94f5)
Author: Mizuiro-sakura
Update README.md

README.md CHANGED
@@ -6,6 +6,8 @@ license: mit
@@ -19,6 +21,35 @@ LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-train
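Text by Natsume Soseki (Kokoro, Botchan, Sanshiro, etc.) was labeled positive or negative using the Japanese Sentiment Polarity Dictionary
( http://www.cl.ecei.tohoku.ac.jp/Open_Resources-Japanese_Sentiment_Polarity_Dictionary.html )
and the labeled text was used as supervised data to train the model.
+The model achieves high accuracy on relatively long texts (30 words or more). (Accuracy has been confirmed to be low on short inputs such as single words.)
+Given the training data used, it is also expected to be more accurate on literary (written-style) Japanese than on colloquial Japanese.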
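
The dictionary-based labeling described above could be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: the `POLARITY` entries, the function names, the substring matching, and the convention that label 1 means positive are all assumptions made for the example, and the real dictionary file is not parsed here.

```python
# Hypothetical polarity lexicon: +1 for positive words, -1 for negative words.
# These entries are illustrative only and are not taken from the actual dictionary.
POLARITY = {
    "嬉しい": 1,   # happy
    "美しい": 1,   # beautiful
    "悲しい": -1,  # sad
    "苦しい": -1,  # painful
}


def label_sentence(sentence, lexicon=POLARITY):
    """Return 1 (positive), 0 (negative), or None when the score is zero."""
    # Naive substring counting (an assumption); a real pipeline would
    # tokenize Japanese text with a morphological analyzer first.
    score = sum(weight * sentence.count(word) for word, weight in lexicon.items())
    if score > 0:
        return 1
    if score < 0:
        return 0
    return None  # neutral or no lexicon match: leave unlabeled


def build_training_data(sentences, lexicon=POLARITY):
    """Pair each labelable sentence with its dictionary-derived label."""
    labeled = []
    for sentence in sentences:
        label = label_sentence(sentence, lexicon)
        if label is not None:
            labeled.append((sentence, label))
    return labeled
```

Sentences whose score is zero are dropped rather than forced into a class, which is one plausible way to keep noisy labels out of the supervised data.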

# This model is based on Luke-japanese-base-lite
This model is a fine-tuned version of studio-ousia/Luke-japanese-base-lite.
LUKE achieves state-of-the-art results on five popular NLP benchmarks including SQuAD v1.1 (extractive question answering), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), TACRED (relation classification), and Open Entity (entity typing).

# How to use
+-------------------------------------------------------------
+
+import torch
+from torch import nn
+from transformers import MLukeTokenizer
+
+tokenizer = MLukeTokenizer.from_pretrained('studio-ousia/luke-japanese-base-lite')
+# Load the fine-tuned model saved as a whole object; replace the bracketed placeholder with your directory.
+# Note: on PyTorch 2.6 or later, torch.load may need weights_only=False for a pickled model object.
+model = torch.load('C:\\[directory containing My_luke_model_pn.pth]\\My_luke_model_pn.pth')
+model.eval()  # inference mode (disables dropout)
+
+text = input()
+
+encoded_dict = tokenizer.encode_plus(
+    text,
+    return_attention_mask=True,  # create the attention mask
+    return_tensors='pt',         # return PyTorch tensors
+)
+
+with torch.no_grad():  # gradients are not needed for inference
+    pre = model(encoded_dict['input_ids'], token_type_ids=None, attention_mask=encoded_dict['attention_mask'])
+softmax = nn.Softmax(dim=0)
+num = softmax(pre.logits[0])  # class probabilities for the single input; index 1 is treated as positive
+print(str(num[1]))
+if num[1] > 0.5:
+    print('ポジティブ')  # positive
+else:
+    print('ネガティブ')  # negative
+
+
+-------------------------------------------------------------
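
The decision step at the end of the snippet above (softmax over the two logits, then a 0.5 threshold on index 1) can be tried in isolation without loading the model. The sketch below uses only the standard library; the function names and the `[negative, positive]` logit order are assumptions inferred from the snippet, not part of the published model's API.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Map two-class logits [negative, positive] to (label, positive probability)."""
    p_positive = softmax(logits)[1]
    # Mirrors the snippet's rule: strictly above the threshold is positive,
    # anything else (including an exact tie) is negative.
    label = "positive" if p_positive > threshold else "negative"
    return label, p_positive
```

With only two classes, the positive probability exceeds 0.5 exactly when the positive logit is larger than the negative one, so the threshold is equivalent to picking the larger logit.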

# Citation
[1]@inproceedings{yamada2020luke,