hiroki-rad committed
Commit 0ec8d85
1 Parent(s): bdf83d5

update readme

Files changed (1)
README.md +70 -9
README.md CHANGED
@@ -1,6 +1,16 @@
  ---
  library_name: transformers
- tags: []
  ---
 
  # Model Card for Model ID
@@ -17,13 +27,13 @@ tags: []
 
  This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
- - **Developed by:** [More Information Needed]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
  - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
 
  ### Model Sources [optional]
@@ -39,9 +49,60 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
  ### Direct Use
 
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
 
  ### Downstream Use [optional]
@@ -77,7 +138,7 @@ Use the code below to get started with the model.
 
  ### Training Data
 
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
  [More Information Needed]
 
  ---
  library_name: transformers
+ tags:
+ - code
+ datasets:
+ - elyza/ELYZA-tasks-100
+ language:
+ - ja
+ metrics:
+ - accuracy
+ base_model:
+ - tohoku-nlp/bert-base-japanese-v3
+ pipeline_tag: text-classification
  ---
 
  # Model Card for Model ID
 
 
  This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
+ - **Developed by:** [Hiroki Yanagisawa]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [BERT]
+ - **Language(s) (NLP):** [Japanese]
  - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [cl-tohoku/bert-base-japanese-v3]
 
  ### Model Sources [optional]
 
  ### Direct Use
 
+ from transformers import pipeline, AutoTokenizer, BatchEncoding
+ from tqdm import tqdm
+
+ # The model was trained with this label2id mapping; please reuse it as-is.
+ label2id = {'Task_Solution': 0,
+             'Creative_Generation': 1,
+             'Knowledge_Explanation': 2,
+             'Analytical_Reasoning': 3,
+             'Information_Extraction': 4,
+             'Step_by_Step_Calculation': 5,
+             'Role_Play_Response': 6,
+             'Opinion_Perspective': 7}
+ # Invert the fixed mapping rather than rebuilding it from dataset order,
+ # which is not guaranteed to match the training-time assignment.
+ id2label = {id: label for label, id in label2id.items()}
+
+ tokenizer = AutoTokenizer.from_pretrained("tohoku-nlp/bert-base-japanese-v3")
+
+ def preprocess_text_classification(examples: dict[str, list]) -> BatchEncoding:
+     """Adapted for batched processing."""
+     encoded_examples = tokenizer(
+         examples["questions"],  # passed as a list in batched mode
+         max_length=512,
+         padding=True,
+         truncation=True,
+         return_tensors=None  # use None when processing batches
+     )
+     # Convert the labels to IDs in batch
+     encoded_examples["labels"] = [label2id[label] for label in examples["labels"]]
+     return encoded_examples
+
+ # Evaluation dataset
+ test_data = test_data.to_pandas()
+ test_data["labels"] = test_data["labels"].apply(lambda x: label2id[x])
+
+ model_name = "hiroki-rad/bert-base-classification-ft"
+ classify_pipe = pipeline(model=model_name, device="cuda:0")
+
+ results: list[dict[str, float | str]] = []
+
+ for i, example in tqdm(enumerate(test_data.itertuples())):
+     # Get the model's prediction
+     model_prediction = classify_pipe(example.questions)[0]
+     # Convert the gold label ID back to its label name
+     true_label = id2label[example.labels]
+
+     results.append(
+         {
+             "example_id": i,
+             "pred_prob": model_prediction["score"],
+             "pred_label": model_prediction["label"],
+             "true_label": true_label,
+         }
+     )
 
  ### Downstream Use [optional]
 
 
  ### Training Data
 
+ [elyza/ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100)
 
  [More Information Needed]
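
The card's metadata lists `accuracy` as its metric, but the Direct Use loop in the diff only collects predictions into `results`. A minimal sketch of computing accuracy from a `results` list of that shape — the sample entries below are hypothetical, not actual model outputs:

```python
# Hypothetical entries in the shape produced by the evaluation loop
# (pred_label vs. true_label per example).
results = [
    {"example_id": 0, "pred_prob": 0.97,
     "pred_label": "Task_Solution", "true_label": "Task_Solution"},
    {"example_id": 1, "pred_prob": 0.81,
     "pred_label": "Role_Play_Response", "true_label": "Opinion_Perspective"},
    {"example_id": 2, "pred_prob": 0.92,
     "pred_label": "Knowledge_Explanation", "true_label": "Knowledge_Explanation"},
]

# Accuracy = fraction of examples where the predicted label matches the gold label.
accuracy = sum(r["pred_label"] == r["true_label"] for r in results) / len(results)
print(f"accuracy = {accuracy:.3f}")  # accuracy = 0.667
```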