BHSo committed on
Commit
b018f55
1 Parent(s): 90e08f5

Create README.md

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: cc-by-sa-3.0
+ language: ja
+ tags:
+ - question-answering
+ - extractive-qa
+ pipeline_tag:
+ - None
+ datasets:
+ - SkelterLabsInc/JaQuAD
+ metrics:
+ - Exact match
+ - F1 score
+ ---
+
+ # BERT base Japanese - JaQuAD
+
+ ## Description
+
+ A Japanese question answering model fine-tuned on [JaQuAD](https://huggingface.co/datasets/SkelterLabsInc/JaQuAD).
+ Please refer to [BERT base Japanese](https://huggingface.co/cl-tohoku/bert-base-japanese) for details about the pre-trained model.
+ The fine-tuning code is available at [SkelterLabsInc/JaQuAD](https://github.com/SkelterLabsInc/JaQuAD).
+
+ ## Evaluation results
+
+ On the development set:
+
+ ```json
+ {"f1": 77.35, "exact_match": 61.01}
+ ```
+
+ On the test set:
+
+ ```json
+ {"f1": 78.92, "exact_match": 63.38}
+ ```
38
+ ## Usage
39
+
40
+ ```python
41
+ from transformers import AutoModelForQuestionAnswering, AutoTokenizer
42
+
43
+ question = 'アレクサンダー・グラハム・ベルは、どこで生まれたの?'
44
+ context = 'アレクサンダー・グラハム・ベルは、スコットランド生まれの科学者、発明家、工学者である。世界初の>実用的電話の発明で知られている。'
45
+
46
+ model = AutoModelForQuestionAnswering.from_pretrained('SkelterLabsInc/bert-base-japanese-jaquad')
47
+ tokenizer = AutoTokenizer.from_pretrained('SkelterLabsInc/bert-base-japanese-jaquad')
48
+
49
+ inputs = tokenizer(question, context, add_special_tokens=True, return_tensors="pt")
50
+ input_ids = inputs["input_ids"].tolist()[0]
51
+ outputs = model(**inputs)
52
+ answer_start_scores = outputs.start_logits
53
+ answer_end_scores = outputs.end_logits
54
+
55
+ # Get the most likely beginning of answer with the argmax of the score
56
+ answer_start = torch.argmax(answer_start_scores)
57
+ # Get the most likely end of answer with the argmax of the score
58
+ answer_end = torch.argmax(answer_end_scores) + 1
59
+
60
+ answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
61
+ # answer: 'スコットランド'
62
+ ```
+
+ ## License
+
+ The fine-tuned model is licensed under the [CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0/) license.
+
+ ## Citation
+
+ TBA
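
A note on the Usage snippet in this README: it picks the start and end positions with two independent argmax calls, so nothing prevents the end from landing before the start, in which case the slice is empty. A minimal sketch of a joint search over valid spans is given here. The names `best_span` and `max_answer_len` are hypothetical and not part of the model or of `transformers`, and the logits are assumed to be plain Python lists (for example `outputs.start_logits[0].tolist()`). For simplicity the sketch does not exclude question or special tokens from the candidate spans.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,  # hypothetical cap on span length, in tokens
) -> Tuple[int, int]:
    """Return (start, end) token indices, end inclusive, with start <= end.

    The chosen span maximises start_logits[start] + end_logits[end]
    among all spans of at most max_answer_len tokens.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Toy logits where the independent argmaxes disagree:
# argmax(start) is 1 but argmax(end) is 0, which would give an empty slice.
toy_start = [0.1, 5.0, 0.2, 0.3]
toy_end = [6.0, 0.1, 4.0, 0.2]
span = best_span(toy_start, toy_end)
```

With the returned indices, the answer tokens would be `input_ids[start:end + 1]`, mirroring the `+ 1` in the README snippet.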
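
A note on the Evaluation results in this README: the reported scores are exact match and F1. As a rough illustration of what these SQuAD-style metrics measure, the sketch below scores one prediction against one reference answer. Comparing at the character level is an assumption that suits unsegmented Japanese text; the JaQuAD evaluation script may normalize or tokenize differently, so this sketch is not guaranteed to reproduce the numbers in the README. The function names `exact_match` and `char_f1` are hypothetical.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == reference.strip())


def char_f1(prediction: str, reference: str) -> float:
    """Character-overlap F1 between a prediction and a reference answer."""
    pred_chars = list(prediction.strip())
    ref_chars = list(reference.strip())
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)


# A prediction that contains the reference plus three extra characters
# is not an exact match but still earns partial F1 credit.
em = exact_match("スコットランド生まれ", "スコットランド")
f1 = char_f1("スコットランド生まれ", "スコットランド")
```

Corpus-level scores are usually the mean of these per-example values, reported as percentages.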