### How to use
#### Requirements

Using this model requires the `transformers` and `sentencepiece` packages, both of which can be installed with `pip`:

```sh
pip install transformers sentencepiece
```

#### Pipelines 🚀

If you are not familiar with Transformers, you can use a pipeline instead.

Note that pipelines cannot return _no answer_ for a question.

```python
from transformers import pipeline

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
qa_pipeline = pipeline("question-answering", model=model_name, tokenizer=model_name)

# "Hi, I'm Sajjad Ayoubi. I'm 20 years old and interested in natural language processing."
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
# "What's my name?", "How old am I?", "What am I interested in?"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

for question in questions:
    print(qa_pipeline({"context": text, "question": question}))

>>> {'score': 0.4839823544025421, 'start': 8, 'end': 18, 'answer': 'سجاد ایوبی'}
>>> {'score': 0.3747948706150055, 'start': 24, 'end': 32, 'answer': '۲۰ سالمه'}
>>> {'score': 0.5945395827293396, 'start': 38, 'end': 55, 'answer': 'پردازش زبان طبیعی'}
```

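As an aside, the `start` and `end` fields in each pipeline result are character offsets into the context string, so the answer can also be recovered by slicing. A minimal sketch, reusing the first result from the example above:

```python
# The context and the first result from the pipeline example above.
context = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
result = {"score": 0.4839823544025421, "start": 8, "end": 18, "answer": "سجاد ایوبی"}

# `start` and `end` are character offsets into the context, so slicing
# the context with them reproduces the answer text.
span = context[result["start"]:result["end"]]
print(span)  # سجاد ایوبی
```
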
#### Manual approach 🔥

With the manual approach, it is possible to return _no answer_, with even better performance.

- PyTorch

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from src.utils import AnswerPredictor

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# this class is defined in src/utils.py, where you can read more about it
predictor = AnswerPredictor(model, tokenizer, device="cpu", n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)

for k, v in preds.items():
    print(v)
```

This produces output such as the following:

```text
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
```

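How a predictor decides on _no answer_ deserves a note. In SQuAD v2-style extractive QA, the usual approach is to compare the best answer-span score against the score of the empty span at the `[CLS]` position. The actual logic lives in `src/utils.py`; the sketch below only illustrates that general thresholding idea, with invented logits:

```python
# Toy start/end logits for a 6-token input; index 0 plays the role of [CLS].
# These numbers are invented for illustration, not real model output.
start_logits = [1.5, 0.1, 5.0, 0.3, 0.2, 0.1]
end_logits = [0.5, 0.2, 0.4, 6.0, 0.3, 0.1]

# Score of predicting "no answer": start and end both at the [CLS] position.
null_score = start_logits[0] + end_logits[0]

# Best non-null span (start <= end, skipping position 0).
best_span, best_score = None, float("-inf")
for s in range(1, len(start_logits)):
    for e in range(s, len(end_logits)):
        score = start_logits[s] + end_logits[e]
        if score > best_score:
            best_span, best_score = (s, e), score

# Return an answer only if the best span beats the null score by a margin.
threshold = 0.0
answer = best_span if best_score - null_score > threshold else None
print(answer, best_score, null_score)
```

Raising `threshold` makes the model more conservative, i.e. it answers _no answer_ more often.
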
- TensorFlow 2.X

```python
from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering
from src.utils import TFAnswerPredictor

model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# this class is defined in src/utils.py, where you can read more about it
predictor = TFAnswerPredictor(model, tokenizer, n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)

for k, v in preds.items():
    print(v)
```

This produces output such as the following:

```text
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
```

Alternatively, you can walk through the whole demonstration in the [HowToUse iPython Notebook on Google Colab](https://colab.research.google.com/github/sajjjadayobi/PersianQA/blob/main/notebooks/HowToUse.ipynb).