pedramyazdipoor committed
Commit b7ee67c
1 Parent(s): 7730914

Upload TFBertForQuestionAnswering

Files changed (3)
  1. README.md +41 -112
  2. config.json +2 -2
  3. tf_model.h5 +3 -0
README.md CHANGED
@@ -1,117 +1,46 @@
- ## ParsBert For Question Answering Task

- ParsBERT is a monolingual language model based on Google’s BERT architecture with the same configurations as BERT-Base.
- In this project I fine-tune ParsBert for extractive question answering task on PQuAD dataset.

- Paper presenting ParsBERT: [arXiv:2005.12515](https://arxiv.org/abs/2005.12515)

- Paper presenting PQuAD dataset: [arXiv:2202.06219](https://arxiv.org/abs/2202.06219)

- ---

- ## Introduction
-
- This model is fine-tuned on PQuAD Train set and is easily ready to use.
- Its very long training time encouraged me to publish this model in order to make life easier for those who need.
-
-
- ## Hyperparameters
- I set batch size to 32 due to the limitations of GPU memory in Google Colab.
-
- ```
- batch_size = 32
- n_epochs = 2
- base_LM_model = 'HooshvareLab/bert-fa-base-uncased'
- max_seq_len = 256
- learning_rate = 5e-5
- ```
-
- ## Performance
- Evaluated on the PQuAD Persian test set with the [official PQuAD link](https://huggingface.co/datasets/newsha/PQuAD).
- The model started to get overfitted after 2 epochs with dropout rates between 0.01 to 0.1 and
- stopped to learn with rates bigger than 0.1.
- [Our XLM-Roberta](https://huggingface.co/pedramyazdipoor/persian_xlm_roberta_large) outperforms our ParsBert on PQuAD dataset, but the former is more than 3 times bigger than the latter one; so comparing these two is not fair.
-
- ### Question Answering On Test Set of PQuAD Dataset
- | Metric | Our XLM-Roberta Large | Our ParsBert |
- |:----------------:|:---------------------:|:-------------:|
- | Exact Match | 66.56* | 47.44 |
- | F1 | 87.31* | 81.96 |
-
- ## How to use
- ## Pytorch
- ```python
- from transformers import AutoTokenizer, AutoModelForQuestionAnswering, AutoConfig
- tokenizer = AutoTokenizer.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
- model = AutoModelForQuestionAnswering.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
- config = AutoConfig.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
- ```
-
- ## Inference
- There are some considerations for inference:
- 1) Start index of answer must be smaller than end index.
- 2) The span of answer must be within the context.
- 3) The selected span must be the most probable choice among N pairs of candidates.
-
- ```python
- def generate_indexes(start_logits, end_logits, N, max_index):
-
-     output_start = start_logits
-     output_end = end_logits
-     start_indexes = np.arange(len(start_logits))
-     start_probs = output_start
-     list_start = dict(zip(start_indexes, start_probs.tolist()))
-     end_indexes = np.arange(len(end_logits))
-     end_probs = output_end
-     list_end = dict(zip(end_indexes, end_probs.tolist()))
-     sorted_start_list = sorted(list_start.items(), key=lambda x: x[1], reverse=True) #Descending sort by probability
-     sorted_end_list = sorted(list_end.items(), key=lambda x: x[1], reverse=True)
-     final_start_idx, final_end_idx = [[] for l in range(2)]
-     start_idx, end_idx, prob = 0, 0, (start_probs.tolist()[0] + end_probs.tolist()[0])
-     for a in range(0,N):
-         for b in range(0,N):
-             if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob :
-                 if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_end_list[a][0] < max_index) :
-                     prob = sorted_start_list[a][1] + sorted_end_list[b][1]
-                     start_idx = sorted_start_list[a][0]
-                     end_idx = sorted_end_list[b][0]
-     final_start_idx.append(start_idx)
-     final_end_idx.append(end_idx)
-     return final_start_idx[0], final_end_idx[0]
- ```
- ```python
- device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
- model.eval().to(device)
-
- text = 'اسمم پدرامه.'
- question = 'اسمم چیه؟'
- print(tokenizer.tokenize(text + question))
-
- encoding = tokenizer(text,question,add_special_tokens = True,
-                      return_token_type_ids = True,
-                      return_tensors = 'pt',
-                      padding = True,
-                      return_offsets_mapping = True,
-                      truncation = 'only_first',
-                      max_length = 32)
-
- out = model(encoding['input_ids'].to(device),encoding['attention_mask'].to(device), encoding['token_type_ids'].to(device))
- #we had to change some pieces of code to make it compatible with one answer generation at a time.
- #you can initialize max_index in generate_indexes() to put force on tokens being chosen to be within the context(end index must be less than seperator token).
- start_index, end_index = generate_indexes(out['start_logits'][0], out['end_logits'][0], 5, 0)
- print(tokenizer.tokenize(text + question)[start_index:end_index+1])
- >>> ['اسمم', 'پدرام', '##ه', '.', 'اسمم', 'چیه', '؟']
- >>> ['پدرام']
- ```
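The three inference considerations in the removed README (start not after end, span inside the context, best pair among N candidates) can be illustrated without the model. The following is a minimal sketch under stated assumptions, not the author's `generate_indexes`: the name `select_span` and the `context_end` parameter are hypothetical, raw logits are used directly as scores, and `start == end` is allowed so that single-token answers stay possible. Note that the removed code appears to test `sorted_end_list[a][0]` against `max_index`; this sketch checks the end index of the pair actually being scored.

```python
import numpy as np


def select_span(start_logits, end_logits, top_n=5, context_end=None):
    """Return the best (start, end) token span among the top-N candidates.

    context_end is a hypothetical parameter: the index of the first token
    after the context (for example the separator token). Spans must end
    before it. If omitted, the whole sequence is treated as context.
    Returns None when no candidate pair satisfies the constraints.
    """
    start_logits = np.asarray(start_logits, dtype=float)
    end_logits = np.asarray(end_logits, dtype=float)
    if context_end is None:
        context_end = len(end_logits)

    # Indices of the N highest-scoring start and end positions.
    top_starts = np.argsort(start_logits)[::-1][:top_n]
    top_ends = np.argsort(end_logits)[::-1][:top_n]

    best_score, best_span = -np.inf, None
    for s in top_starts:
        for e in top_ends:
            # Consideration 1: the start must not come after the end.
            # Consideration 2: the span must end inside the context.
            if s > e or e >= context_end:
                continue
            # Consideration 3: keep the highest-scoring valid pair.
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (int(s), int(e))
    return best_span
```

Returning `None` instead of a default `(0, 0)` makes the "no valid span" case explicit, so the caller can decide how to handle an unanswerable question.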
-
- ## Acknowledgments
- It would be never possible to train this model without the great job done by [HooshvareLab](https://huggingface.co/HooshvareLab/bert-base-parsbert-uncased).
- We also express our gratitude to the [Newsha Shahbodaghkhan](https://huggingface.co/datasets/newsha/PQuAD/tree/main) for facilitating dataset gathering.
-
- ## Contributors
- - Pedram Yazdipoor : [Linkedin](https://www.linkedin.com/in/pedram-yazdipour/)
-
- ## Releases
-
- ### Release v0.1 (Sep 18, 2022)
- This is the First version of our ParsBert_For_Question_Answering_PQuAD.
+ ---
+ tags:
+ - generated_from_keras_callback
+ model-index:
+ - name: parsbert_question_answering_PQuAD
+   results: []
+ ---

+ <!-- This model card has been generated automatically according to the information Keras had access to. You should
+ probably proofread and complete it, then remove this comment. -->

+ # parsbert_question_answering_PQuAD

+ This model is a fine-tuned version of [pedramyazdipoor/parsbert_question_answering_PQuAD](https://huggingface.co/pedramyazdipoor/parsbert_question_answering_PQuAD) on an unknown dataset.
+ It achieves the following results on the evaluation set:


+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - optimizer: None
+ - training_precision: float32
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 4.22.1
+ - TensorFlow 2.8.2
+ - Tokenizers 0.12.1
config.json CHANGED
@@ -1,7 +1,7 @@
  {
-   "_name_or_path": "HooshvareLab/bert-base-parsbert-uncased",
+   "_name_or_path": "pedramyazdipoor/parsbert_question_answering_PQuAD",
    "architectures": [
-     "QAModel2"
+     "BertForQuestionAnswering"
    ],
    "attention_probs_dropout_prob": 0.1,
    "classifier_dropout": null,
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ed169e2a4dfbc1b9ba785a83ee799bf7b443c10ce6a37cf59365650471f87697
+ size 649278480
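
The three added lines are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. As a rough sketch of how such a pointer could be read and used to check a downloaded `tf_model.h5`, assuming the simple `key value` line layout shown above (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and only `sha256` digests are handled):

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check that a local file has the size and SHA-256 digest in the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a large weights file is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ed169e2a4dfbc1b9ba785a83ee799bf7b443c10ce6a37cf59365650471f87697
size 649278480
"""

pointer = parse_lfs_pointer(POINTER)
```

With a local copy of the weights, `matches_pointer("tf_model.h5", pointer)` would confirm that the download is complete and unmodified; the recorded size is 649,278,480 bytes, roughly 649 MB.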