pedramyazdipoor committed
Commit
67ff02c
1 Parent(s): ed4a795

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -14,7 +14,7 @@ This model is fine-tuned on PQuAD Train set and is easily ready to use.
 Its very long training time encouraged me to publish this model in order to make life easier for those who need it.
 
 
-## Hyperparameters
+## Hyperparameters of training
 I set batch size to 4 due to the limitations of GPU memory in Google Colab.
 ```
 batch_size = 4
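The batch size of 4 above is a memory constraint, not a tuned choice. As an aside that is not part of this commit or the README, gradient accumulation is a common way to emulate a larger effective batch under the same Colab memory ceiling; every value below except the batch size is a hypothetical placeholder:

```python
from transformers import TrainingArguments

# Sketch only: a per-device batch of 4 accumulated over 8 steps gives an
# effective batch of 32 without using more GPU memory per step.
training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,   # matches the README's batch_size = 4
    gradient_accumulation_steps=8,   # hypothetical, not in the README
    learning_rate=3e-5,              # hypothetical placeholder
    num_train_epochs=2,              # hypothetical placeholder
)
```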
@@ -58,7 +58,7 @@ There are some considerations for inference:
 3) The selected span must be the most probable choice among N pairs of candidates.
 
 ```python
-def generate_indexes(start_logits, end_logits, N, max_index):
+def generate_indexes(start_logits, end_logits, N, min_index):
 
 output_start = start_logits
 output_end = end_logits
@@ -79,7 +79,7 @@ def generate_indexes(start_logits, end_logits, N, max_index):
     for a in range(0,N):
         for b in range(0,N):
             if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob :
-                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_end_list[a][0] < max_index) :
+                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_start_list[a][0] > min_index) :
                     prob = sorted_start_list[a][1] + sorted_end_list[b][1]
                     start_idx = sorted_start_list[a][0]
                     end_idx = sorted_end_list[b][0]
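To make the patched condition easier to follow, here is a self-contained sketch of generate_indexes() consistent with the visible diff lines. The softmax/top-N preamble is an assumption: the diff elides the lines that build sorted_start_list and sorted_end_list, so only the candidate-pair loop is taken from the commit:

```python
import torch

def generate_indexes(start_logits, end_logits, N, min_index):
    # Assumed preamble: turn the logits into probabilities and keep the N
    # most probable start/end positions as (index, probability) pairs.
    start_probs = torch.softmax(start_logits, dim=0).tolist()
    end_probs = torch.softmax(end_logits, dim=0).tolist()
    sorted_start_list = sorted(enumerate(start_probs), key=lambda p: p[1], reverse=True)[:N]
    sorted_end_list = sorted(enumerate(end_probs), key=lambda p: p[1], reverse=True)[:N]

    prob = -1.0
    start_idx, end_idx = 0, 0
    # From the diff: scan the N x N candidate pairs and keep the highest-
    # scoring span whose start does not come after its end and whose start
    # lies beyond min_index (i.e. inside the context, not the question).
    for a in range(0, N):
        for b in range(0, N):
            if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob:
                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_start_list[a][0] > min_index):
                    prob = sorted_start_list[a][1] + sorted_end_list[b][1]
                    start_idx = sorted_start_list[a][0]
                    end_idx = sorted_end_list[b][0]
    return start_idx, end_idx
```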
@@ -94,7 +94,7 @@ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 model.eval().to(device)
 text = 'سلام من پدرامم 26 سالمه'
 question = 'چند سالمه؟'
-encoding = tokenizer(text,question,add_special_tokens = True,
+encoding = tokenizer(question,text,add_special_tokens = True,
                      return_token_type_ids = True,
                      return_tensors = 'pt',
                      padding = True,
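Because the patched call passes the question first and the context second, the answer span now sits after the first separator token. A minimal sketch of deriving min_index from that boundary, reusing the README's variables (sep_token_id is the standard Hugging Face tokenizer attribute):

```python
# Everything after the first separator token belongs to the context.
input_ids = encoding['input_ids'][0]
sep_positions = (input_ids == tokenizer.sep_token_id).nonzero(as_tuple=True)[0]

# The logits are later sliced with [1:], which shifts positions down by one,
# so subtract 1 to express the boundary in the sliced coordinate system.
min_index = int(sep_positions[0]) - 1
```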
@@ -104,7 +104,7 @@ encoding = tokenizer(text,question,add_special_tokens = True,
 out = model(encoding['input_ids'].to(device),encoding['attention_mask'].to(device), encoding['token_type_ids'].to(device))
 #we changed some pieces of code to make it compatible with generating one answer at a time
 #If you have unanswerable questions, use out['start_logits'][0][0:] and out['end_logits'][0][0:] because <s> (the 1st token) stands for that case and must be compared with the other tokens.
-#you can initialize max_index in generate_indexes() to force the chosen tokens to lie within the context (the end index must be less than the separator token's index).
+#you can initialize min_index in generate_indexes() to force the chosen tokens to lie within the context (the start index must be greater than the separator token's index).
 answer_start_index, answer_end_index = generate_indexes(out['start_logits'][0][1:], out['end_logits'][0][1:], 5, 0)
 print(tokenizer.tokenize(text + question))
 print(tokenizer.tokenize(text + question)[answer_start_index : (answer_end_index + 1)])
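Note that the final two lines re-tokenize text + question, which no longer matches the question-first encoding introduced by this commit. A hedged alternative, not in the README, decodes the span straight from the encoded input; the +1/+2 offsets compensate for slicing the logits with [1:]:

```python
# Decode the predicted span from the tokens the model actually saw.
tokens = tokenizer.convert_ids_to_tokens(encoding['input_ids'][0].tolist())
span_tokens = tokens[answer_start_index + 1 : answer_end_index + 2]
print(tokenizer.convert_tokens_to_string(span_tokens))
```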