pedramyazdipoor committed
Commit 67ff02c (1 parent: ed4a795)
Update README.md

README.md CHANGED
@@ -14,7 +14,7 @@ This model is fine-tuned on PQuAD Train set and is easily ready to use.
 Its very long training time encouraged me to publish this model in order to make life easier for those who need it.


-## Hyperparameters
+## Hyperparameters of training
 I set the batch size to 4 due to the limitations of GPU memory in Google Colab.
 ```
 batch_size = 4
@@ -58,7 +58,7 @@ There are some considerations for inference:
 3) The selected span must be the most probable choice among N pairs of candidates.

 ```python
-def generate_indexes(start_logits, end_logits, N, max_index):
+def generate_indexes(start_logits, end_logits, N, min_index):

     output_start = start_logits
     output_end = end_logits
@@ -79,7 +79,7 @@ def generate_indexes(start_logits, end_logits, N, max_index):
     for a in range(0,N):
         for b in range(0,N):
             if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob :
-                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_start_list[a][0] > max_index) :
+                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_start_list[a][0] > min_index) :
                     prob = sorted_start_list[a][1] + sorted_end_list[b][1]
                     start_idx = sorted_start_list[a][0]
                     end_idx = sorted_end_list[b][0]
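The diff shows only fragments of generate_indexes, so for orientation here is a minimal self-contained sketch of a function consistent with those fragments. The softmax step, the top-N selection via sorted/enumerate, and the fallback return values are assumptions for illustration, not code taken from this repository.

```python
import torch

def generate_indexes(start_logits, end_logits, N, min_index):
    # Turn the raw logits into probabilities (assumption: the original
    # may rank raw logits instead; the comparisons work either way).
    output_start = torch.softmax(start_logits, dim=-1)
    output_end = torch.softmax(end_logits, dim=-1)

    # Keep the N most probable (index, probability) pairs for the
    # start and the end position separately.
    sorted_start_list = sorted(enumerate(output_start.tolist()),
                               key=lambda p: p[1], reverse=True)[:N]
    sorted_end_list = sorted(enumerate(output_end.tolist()),
                             key=lambda p: p[1], reverse=True)[:N]

    prob = float('-inf')       # best combined score found so far
    start_idx, end_idx = 0, 0  # assumed fallback if no pair qualifies

    # Check all N x N candidate pairs: the span must not be inverted
    # (start <= end) and must begin after min_index (e.g. the separator).
    for a in range(0, N):
        for b in range(0, N):
            if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob:
                if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_start_list[a][0] > min_index):
                    prob = sorted_start_list[a][1] + sorted_end_list[b][1]
                    start_idx = sorted_start_list[a][0]
                    end_idx = sorted_end_list[b][0]

    return start_idx, end_idx
```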
@@ -94,7 +94,7 @@ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 model.eval().to(device)
 text = 'سلام من پدرامم 26 سالمه'
 question = 'چند سالمه؟'
-encoding = tokenizer(text,question,add_special_tokens = True,
+encoding = tokenizer(question,text,add_special_tokens = True,
                      return_token_type_ids = True,
                      return_tensors = 'pt',
                      padding = True,
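The swap matters because the pair is now encoded as (question, context), so the context sits after the separator token(s). A quick way to inspect the layout, assuming a Hugging Face tokenizer (convert_ids_to_tokens and sep_token_id are standard transformers attributes):

```python
# Inspect how the (question, text) pair was laid out by the tokenizer.
tokens = tokenizer.convert_ids_to_tokens(encoding['input_ids'][0])
print(tokens)

# Positions of the separator token(s); the context starts after them,
# which is exactly what min_index in generate_indexes() can enforce.
sep_positions = [i for i, tid in enumerate(encoding['input_ids'][0].tolist())
                 if tid == tokenizer.sep_token_id]
print(sep_positions)
```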
@@ -104,7 +104,7 @@ encoding = tokenizer(text,question,add_special_tokens = True,
 out = model(encoding['input_ids'].to(device),encoding['attention_mask'].to(device), encoding['token_type_ids'].to(device))
 #we had to change some pieces of the code to make it compatible with generating one answer at a time
 #If you have unanswerable questions, use out['start_logits'][0][0:] and out['end_logits'][0][0:] because <s> (the first token) stands for the no-answer case and must be compared with the other tokens.
-#you can initialize
+#you can initialize min_index in generate_indexes() to force the chosen tokens to lie within the context (the start index must be greater than the separator token's index).
 answer_start_index, answer_end_index = generate_indexes(out['start_logits'][0][1:], out['end_logits'][0][1:], 5, 0)
 print(tokenizer.tokenize(text + question))
 print(tokenizer.tokenize(text + question)[answer_start_index : (answer_end_index + 1)])
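To act on the new comment, min_index can be derived from the separator position instead of being left at 0, and the span can be decoded from the encoded ids rather than from tokenizer.tokenize(text + question), which no longer matches the (question, text) order fed to the model. A sketch, with the off-by-one caveats noted inline as assumptions:

```python
# Sketch: derive min_index from the first separator and decode the span
# from the ids the model actually saw. Assumptions: sep_token_id marks the
# question/context boundary, and the [1:] slice below shifts all indices
# down by one; verify exact offsets with tokenizer.convert_ids_to_tokens.
ids = encoding['input_ids'][0].tolist()
sep_index = ids.index(tokenizer.sep_token_id)

answer_start_index, answer_end_index = generate_indexes(
    out['start_logits'][0][1:], out['end_logits'][0][1:], 5, sep_index)

# Add 1 to undo the [1:] slice when mapping back to full-sequence ids.
span_ids = encoding['input_ids'][0][answer_start_index + 1 : answer_end_index + 2]
print(tokenizer.decode(span_ids))
```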