nguyenvulebinh committed
Commit 732c309
1 Parent(s): e89f4c3
Files changed (1)
1. README.md +3 -2
README.md CHANGED
@@ -45,6 +45,7 @@ Public leaderboard | Private leaderboard
 [MRCQuestionAnswering](https://github.com/nguyenvulebinh/extractive-qa-mrc) using [XLM-RoBERTa](https://huggingface.co/transformers/model_doc/xlmroberta.html) as a pre-trained language model. By default, XLM-RoBERTa splits words into sub-words. But in my implementation, I re-combine the sub-word representations (after they are encoded by the BERT layer) into word representations using a sum strategy.
 
 ## Using pre-trained model
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Yqgdfaca7L94OyQVnq5iQq8wRTFvVZjv?usp=sharing)
 
 - Hugging Face pipeline style (**NOT using sum features strategy**).
 
@@ -70,8 +71,8 @@ from infer import tokenize_function, data_collator, extract_answer
 from model.mrc_model import MRCQuestionAnswering
 from transformers import AutoTokenizer
 
-# model_checkpoint = "nguyenvulebinh/vi-mrc-large"
-model_checkpoint = "nguyenvulebinh/vi-mrc-base"
+model_checkpoint = "nguyenvulebinh/vi-mrc-large"
+#model_checkpoint = "nguyenvulebinh/vi-mrc-base"
 tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
 model = MRCQuestionAnswering.from_pretrained(model_checkpoint)
 
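
The README context line above describes re-combining sub-word representations into word representations with a sum strategy. The snippet below is a minimal sketch of that idea, not the repo's actual code: it uses the public `xlm-roberta-base` checkpoint for illustration, assumes a fast tokenizer (so `word_ids()` is available), and simply sums the encoder outputs of all sub-word tokens that belong to the same word.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative only: checkpoint and pooling loop are assumptions, not the repo's implementation.
checkpoint = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
encoder = AutoModel.from_pretrained(checkpoint)

text = "Hugging Face pipeline"
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**enc).last_hidden_state[0]   # (num_sub_words, hidden_size)

word_ids = enc.word_ids()                          # sub-word -> word index (None for special tokens)
num_words = max(i for i in word_ids if i is not None) + 1
word_repr = torch.zeros(num_words, hidden.size(-1))

# Sum strategy: each sub-word vector is added into the representation of the word it came from.
for token_pos, word_idx in enumerate(word_ids):
    if word_idx is not None:
        word_repr[word_idx] += hidden[token_pos]

print(word_repr.shape)                             # (num_words, hidden_size)
```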
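
For the "Hugging Face pipeline style" option listed in the first hunk, a minimal usage sketch could look like the following; the question/context pair is made up for illustration, and the pipeline applies the standard QA post-processing rather than the sum-features strategy.

```python
from transformers import pipeline

# Standard question-answering pipeline; does NOT apply the sum-features strategy.
model_checkpoint = "nguyenvulebinh/vi-mrc-large"
nlp = pipeline("question-answering", model=model_checkpoint, tokenizer=model_checkpoint)

# Illustrative input only.
QA_input = {
    "question": "Bình là chuyên gia về gì?",
    "context": "Bình Nguyễn là một chuyên gia về xử lý ngôn ngữ tự nhiên.",
}
print(nlp(QA_input))  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '...'}
```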
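
The second hunk loads the tokenizer and the repo's `MRCQuestionAnswering` model and imports `tokenize_function`, `data_collator`, and `extract_answer` from `infer`. A possible continuation chaining those helpers is sketched below; the call signatures are assumptions about the repo's `infer.py`, so check that file before relying on them.

```python
# Sketch only: the signatures of tokenize_function, data_collator and extract_answer
# are assumptions about the repo's infer.py and may differ.
from infer import tokenize_function, data_collator, extract_answer
from model.mrc_model import MRCQuestionAnswering
from transformers import AutoTokenizer

model_checkpoint = "nguyenvulebinh/vi-mrc-large"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = MRCQuestionAnswering.from_pretrained(model_checkpoint)

# Illustrative question/context pair.
QA_input = {
    "question": "Bình là chuyên gia về gì?",
    "context": "Bình Nguyễn là một chuyên gia về xử lý ngôn ngữ tự nhiên.",
}

inputs = [tokenize_function(QA_input, tokenizer)]   # word-level tokenization (assumed signature)
inputs_ids = data_collator(inputs, tokenizer)       # pad and batch the features (assumed signature)
outputs = model(**inputs_ids)
answer = extract_answer(inputs, outputs, tokenizer) # decode the predicted span (assumed signature)
print(answer)
```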