soheeyang committed on
Commit
d77ce80
1 Parent(s): 43ba9c8

Update readme

Files changed (1)
  1. README.md +9 -10
README.md CHANGED
@@ -12,15 +12,14 @@ This model is the question encoder of RDR trained solely on Natural Questions (N
 
  The following is the answer recall rate measured using PyTorch 1.4.0 and transformers 4.5.0.
 
- The values of DPR on the NQ dev set are taken from Table 1 of the paper of [RDR](https://arxiv.org/abs/2010.10999). The values of DPR on the NQ test set are taken from the [codebase of DPR](https://github.com/facebookresearch/DPR). DPR-adv is the a new DPR model released in March 2021. It is trained on the original DPR NQ train set and its version where hard negatives are mined using DPR index itself using the previous NQ checkpoint. Please refer to the [codebase of DPR](https://github.com/facebookresearch/DPR) for more details about DPR-adv-hn.
 
- | | Top-K Passages | 1 | 5 | 20 | 50 | 100 |
- |---------|------------------|-------|-------|-------|-------|-------|
- | **NQ Dev** | **DPR (approx)** | 44.2 | - | 76.9 | 81.3 | 84.2 |
- | | **RDR (This Model)** | **54.43** | **72.17** | **81.33** | **84.8** | **86.61** |
- | **NQ Test** | **DPR** | 45.87 | 68.14 | 79.97 | - | 85.87 |
- | | **DPR-adv-hn** | 52.47 | **72.24** | 81.33 | - | 87.29 |
- | | **RDR (This Model)** | **54.29** | 72.16 | **82.8** | **86.34** | **88.2** |
 
  ## How to Use
 
@@ -33,8 +32,8 @@ Therefore, please specify the exact class to use the model.
  ```python
  from transformers import DPRQuestionEncoder, AutoTokenizer
 
- tokenizer = AutoTokenizer.from_pretrained("soheeyang/rdr-question_encoder-single-nq-base")
- question_encoder = DPRQuestionEncoder.from_pretrained("soheeyang/rdr-question_encoder-single-nq-base")
 
  data = tokenizer("question comes here", return_tensors="pt")
  question_embedding = question_encoder(**data).pooler_output # embedding vector for question
 
 
  The following is the answer recall rate measured using PyTorch 1.4.0 and transformers 4.5.0.
 
+ For the values of DPR, those in parentheses are directly taken from the paper. The values without parentheses are reported using the reproduction of DPR that consists of [this question encoder](https://huggingface.co/soheeyang/dpr-question_encoder-single-trivia-base) and [this question encoder](https://huggingface.co/soheeyang/dpr-question_encoder-single-trivia-base).
 
+ | | Top-K Passages | 1 | 5 | 20 | 50 | 100 |
+ |-------------|------------------|-----------|-----------|-----------|-----------|-----------|
+ |**TriviaQA Dev** | **DPR** | 54.27 | 71.11 | 79.53 | 82.72 | 85.07 |
+ | | **RDR (This Model)** | **61.84** | **75.93** | **82.56** | **85.35** | **87.00** |
+ |**TriviaQA Test**| **DPR** | 54.41 | 70.99 | 79.31 (79.4) | 82.90 | 84.99 (85.0) |
+ | | **RDR (This Model)** | **62.56** | **75.92** | **82.52** | **85.64** | **87.26** |
 
  ## How to Use
 
 
  ```python
  from transformers import DPRQuestionEncoder, AutoTokenizer
 
+ tokenizer = AutoTokenizer.from_pretrained("soheeyang/rdr-question_encoder-single-trivia-base")
+ question_encoder = DPRQuestionEncoder.from_pretrained("soheeyang/rdr-question_encoder-single-trivia-base")
 
  data = tokenizer("question comes here", return_tensors="pt")
  question_embedding = question_encoder(**data).pooler_output # embedding vector for question
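
The tables in this diff report top-k answer recall: the percentage of questions for which at least one of the k highest-ranked passages contains a gold answer. The following is a minimal sketch of that metric, not the evaluation code behind the reported numbers. The function name `top_k_recall` and its arguments are hypothetical, and the case-insensitive substring match is an assumption that is likely simpler than the answer matching used for the figures above.

```python
from typing import Dict, Optional, Sequence


def top_k_recall(
    retrieved: Sequence[Sequence[str]],
    answers: Sequence[Sequence[str]],
    ks: Sequence[int] = (1, 5, 20, 50, 100),
) -> Dict[int, float]:
    """Percentage of questions with an answer-bearing passage in the top k.

    retrieved[i] holds the passages for question i, best-ranked first.
    answers[i] holds the acceptable answer strings for question i.
    """
    if len(retrieved) != len(answers):
        raise ValueError("retrieved and answers must have the same length")
    if not retrieved:
        raise ValueError("need at least one question")

    hits = {k: 0 for k in ks}
    for passages, golds in zip(retrieved, answers):
        # 1-based rank of the first passage containing any gold answer.
        # Assumption: plain case-insensitive substring matching.
        first_hit: Optional[int] = next(
            (
                rank
                for rank, passage in enumerate(passages, start=1)
                if any(gold.lower() in passage.lower() for gold in golds)
            ),
            None,
        )
        for k in ks:
            if first_hit is not None and first_hit <= k:
                hits[k] += 1

    return {k: 100.0 * hits[k] / len(retrieved) for k in ks}


# Toy example: two questions, two ranked passages each.
retrieved = [
    ["Paris is the capital of France.", "Lyon is a city in France."],
    ["The Nile flows through Egypt.", "Mount Everest is in the Himalayas."],
]
answers = [["Paris"], ["Everest"]]
print(top_k_recall(retrieved, answers, ks=(1, 2)))  # {1: 50.0, 2: 100.0}
```

In practice, `retrieved` would come from ranking passage embeddings against the `question_embedding` produced in the snippet above; that retrieval step is outside the scope of this sketch.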