# DPRQuestionEncoder for TriviaQA

## dpr-question_encoder-single-trivia-base

Dense Passage Retrieval (`DPR`)

Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih, [Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906), EMNLP 2020.

This model is the question encoder of DPR trained solely on TriviaQA (single-trivia) using the [official implementation of DPR](https://github.com/facebookresearch/DPR).

Disclaimer: This model was not released by the authors of DPR; it is my own reproduction. The authors did not release DPR weights trained solely on TriviaQA, so I hope this checkpoint is helpful to those who want to use DPR trained only on TriviaQA.

## Performance

The table below reports the answer recall rate (%) over the top-K retrieved passages, measured using PyTorch 1.4.0 and transformers 4.5.0.

The values in parentheses are those reported in the paper.

| Top-K Passages | TriviaQA Dev | TriviaQA Test |
|----------------|--------------|---------------|
| 1              | 54.27        | 54.41         |
| 5              | 71.11        | 70.99         |
| 20             | 79.53        | 79.31 (79.4)  |
| 50             | 82.72        | 82.99         |
| 100            | 85.07        | 84.99 (85.0)  |

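For readers unfamiliar with the metric, here is a minimal sketch of how a top-K answer recall rate can be computed. It assumes the usual definition: the percentage of questions for which at least one of the top-K retrieved passages contains a gold answer. The function name `answer_recall_at_k`, the input format, and the toy data are hypothetical and are not taken from the official DPR evaluation code.

```python
from typing import List


def answer_recall_at_k(hits_per_question: List[List[bool]], k: int) -> float:
    """Percentage of questions with an answer-bearing passage in the top k.

    hits_per_question[i][j] is True when the j-th ranked passage retrieved
    for question i contains a gold answer (hypothetical input format).
    """
    if not hits_per_question:
        raise ValueError("need at least one question")
    # A question counts as answered if any of its first k passages is a hit.
    answered = sum(any(hits[:k]) for hits in hits_per_question)
    return 100.0 * answered / len(hits_per_question)


# Toy data: three questions, five ranked passages each.
toy_hits = [
    [False, True, False, False, False],   # first hit at rank 2
    [False, False, False, False, True],   # first hit at rank 5
    [False, False, False, False, False],  # no passage contains the answer
]

recall_at_1 = answer_recall_at_k(toy_hits, k=1)
recall_at_2 = answer_recall_at_k(toy_hits, k=2)
recall_at_5 = answer_recall_at_k(toy_hits, k=5)
```

Under this definition the rate can only stay the same or grow as K increases, which matches the trend in the table above.
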
## How to Use

`AutoModel` does not properly detect whether the checkpoint is for `DPRContextEncoder` or `DPRQuestionEncoder`, so please specify the exact class when loading the model.

```python
from transformers import DPRQuestionEncoder, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("soheeyang/dpr-question_encoder-single-trivia-base")
question_encoder = DPRQuestionEncoder.from_pretrained("soheeyang/dpr-question_encoder-single-trivia-base")

data = tokenizer("question comes here", return_tensors="pt")
question_embedding = question_encoder(**data).pooler_output  # embedding vector for question
```
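
DPR scores a question against a passage by the inner product of their embeddings, so the question embedding is only useful together with passage embeddings from a matching context encoder, which this checkpoint does not include. The sketch below illustrates that ranking step on plain Python lists so it runs without downloading any model. The helper names `dot` and `rank_passages` and the three-dimensional toy vectors are hypothetical stand-ins for real embeddings.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_passages(
    question_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs for the top_k highest inner products."""
    scored = [(idx, dot(question_vec, vec)) for idx, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for a question embedding and three passage embeddings.
toy_question = [1.0, 0.0, 2.0]
toy_passages = [
    [0.5, 0.5, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 2.0, 0.5],
]

top_two = rank_passages(toy_question, toy_passages, top_k=2)
```

With real embeddings, the same ranking is usually done as a single matrix product over tensors, or with an approximate nearest-neighbor index when the passage collection is large.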