metadata
tags:
- generated_from_keras_callback
- dpr
license: apache-2.0
model-index:
- name: dpr-question_encoder_bert_uncased_L-12_H-128_A-2
results: []
dpr-question_encoder_bert_uncased_L-12_H-128_A-2
This model (google/bert_uncased_L-12_H-128_A-2) was trained from scratch on the data.retriever.nq-adv-hn-train dataset (facebookresearch/DPR). It achieves the following results on the evaluation sets:
Evaluation data
Evaluation dataset: facebook-dpr-dev-dataset from the official DPR GitHub repository.
model_name | data_name | num of queries | num of passages | R@10 | R@20 | R@50 | R@100 | R@100 |
---|---|---|---|---|---|---|---|---|
nlpconnect/dpr-ctx_encoder_bert_uncased_L-12_H-128_A-2(our) | nq-dev dataset | 6445 | 199795 | 60.53% | 68.28% | 76.07% | 80.98% | 91.45% |
*facebook/dpr-ctx_encoder-single-nq-base(hf/fb) | nq-dev dataset | 6445 | 199795 | 40.94% | 49.27% | 59.05% | 66.00% | 82.00% |
Evaluation dataset: UKPLab/beir test data, restricted to the first 2 lakh (200,000) passages.
model_name | data_name | num of queries | num of passages | R@10 | R@20 | R@50 | R@100 | R@100 |
---|---|---|---|---|---|---|---|---|
nlpconnect/dpr-ctx_encoder_bert_uncased_L-12_H-128_A-2(our) | nq-test dataset | 3452 | 200001 | 49.68% | 59.06% | 69.40% | 75.75% | 89.28% |
*facebook/dpr-ctx_encoder-single-nq-base(hf/fb) | nq-test dataset | 3452 | 200001 | 32.93% | 43.74% | 56.95% | 66.30% | 83.92% |
Note: * marks a model that we evaluated ourselves on the same evaluation dataset.
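For readers unfamiliar with the R@k columns, the following is a minimal sketch of how such a score can be computed from query and passage embeddings. It assumes R@k is the fraction of queries for which at least one relevant passage appears among the top k passages ranked by dot-product similarity; this card does not state the exact definition used. The function names and the toy vectors are hypothetical and are not taken from the evaluation scripts behind the tables.

```python
from typing import List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def recall_at_k(
    query_vecs: List[Sequence[float]],
    passage_vecs: List[Sequence[float]],
    relevant: List[Set[int]],
    k: int,
) -> float:
    """Fraction of queries with at least one relevant passage in the top k.

    Assumption: this "hit" definition matches the R@k columns; the card
    itself does not say so explicitly.
    """
    hits = 0
    for qi, q in enumerate(query_vecs):
        scores = [dot(q, p) for p in passage_vecs]
        # Rank passage indices by descending score and keep the first k.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if relevant[qi] & set(top_k):
            hits += 1
    return hits / len(query_vecs)


# Toy data (hypothetical): 2 queries, 4 passages, 2-dimensional embeddings.
passages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]
relevant = [{0}, {3}]  # indices of the relevant passages for each query

for k in (1, 2, 4):
    print(f"R@{k} = {recall_at_k(queries, passages, relevant, k):.2%}")
```

In a real evaluation the embeddings would come from the question and context encoders, and an approximate or exact nearest-neighbour index would replace the brute-force scoring loop, which is only practical for toy sizes.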
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float32
Framework versions
- Transformers 4.15.0
- TensorFlow 2.7.0
- Tokenizers 0.10.3