
wikipedia-dpr-dkrr-nq

Faiss FlatIP index of the Wikipedia DPR corpus, encoded by the retriever model from "Distilling Knowledge from Reader to Retriever for Question Answering" trained on NQ. This index was generated on 2022/02/17 on orca at commits:

with the following command to generate the embeddings (from the FiD repo):

python generate_passage_embeddings.py \
  --model_path nq_retriever \
  --passages passages.tsv \
  --output_path wikipedia_embeddings_nq \
  --shard_id 0 \
  --num_shards 1 \
  --per_gpu_batch_size 500

and the following command to convert the embeddings to Faiss IndexFlatIP format:

python convert_dkrr_embeddings_to_faiss.py \
  --embeddings wikipedia_embeddings_nq \
  --output faiss-flat.wikipedia.dkrr-dpr-nq-retriever
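
For readers unfamiliar with the index type: a Faiss IndexFlatIP stores the vectors uncompressed and answers each query by exhaustive inner-product search over all of them. The sketch below illustrates that operation with NumPy instead of Faiss. The function name `flat_ip_search` and the toy vectors are hypothetical; they are not part of pyserini, the FiD repo, or the conversion script above.

```python
import numpy as np


def flat_ip_search(passage_vecs, query_vecs, k):
    """Exact top-k inner-product search (illustrative stand-in for a flat IP index)."""
    # Score every query against every passage: shape (num_queries, num_passages).
    scores = query_vecs @ passage_vecs.T
    # Sort passage indices by descending score and keep the first k per query.
    top_ids = np.argsort(-scores, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top_ids, axis=1)
    return top_scores, top_ids


# Toy data (assumption): 4 "passages" with 3-dimensional embeddings.
passage_vecs = np.array(
    [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 0.0, 3.0]],
    dtype=np.float32,
)
query_vecs = np.array([[0.0, 1.0, 0.0]], dtype=np.float32)

scores, ids = flat_ip_search(passage_vecs, query_vecs, k=2)
print(ids.tolist(), scores.tolist())
```

Because the score is a raw inner product rather than a cosine similarity, vector magnitude affects the ranking: in the toy data, passage 1 outranks passage 2 partly because its embedding is longer.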