FiD model trained on TriviaQA (TQA)

-- This is a checkpoint of the FiD (Fusion-in-Decoder) model [2], based on T5-large (770M parameters) and trained on the TriviaQA dataset [1].

-- Training setup: 8 x 40GB A100 GPUs; batch size 8; AdamW optimizer; learning rate 3e-5; 30,000 training steps

References:

[1] TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. ACL 2017.

[2] Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering. EACL 2021.

Model performance

Evaluated on the TriviaQA dataset, the model achieves an exact match (EM) score of 68.5, which is 0.8 higher than the result reported in the original paper [2].
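
For readers unfamiliar with the EM metric, the following is a minimal sketch of how an exact match score is commonly computed in open-domain QA. It assumes the widely used SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); this card does not state which evaluation script produced the 68.5 figure, so the function names `normalize_answer`, `exact_match`, and `em_score` are illustrative and not taken from the FiD codebase.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """SQuAD-style normalization (an assumption, not confirmed by this card)."""
    text = text.lower()
    # Drop ASCII punctuation characters.
    text = "".join(ch for ch in text if ch not in string.punctuation)
    # Replace English articles with a space.
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match one of their references."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from TriviaQA.
    preds = ["The Beatles", "Paris"]
    golds = [["Beatles", "the beatles"], ["London"]]
    print(em_score(preds, golds))
```

TriviaQA provides several accepted aliases per question, which is why each prediction is compared against a list of gold answers rather than a single string.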

License: cc-by-4.0