metadata
language:
  - en
tags:
  - question-answering
  - qa
license: apache-2.0
datasets:
  - squad
metrics:
  - squad

Description

Trained on the SQuAD v1.1 dataset from the MRQA Shared Task. The public dev set was split into two parts: one used as the dev set and the other as the test set.
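The card does not say how the public dev set was split, so the following is only a minimal sketch of one plausible way to do it: shuffle with a fixed seed and cut the list in half. The function name `split_dev_set`, the seed value, and the toy examples are all assumptions made for illustration, not the procedure actually used.

```python
import random


def split_dev_set(examples, seed=0):
    """Shuffle the examples and split them into two roughly equal parts.

    Hypothetical helper: the actual split procedure is not documented here.
    """
    shuffled = list(examples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Toy stand-in for the public dev set (the real data is not loaded here).
examples = [{"id": i} for i in range(10)]
dev, test = split_dev_set(examples)
```

A fixed seed matters here because the dev and test numbers are only comparable across runs if the two parts stay the same.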

Dev results:

"eval_exact_match": 88.15914715400723, "eval_f1": 93.91715796563734, "eval_samples": 5291

Test results:

"test_exact_match": 86.52455272173582, "test_f1": 92.92134442432088, "predict_samples": 5294
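For readers unfamiliar with the two scores above, here is a simplified re-implementation of SQuAD-style exact match and token-level F1. It follows the usual SQuAD v1.1 answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace) and takes the best score over all reference answers. It is a sketch for illustration, not the evaluation script that produced the numbers in this card, and the example strings are made up.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, references):
    """Score against every reference answer and keep the highest value."""
    return max(metric(prediction, ref) for ref in references)


# Made-up example: one prediction scored against two reference answers.
references = ["Eiffel Tower", "Eiffel Tower in Paris"]
em = best_over_references(exact_match, "the Eiffel Tower", references)
f1 = best_over_references(f1_score, "the Eiffel Tower", references)
```

Dataset-level scores such as those reported above are the mean of these per-question values, multiplied by 100.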

More info in the paper: MetaQA: Combining Expert Agents for Multi-Skill Question Answering https://arxiv.org/abs/2112.01922