---
language:
- en
tags:
- question-answering
- qa
license: "apache-2.0"
datasets:
- squad
metrics:
- squad
---
# Description
This model was trained on the SQuAD v1.1 dataset from the MRQA Shared Task. The public dev set was split into two parts: one used as the dev set and the other as the test set.
# Dev results:
- `eval_exact_match`: 88.15914715400723
- `eval_f1`: 93.91715796563734
- `eval_samples`: 5291

# Test results:
- `test_exact_match`: 86.52455272173582
- `test_f1`: 92.92134442432088
- `predict_samples`: 5294
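
The exact-match and F1 numbers above are SQuAD-style metrics. As a rough illustration of what they measure, here is a minimal sketch for scoring a single prediction. It follows the answer normalization used by the SQuAD v1.1 evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names are chosen for this example, and this is not the evaluation code used to produce the results on this card. The reported scores would correspond to these per-example values averaged over the set and expressed as percentages.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score(prediction: str, references: list[str]) -> tuple[float, float]:
    """Best exact match and best F1 over all reference answers."""
    em = max(exact_match(prediction, ref) for ref in references)
    f1 = max(f1_score(prediction, ref) for ref in references)
    return em, f1
```

For example, `score("the Eiffel Tower", ["Eiffel Tower"])` counts as a full match because the article is removed during normalization, while a longer prediction such as `"Eiffel Tower in Paris"` gets no exact match but partial F1 credit.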

More info in the paper: [**MetaQA: Combining Expert Agents for Multi-Skill Question Answering**](https://arxiv.org/abs/2112.01922)