Microsoft reports that this model reaches 84.8/82.0 F1/EM on the dev set.
I got 81.5/78.3, but that is from a single run and I did not use the official SQuAD 2.0 evaluation script. I will do more runs and report results from the official script soon.
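For readers unfamiliar with the two numbers, here is a minimal sketch of how per-example EM and F1 are typically computed for SQuAD-style answers. It is not the official script: the function names are illustrative, and it assumes the usual normalization (lowercasing, stripping punctuation and English articles) and a single gold answer per question. It skips taking the maximum over multiple gold answers and the no-answer threshold handling, so scores from it may differ from the official ones.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Assumption: an empty string stands for "no answer"; both empty counts as a match.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

The reported figures are these per-example scores averaged over the dev set and multiplied by 100.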