---
language: en
license: cc-by-4.0
tags:
- question-answering
datasets:
- squad_v2
metrics:
- f1
- exact
widget:
- context: DeBERTa improves the BERT and RoBERTa models using disentangled attention
and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa
on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further
improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient
Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly
improves the model performance on downstream tasks. You can find more technique
details about the new model from our paper. Please check the official repository
for more implementation details and updates.
example_title: DeBERTa v3 Q1
text: How is DeBERTa version 3 different than previous ones?
- context: DeBERTa improves the BERT and RoBERTa models using disentangled attention
and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa
on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further
improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient
Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly
improves the model performance on downstream tasks. You can find more technique
details about the new model from our paper. Please check the official repository
for more implementation details and updates.
example_title: DeBERTa v3 Q2
text: Where do I go to see new info about DeBERTa?
model-index:
- name: DeBERTa v3 xsmall squad2
results:
- task:
type: question-answering
name: Question Answering
dataset:
name: SQuAD2.0
type: question-answering
metrics:
- type: f1
value: 81.5
name: f1
- type: exact
value: 78.3
name: exact
- task:
type: question-answering
name: Question Answering
dataset:
name: squad_v2
type: squad_v2
config: squad_v2
split: validation
metrics:
- type: exact_match
value: 78.5341
name: Exact Match
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTk0ZGQ1YjU1YmQ5NTc2M2RmNjg2OGViYjcyODZkOTc1MDBkNmI5MDc0MzEyMzZmNDg3Yzc4ZTA3ZjAwM2M5ZiIsInZlcnNpb24iOjF9.ewKF-UetUoxKDeXgnM6vqy8nBC9c3qh7dLZhdQlgSxPut3LjAhpCh2fJGir-OVcfzWzxsPhcZQEpdnxR8oZnAA
- type: f1
value: 81.6408
name: F1
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTQwZDdjY2ZlOGVhM2E5NGM3OGNkNTk2NWFkYTg1Y2Q0YWFlYWJmMGIyZWM5ZjMyYTYyODUzMDA0NWU0ZGVkZCIsInZlcnNpb24iOjF9.BHJNhS1YisUIkjcpIMdwXurTewak9dkkpGXC2vHvUB4qUEuk_p3V-orhmeFyTxzLaWRwrZVGVz-NSfqFr4n1Ag
- type: total
value: 11870
name: total
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzNiZDQ3MDAyNzljMDI4NTRlYzZiZjE4ODJhZDhmZWE2ZjcwNjg2ZWJmNjUyMTUzZDk4ODNjNDExYTk1YWNlOCIsInZlcnNpb24iOjF9.3BlfmMvbV86Ua39ToqnMmgpGS0ZTew0UFFYWGyTkS3u7jaAXCfYkFkNJXw806f2uFFkKr1hqlzzKfivV0wUjCg
- task:
type: question-answering
name: Question Answering
dataset:
name: squad
type: squad
config: plain_text
split: validation
metrics:
- type: exact_match
value: 84.1741
name: Exact Match
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTA0MDVlYWI5NzdiNjllM2NmZTYwYmQ5YzE0ODgwOTA3MWZjZDkxNDFmZDM1OTQzMzgwNWI4NDc5NThhM2VhZSIsInZlcnNpb24iOjF9.lc2nUBxSu2_0_a5lyVsV51UAmkE8WHDTwGHvt3n9zvCbcJ1ylOg2xovF0_j0hZS16lv1DEw5XV8EW_ZS7mfvBg
- type: f1
value: 91.0771
name: F1
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODQxMjkxOWJlZTc2MmE5YzVmMjNhOTkwNDdiMDBhNWUwMDU3MDI1MmJiNDY4MjczYjIwM2U1NDhlYmZlZWQwMSIsInZlcnNpb24iOjF9.x_axHiBX5d3UIi1UbJT3kVbdX4kX9XFLQSg-l16-AAK9tiyutT-yaYJOi8LSb2lR4677tJpf3itu4eriJRU2Cg
---
# DeBERTa v3 xsmall SQuAD 2.0
[Microsoft reports that this model can reach 84.8/82.0](https://huggingface.co/microsoft/deberta-v3-xsmall#fine-tuning-on-nlu-tasks) F1/EM on the SQuAD 2.0 dev set.
I got 81.5/78.3, but that is from a single run, and I did not use the official SQuAD 2.0 evaluation script. I plan to do more runs and report results from the official script soon.
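
For readers unfamiliar with the two numbers above, here is a minimal sketch of how SQuAD-style exact match (EM) and token-level F1 are typically computed for a single prediction. It is a simplified illustration, not the official evaluation script and not the code used to produce the scores in this card. The function names are my own, and the empty-string-means-no-answer handling is an assumption modeled on the SQuAD 2.0 convention.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Assumption: an empty string stands for "no answer", so the score
        # is 1.0 only when both sides are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = "ELECTRA-style pre-training with gradient disentangled embedding sharing"
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("ELECTRA-style pre-training", gold))
```

A dataset-level score is the average of these per-question values, expressed as a percentage. When a question has several reference answers, the best score over the references is the one that counts.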