# ELECTRA-BASE-DISCRIMINATOR finetuned on SQuADv1

This model is **electra-base-discriminator** fine-tuned on the SQuADv1 dataset for the question-answering task.

## Model details

As mentioned in the original paper: ELECTRA is a new method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.

| Param               | Value  |
|---------------------|--------|
| layers              | 12     |
| hidden size         | 768    |
| num attention heads | 12     |
| on-disk size        | 436 MB |
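
To make the discriminator objective above concrete, here is a small sketch (using the upstream `google/electra-base-discriminator` checkpoint, **not** this fine-tuned QA model) that asks the discriminator which input tokens it thinks were replaced:

```python3
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# upstream pre-trained checkpoint, not the fine-tuned QA model
discriminator = ElectraForPreTraining.from_pretrained('google/electra-base-discriminator')
tokenizer = ElectraTokenizerFast.from_pretrained('google/electra-base-discriminator')

# "fake" replaces the original token "jumps"
inputs = tokenizer('The quick brown fox fake over the lazy dog', return_tensors='pt')
logits = discriminator(**inputs)[0].squeeze()

# a positive logit means the discriminator flags the token as replaced
for token, logit in zip(tokenizer.convert_ids_to_tokens(inputs['input_ids'].squeeze()), logits):
    print(f'{token:>10}  {"fake" if logit > 0 else "real"}')
```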

## Model training

This model was trained on a Google Colab V100 GPU.
You can find the fine-tuning colab here
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11yo-LaFsgggwmDSy2P8zD3tzf5cCb-DU?usp=sharing).
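
For reference, the starting point looks roughly like this (a minimal sketch, assuming the generic `Auto*` classes; the colab above contains the actual training code and hyperparameters):

```python3
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('google/electra-base-discriminator')
model = AutoModelForQuestionAnswering.from_pretrained('google/electra-base-discriminator')
# this adds a randomly initialized span-prediction head (start/end logits)
# on top of the pre-trained discriminator encoder, ready for SQuAD fine-tuning
```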

## Results

The results are actually slightly better than those reported in the paper, where the authors mention that electra-base achieves 84.5 EM and 90.8 F1.

| Metric | Value   |
|--------|---------|
| EM     | 85.0520 |
| F1     | 91.6050 |

## Model in Action 🚀

```python3
from transformers import pipeline

nlp = pipeline('question-answering', model='valhalla/electra-base-discriminator-finetuned_squadv1')
nlp({
    'question': 'What is the answer to everything ?',
    'context': '42 is the answer to life the universe and everything'
})
# => {'answer': '42', 'end': 2, 'score': 0.981274963050339, 'start': 0}
```
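
If you'd rather load the model and tokenizer explicitly, an equivalent sketch using the generic `Auto*` classes:

```python3
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained('valhalla/electra-base-discriminator-finetuned_squadv1')
model = AutoModelForQuestionAnswering.from_pretrained('valhalla/electra-base-discriminator-finetuned_squadv1')

nlp = pipeline('question-answering', model=model, tokenizer=tokenizer)
nlp({
    'question': 'What is the answer to everything ?',
    'context': '42 is the answer to life the universe and everything'
})
```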

> Created with ❤️ by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)