madlag committed on
Commit
c7eb664
1 Parent(s): c8c970e

Adding model card

  1. README.md +82 -0
README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ language: en
+ thumbnail:
+ license: mit
+ tags:
+ - question-answering
+ - bert
+ - bert-base
+ datasets:
+ - squad
+ metrics:
+ - squad
+ widget:
+ - text: "Where is the Eiffel Tower located?"
+   context: "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower."
+ - text: "Who is Frederic Chopin?"
+   context: "Frédéric François Chopin, born Fryderyk Franciszek Chopin (1 March 1810 – 17 October 1849), was a Polish composer and virtuoso pianist of the Romantic era who wrote primarily for solo piano."
+ ---
+
+ ## BERT-base uncased model fine-tuned on SQuAD v1
+
+ This model is [block-sparse](https://github.com/huggingface/pytorch_block_sparse).
+
+ With a runtime that supports block-sparse kernels, it can run roughly 3x faster than a dense network while keeping only 25% of the original weights.
+
+ This comes at some cost in accuracy (see the Results section below).
+
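To make the "25% of the original weights" figure concrete, the sketch below counts how many fixed-size blocks of a weight matrix contain any nonzero value. The helper name `block_density`, the toy 4x4 matrix, and the 2x2 block size are illustrative assumptions, not part of the model or of `pytorch_block_sparse`.

```python
def block_density(matrix, block_size):
    """Return the fraction of block_size x block_size blocks with any nonzero entry."""
    rows, cols = len(matrix), len(matrix[0])
    assert rows % block_size == 0 and cols % block_size == 0
    kept = 0
    total = 0
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            total += 1
            # A block counts as kept if at least one weight inside it is nonzero.
            if any(
                matrix[i][j] != 0
                for i in range(r, r + block_size)
                for j in range(c, c + block_size)
            ):
                kept += 1
    return kept / total


# Toy weights (hypothetical): only the top-left 2x2 block is nonzero.
weights = [
    [0.5, -0.1, 0.0, 0.0],
    [0.2, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

density = block_density(weights, 2)
print(density)  # 0.25: one block out of four is kept
```

A block-sparse runtime can skip the all-zero blocks entirely, which is where the speedup over a dense network would come from.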
+ It uses a modified version of Victor Sanh's [Movement Pruning](https://arxiv.org/abs/2005.07683) method.
+
+ This model was fine-tuned from the HuggingFace [BERT](https://www.aclweb.org/anthology/N19-1423/) base uncased checkpoint on [SQuAD1.1](https://rajpurkar.github.io/SQuAD-explorer).
+ This model is case-insensitive: it does not distinguish between english and English.
+
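As a rough illustration of what "uncased" means in practice, the sketch below lowercases text and strips accents, which approximates the normalization applied before WordPiece tokenization in uncased BERT checkpoints. The function name `uncased_normalize` is hypothetical, and the real tokenizer performs further steps (punctuation splitting, subword segmentation) that are not shown.

```python
import unicodedata


def uncased_normalize(text):
    """Lowercase the text and remove combining accent marks."""
    lowered = text.lower()
    # NFD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Drop the combining marks (Unicode category "Mn"), keeping the base letters.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(uncased_normalize("English"))   # english
print(uncased_normalize("Frédéric"))  # frederic
```

This is why a question spelled "Frederic Chopin" can still match a context that spells the name "Frédéric".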
+ ## Details
+
+ | Dataset  | Split | # samples |
+ | -------- | ----- | --------- |
+ | SQuAD1.1 | train | 90.6K     |
+ | SQuAD1.1 | eval  | 11.1K     |
+
+
+ ### Fine-tuning
+ - Python: `3.8.5`
+
+ - Machine specs:
+
+ `CPU: Intel(R) Core(TM) i7-6700K CPU`
+
+ `Memory: 64 GiB`
+
+ `GPUs: 1 GeForce RTX 3090, with 24GiB memory`
+
+ `GPU driver: 455.23.05, CUDA: 11.1`
+
+
+ ### Results
+
+ **Model size**: `418M`
+
+ | Metric | Value     | Original ([Table 2](https://www.aclweb.org/anthology/N19-1423.pdf)) |
+ | ------ | --------- | --------- |
+ | **EM** | **74.82** | **80.8** |
+ | **F1** | **83.7**  | **88.5** |
+
+ These results were obtained without any hyperparameter search.
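
For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are typically computed for a single prediction, following the normalization used for SQuAD v1.1 (lowercasing, removing punctuation and articles). It is a simplified sketch that assumes one reference answer per question; the official evaluation takes the maximum over several references and averages over the dataset.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts tokens shared by both answers.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower"))  # 1.0
print(f1_score("in Paris, France", "Paris"))            # 0.5
```

The second call shows why F1 is higher than EM in the table: a prediction that overlaps the reference without matching it exactly still earns partial credit.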
65
+
66
+ ## Example Usage
67
+
68
+ ```python
69
+ from transformers import pipeline
70
+
71
+ qa_pipeline = pipeline(
72
+ "question-answering",
73
+ model="madlag/bert-base-uncased-squad-v1-sparse0.25",
74
+ tokenizer="madlag/bert-base-uncased-squad-v1-sparse0.25"
75
+ )
76
+
77
+ predictions = qa_pipeline({
78
+ 'context': "Frédéric François Chopin, born Fryderyk Franciszek Chopin (1 March 1810 – 17 October 1849), was a Polish composer and virtuoso pianist of the Romantic era who wrote primarily for solo piano.",
79
+ 'question': "Who is Frederic Chopin?",
80
+ })
81
+
82
+ print(predictions)