harshil10 committed on
Commit
293bb9b
1 Parent(s): 3586dd9

Upload 13 files

README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ language: en
+ thumbnail:
+ tags:
+ - QA
+
+ ---
+
+ # BERT-Tiny fine-tuned on SQuAD v2
+
+ [BERT-Tiny](https://github.com/google-research/bert/), created by [Google Research](https://github.com/google-research) and fine-tuned on [SQuAD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) for the **Q&A** downstream task.
+
+ **Model size** (after training): **16.74 MB**
+
+ ## Details of BERT-Tiny and its 'family' (from their documentation)
+
+ Released on March 11th, 2020
+
+ This model is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962).
+
+ The smaller BERT models are intended for environments with restricted computational resources. They can be fine-tuned in the same manner as the original BERT models. However, they are most effective in the context of knowledge distillation, where the fine-tuning labels are produced by a larger and more accurate teacher.
+
+ ## Details of the downstream task (Q&A) - Dataset
+
+ [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
+
+ | Dataset  | Split | # samples |
+ | -------- | ----- | --------- |
+ | SQuAD2.0 | train | 130k      |
+ | SQuAD2.0 | eval  | 12.3k     |
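
To make the "abstain from answering" requirement concrete: BERT-style SQuAD 2.0 post-processing commonly compares the score of the null (no-answer) prediction with the score of the best non-empty span, and returns an empty string when the null score wins by more than a threshold. The sketch below shows only that decision rule. The function name `choose_answer`, the example scores, and the default threshold of `0.0` are illustrative assumptions, not values taken from this model's training run.

```python
def choose_answer(null_score, best_span_text, best_span_score, null_threshold=0.0):
    """Return the predicted answer text, or "" to abstain.

    null_score:      start + end logit of the [CLS] (no-answer) position
    best_span_score: start + end logit of the best non-empty span
    null_threshold:  hypothetical tuning knob; larger values abstain less often
    """
    score_diff = null_score - best_span_score
    if score_diff > null_threshold:
        return ""  # abstain: the question looks unanswerable
    return best_span_text


# Made-up scores, for illustration only.
print(repr(choose_answer(1.2, "Manuel Romero", 4.7)))  # the span outscores the null prediction
print(repr(choose_answer(6.3, "Manuel Romero", 4.7)))  # the null prediction wins, so abstain
```

In practice the threshold is usually tuned on the dev set, since it trades off accuracy on answerable questions against accuracy on unanswerable ones.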
+
+ ## Model training
+
+ The model was trained on a machine with a Tesla P100 GPU and 25 GB of RAM.
+ The fine-tuning script can be found [here](https://github.com/huggingface/transformers/tree/main/examples/legacy/question-answering).
+
+ ## Results:
+
+ | Metric | Value     |
+ | ------ | --------- |
+ | **EM** | **48.60** |
+ | **F1** | **49.73** |
+
+
+ | Model                                                                                         | EM        | F1 score  | SIZE (MB) |
+ | --------------------------------------------------------------------------------------------- | --------- | --------- | --------- |
+ | [bert-tiny-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-finetuned-squadv2)     | 48.60     | 49.73     | **16.74** |
+ | [bert-tiny-5-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-5-finetuned-squadv2) | **57.12** | **60.86** | 24.34     |
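
For readers unfamiliar with the two metrics: EM (exact match) and F1 are computed per question on normalized answer strings and then averaged over the evaluation set. The following is a minimal re-implementation sketch of the per-example logic, modelled on the normalization used by the official SQuAD evaluation script. The function names are my own, and this is not the script that produced the numbers above.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Token-level F1 between a gold answer and a prediction."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # Unanswerable case: full credit only if both sides are empty.
        return float(gold_tokens == pred_tokens)
    overlap = sum((Counter(gold_tokens) & Counter(pred_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Manuel Romero", "manuel romero."))
print(f1_score("Manuel Romero", "Romero"))
print(f1_score("", ""))  # correct abstention on an unanswerable question
```

Because an empty prediction scores 1 on every unanswerable question, a SQuAD 2.0 model that always abstains already gets a non-trivial score, which is worth keeping in mind when reading EM/F1 values near 50.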
+
+ ## Model in action
+
+ Fast usage with **pipelines**:
+
+ ```python
+ from transformers import pipeline
+
+ qa_pipeline = pipeline(
+     "question-answering",
+     model="mrm8488/bert-tiny-finetuned-squadv2",
+     tokenizer="mrm8488/bert-tiny-finetuned-squadv2"
+ )
+
+ qa_pipeline({
+     'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
+     'question': "Who has been working hard for hugginface/transformers lately?"
+
+ })
+
+ # Output:
+ ```
+
+ ```json
+ {
+   "answer": "Manuel Romero",
+   "end": 13,
+   "score": 0.05684709993458714,
+   "start": 0
+ }
+ ```
+
+ ### Yes! That was easy 🎉 Let's try with another example
+
+ ```python
+ qa_pipeline({
+     'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
+     'question': "For which company has worked Manuel Romero?"
+ })
+
+ # Output:
+ ```
+
+ ```json
+ {
+   "answer": "hugginface/transformers",
+   "end": 79,
+   "score": 0.11613431826808274,
+   "start": 56
+ }
+ ```
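
A note on reading these outputs: `start` and `end` are character offsets into the `context` string, with `end` exclusive, so slicing the context reproduces `answer`. The short check below needs no model download. It reuses the context string exactly as written in the examples (including its original spelling, since the offsets depend on it) and the two outputs copied by hand into Python dicts.

```python
context = "Manuel Romero has been working hardly in the repository hugginface/transformers lately"

# The two outputs shown above, copied as Python dicts (score omitted).
outputs = [
    {"answer": "Manuel Romero", "start": 0, "end": 13},
    {"answer": "hugginface/transformers", "start": 56, "end": 79},
]

for out in outputs:
    span = context[out["start"]:out["end"]]  # end offset is exclusive
    print(repr(span), span == out["answer"])
```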
+
+ ### It works!! 🎉 🎉 🎉
+
+ > Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
+
+ > Made with <span style="color: #e25555;">&hearts;</span> in Spain
config.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "architectures": [
+     "BertForQuestionAnswering"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 128,
+   "initializer_range": 0.02,
+   "intermediate_size": 512,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 2,
+   "num_hidden_layers": 2,
+   "output_past": true,
+   "pad_token_id": 0,
+   "type_vocab_size": 2,
+   "vocab_size": 30522
+ }
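
The configuration above is enough to estimate how large the model is. The sketch below counts parameters for a standard BERT encoder with a two-output span head, using only values copied from this `config.json`. The layer-by-layer layout is the usual BERT one and is an assumption about this checkpoint, as is the optional pooler flag. The result is roughly 4.4 million parameters, or about 17.5 million bytes as float32, which is in line with the weight-file sizes listed in this commit. The README's 16.74 MB figure appears to match the 17,553,971-byte `pytorch_model.bin` expressed in MiB.

```python
# Values copied from the config.json shown above.
cfg = {
    "vocab_size": 30522,
    "hidden_size": 128,
    "intermediate_size": 512,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "num_hidden_layers": 2,
}


def count_bert_qa_params(cfg, with_pooler=False):
    """Parameter count for a standard BERT encoder plus a start/end span head."""
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    layer_norm = 2 * h  # weight + bias
    embeddings = (
        cfg["vocab_size"] * h                   # word embeddings
        + cfg["max_position_embeddings"] * h    # position embeddings
        + cfg["type_vocab_size"] * h            # segment embeddings
        + layer_norm
    )
    attention = 4 * (h * h + h) + layer_norm    # Q, K, V and output projections
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)
    pooler = (h * h + h) if with_pooler else 0  # assumption: may or may not be stored
    qa_head = h * 2 + 2                         # start and end logits
    return embeddings + encoder + pooler + qa_head


n_params = count_bert_qa_params(cfg)
print(n_params, "parameters")
print(n_params * 4, "bytes as float32")
```

Most of the total comes from the word-embedding matrix (30522 x 128), which is why a 2-layer model is still several megabytes on disk.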
flax_model.msgpack ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9ed0e94d84dbc40d763d5fff5b11ce849426755c312978e4431586b6ecd5dde7
+ size 17480080
gitattributes.txt ADDED
@@ -0,0 +1,10 @@
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ model.safetensors filter=lfs diff=lfs merge=lfs -text
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:85de102568316fe22c76f686893a1cca0df6fee00f432fd0f5600b649bce8649
+ size 17549310
model.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73a283022b2553b466662c883cc362832862f69c5be8bee62ae9c629ea66c6eb
+ size 16258966
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b0e28d9e21beb1df101fd56b6290f4fe7ad70a5daf101251d594efe3bd80752
+ size 17553971
pytorch_model.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bcc403a1c3fd5a8e2093dff046f6f35ba7a454ce5bf1b8e1ad8d0f697e0c6bd8
+ size 16260877
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": false}
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c26310a645b2de935c72b13d46fc84f6ad0ac3dff8f756e1202d86a0b4857e2
+ size 1491
training_args.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:10bd9a1d54114ef0ffed0f4fce87e26e2893d94deb467851675ad19ea211ac7f
+ size 1073
vocab.txt ADDED
The diff for this file is too large to render. See raw diff