madlag committed on
Commit
03b3ede
1 Parent(s): 4e8581e

Adding modes, graphs and metadata.

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +97 -0
  2. config.json +113 -0
  3. model_card/density_info.js +174 -0
  4. model_card/images/layer_0_attention_output_dense.png +0 -0
  5. model_card/images/layer_0_attention_self_key.png +0 -0
  6. model_card/images/layer_0_attention_self_query.png +0 -0
  7. model_card/images/layer_0_attention_self_value.png +0 -0
  8. model_card/images/layer_0_intermediate_dense.png +0 -0
  9. model_card/images/layer_0_output_dense.png +0 -0
  10. model_card/images/layer_10_attention_output_dense.png +0 -0
  11. model_card/images/layer_10_attention_self_key.png +0 -0
  12. model_card/images/layer_10_attention_self_query.png +0 -0
  13. model_card/images/layer_10_attention_self_value.png +0 -0
  14. model_card/images/layer_10_intermediate_dense.png +0 -0
  15. model_card/images/layer_10_output_dense.png +0 -0
  16. model_card/images/layer_11_attention_output_dense.png +0 -0
  17. model_card/images/layer_11_attention_self_key.png +0 -0
  18. model_card/images/layer_11_attention_self_query.png +0 -0
  19. model_card/images/layer_11_attention_self_value.png +0 -0
  20. model_card/images/layer_11_intermediate_dense.png +0 -0
  21. model_card/images/layer_11_output_dense.png +0 -0
  22. model_card/images/layer_1_attention_output_dense.png +0 -0
  23. model_card/images/layer_1_attention_self_key.png +0 -0
  24. model_card/images/layer_1_attention_self_query.png +0 -0
  25. model_card/images/layer_1_attention_self_value.png +0 -0
  26. model_card/images/layer_1_intermediate_dense.png +0 -0
  27. model_card/images/layer_1_output_dense.png +0 -0
  28. model_card/images/layer_2_attention_output_dense.png +0 -0
  29. model_card/images/layer_2_attention_self_key.png +0 -0
  30. model_card/images/layer_2_attention_self_query.png +0 -0
  31. model_card/images/layer_2_attention_self_value.png +0 -0
  32. model_card/images/layer_2_intermediate_dense.png +0 -0
  33. model_card/images/layer_2_output_dense.png +0 -0
  34. model_card/images/layer_3_attention_output_dense.png +0 -0
  35. model_card/images/layer_3_attention_self_key.png +0 -0
  36. model_card/images/layer_3_attention_self_query.png +0 -0
  37. model_card/images/layer_3_attention_self_value.png +0 -0
  38. model_card/images/layer_3_intermediate_dense.png +0 -0
  39. model_card/images/layer_3_output_dense.png +0 -0
  40. model_card/images/layer_4_attention_output_dense.png +0 -0
  41. model_card/images/layer_4_attention_self_key.png +0 -0
  42. model_card/images/layer_4_attention_self_query.png +0 -0
  43. model_card/images/layer_4_attention_self_value.png +0 -0
  44. model_card/images/layer_4_intermediate_dense.png +0 -0
  45. model_card/images/layer_4_output_dense.png +0 -0
  46. model_card/images/layer_5_attention_output_dense.png +0 -0
  47. model_card/images/layer_5_attention_self_key.png +0 -0
  48. model_card/images/layer_5_attention_self_query.png +0 -0
  49. model_card/images/layer_5_attention_self_value.png +0 -0
  50. model_card/images/layer_5_intermediate_dense.png +0 -0
README.md ADDED
@@ -0,0 +1,97 @@
1
+ ---
2
+ language: en
3
+ thumbnail:
4
+ license: mit
5
+ tags:
6
+ - question-answering
7
+ - bert
8
+ - bert-base
9
+ datasets:
10
+ - squad
11
+ metrics:
12
+ - squad
13
+ widget:
14
+ - text: "Where is the Eiffel Tower located?"
15
+   context: "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower."
16
+ - text: "Who is Frederic Chopin?"
17
+   context: "Frédéric François Chopin, born Fryderyk Franciszek Chopin (1 March 1810 – 17 October 1849), was a Polish composer and virtuoso pianist of the Romantic era who wrote primarily for solo piano."
18
+ ---
19
+
20
+ ## BERT-base uncased model fine-tuned on SQuAD v1
21
+
22
+ This model was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) Python library: the **linear layers contain 15.0%** of the original weights.
23
+
24
+
25
+
26
+ The model contains **34.0%** of the original weights **overall** (the embeddings account for a significant part of the model, and they are not pruned by this method).
27
+
28
+ With a simple resizing of the linear matrices it ran **2.32x as fast as BERT-base** on the evaluation.
29
+ This is possible because the pruning method led to structured matrices: to visualize them, hover over the plot below to see the non-zero/zero parts of each matrix.
30
+
31
+ <div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/density_info.js" id="fd51557a-ad63-4088-bc25-67d39b0c0b2c"></script></div>
32
+
33
+ In terms of accuracy, its **F1 is 86.64**, compared with 88.5 for BERT-base, an **F1 drop of 1.86**.
34
+
35
+ ## Fine-Pruning details
36
+ This model was fine-tuned from the HuggingFace [BERT](https://www.aclweb.org/anthology/N19-1423/) base uncased checkpoint on [SQuAD1.1](https://rajpurkar.github.io/SQuAD-explorer), and distilled from the model [bert-large-uncased-whole-word-masking-finetuned-squad](https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad).
37
+ This model is case-insensitive: it does not distinguish between english and English.
38
+
39
+ A side-effect of the block pruning is that some of the attention heads are completely removed: 63 heads were removed out of a total of 144 (43.8%).
40
+ Here is a detailed view of how the remaining heads are distributed in the network after pruning.
41
+ <div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/pruning_info.js" id="2531d97a-f550-49d6-9b4e-8d344db20f2b"></script></div>
42
+
43
+ ## Details of the SQuAD1.1 dataset
44
+
45
+ | Dataset | Split | # samples |
46
+ | -------- | ----- | --------- |
47
+ | SQuAD1.1 | train | 90.6K |
48
+ | SQuAD1.1 | eval | 11.1K |
49
+
50
+ ### Fine-tuning
51
+ - Python: `3.8.5`
52
+
53
+ - Machine specs:
54
+
55
+ ```CPU: Intel(R) Core(TM) i7-6700K CPU
56
+ Memory: 64 GiB
57
+ GPUs: 1 GeForce RTX 3090, with 24GiB memory
58
+ GPU driver: 455.23.05, CUDA: 11.1
59
+ ```
60
+
61
+ ### Results
62
+
63
+ **PyTorch model file size**: `368M` (original BERT: `438M`)
64
+
65
+ | Metric | # Value | # Original ([Table 2](https://www.aclweb.org/anthology/N19-1423.pdf))| Variation |
66
+ | ------ | --------- | --------- | --------- |
67
+ | **EM** | **78.77** | **80.8** | **-2.03**|
68
+ | **F1** | **86.64** | **88.5** | **-1.86**|
69
+
70
+ ## Example Usage
71
+ Install nn_pruning: it contains the optimization script, which simply packs the linear layers into smaller ones by removing empty rows/columns.
72
+
73
+ `pip install nn_pruning`
74
+
75
+ Then you can use the `transformers` library almost as usual: you just have to call `optimize_model` once the pipeline has loaded.
76
+
77
+ ```python
78
+ from transformers import pipeline
79
+ from nn_pruning.inference_model_patcher import optimize_model
80
+
81
+ qa_pipeline = pipeline(
82
+     "question-answering",
83
+     model="madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1",
84
+     tokenizer="madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1"
85
+ )
86
+
87
+ print("BERT-base parameters: 110M")
88
+ print(f"Parameters count (includes head pruning)={int(qa_pipeline.model.num_parameters() / 1E6)}M")
89
+ qa_pipeline.model = optimize_model(qa_pipeline.model, "dense")
90
+
91
+ print(f"Parameters count after optimization={int(qa_pipeline.model.num_parameters() / 1E6)}M")
92
+ predictions = qa_pipeline({
93
+     'context': "Frédéric François Chopin, born Fryderyk Franciszek Chopin (1 March 1810 – 17 October 1849), was a Polish composer and virtuoso pianist of the Romantic era who wrote primarily for solo piano.",
94
+     'question': "Who is Frederic Chopin?",
95
+ })
96
+ print("Predictions", predictions)
97
+ ```
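Reviewer note: the README reports 15.0% density for the linear layers but 34.0% overall, because the embeddings are left unpruned. The sketch below is a rough sanity check of that relationship, not part of the commit. It assumes the standard BERT-base shapes listed in this commit's config.json and ignores biases, LayerNorm weights, the pooler and the QA head, so it only approximates the reported figure.

```python
# Rough parameter budget for BERT-base, using shapes from config.json.
# Assumption: biases, LayerNorm, pooler and QA head are ignored, so the
# result only approximates the 34.0% overall density stated in the README.

HIDDEN_SIZE = 768
INTERMEDIATE_SIZE = 3072
NUM_LAYERS = 12
VOCAB_SIZE = 30522
MAX_POSITIONS = 512
TYPE_VOCAB_SIZE = 2
LINEAR_DENSITY = 0.15  # fraction of linear-layer weights kept, per the README


def linear_weights_per_layer(hidden: int, intermediate: int) -> int:
    """Weights in query, key, value, attention output, intermediate and output."""
    attention = 4 * hidden * hidden
    feed_forward = 2 * hidden * intermediate
    return attention + feed_forward


def embedding_weights(vocab: int, positions: int, types: int, hidden: int) -> int:
    """Weights in the word, position and token-type embedding tables."""
    return (vocab + positions + types) * hidden


linear_total = NUM_LAYERS * linear_weights_per_layer(HIDDEN_SIZE, INTERMEDIATE_SIZE)
embeddings = embedding_weights(VOCAB_SIZE, MAX_POSITIONS, TYPE_VOCAB_SIZE, HIDDEN_SIZE)

original = linear_total + embeddings
pruned = LINEAR_DENSITY * linear_total + embeddings  # embeddings are not pruned
overall_density = pruned / original

print(f"linear weights:  {linear_total / 1e6:.1f}M")
print(f"embeddings:      {embeddings / 1e6:.1f}M")
print(f"overall density: {overall_density:.1%}")
```

Under these simplifications the estimate lands within about a point of the reported 34.0%, which is consistent with the README's explanation that the unpruned embeddings dominate what remains.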
config.json ADDED
@@ -0,0 +1,113 @@
1
+ {
2
+ "_name_or_path": "/tmp/tmpspdgp5f3",
3
+ "architectures": [
4
+ "BertForQuestionAnswering"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "gradient_checkpointing": false,
8
+ "hidden_act": "gelu",
9
+ "hidden_dropout_prob": 0.1,
10
+ "hidden_size": 768,
11
+ "initializer_range": 0.02,
12
+ "intermediate_size": 3072,
13
+ "layer_norm_eps": 1e-12,
14
+ "max_position_embeddings": 512,
15
+ "model_type": "bert",
16
+ "num_attention_heads": 12,
17
+ "num_hidden_layers": 12,
18
+ "pad_token_id": 0,
19
+ "position_embedding_type": "absolute",
20
+ "pruned_heads": {
21
+ "0": [
22
+ 0,
23
+ 2,
24
+ 4,
25
+ 5,
26
+ 6,
27
+ 7,
28
+ 11
29
+ ],
30
+ "1": [
31
+ 0,
32
+ 2,
33
+ 3,
34
+ 5,
35
+ 6,
36
+ 7,
37
+ 8
38
+ ],
39
+ "2": [
40
+ 4,
41
+ 7,
42
+ 8
43
+ ],
44
+ "3": [
45
+ 2,
46
+ 4,
47
+ 6,
48
+ 7
49
+ ],
50
+ "4": [
51
+ 1,
52
+ 2,
53
+ 11
54
+ ],
55
+ "5": [
56
+ 1,
57
+ 2,
58
+ 5,
59
+ 6,
60
+ 7,
61
+ 11
62
+ ],
63
+ "6": [
64
+ 2,
65
+ 3,
66
+ 7,
67
+ 10
68
+ ],
69
+ "7": [
70
+ 1,
71
+ 3,
72
+ 6,
73
+ 7,
74
+ 11
75
+ ],
76
+ "8": [
77
+ 0,
78
+ 3,
79
+ 4,
80
+ 8
81
+ ],
82
+ "9": [
83
+ 1,
84
+ 4,
85
+ 5,
86
+ 7,
87
+ 9,
88
+ 10
89
+ ],
90
+ "10": [
91
+ 1,
92
+ 2,
93
+ 4,
94
+ 5,
95
+ 6,
96
+ 7,
97
+ 8
98
+ ],
99
+ "11": [
100
+ 0,
101
+ 2,
102
+ 5,
103
+ 7,
104
+ 8,
105
+ 10,
106
+ 11
107
+ ]
108
+ },
109
+ "transformers_version": "4.4.2",
110
+ "type_vocab_size": 2,
111
+ "use_cache": true,
112
+ "vocab_size": 30522
113
+ }
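Reviewer note: the `pruned_heads` map in config.json can be checked against the README's statement that 63 of 144 heads were removed. The following is a minimal sketch, not part of the commit. The dictionary is copied by hand from the diff above, and `remaining_heads` is a hypothetical helper introduced only for illustration.

```python
# pruned_heads copied from the config.json diff: layer index -> removed head indices.
PRUNED_HEADS = {
    "0": [0, 2, 4, 5, 6, 7, 11],
    "1": [0, 2, 3, 5, 6, 7, 8],
    "2": [4, 7, 8],
    "3": [2, 4, 6, 7],
    "4": [1, 2, 11],
    "5": [1, 2, 5, 6, 7, 11],
    "6": [2, 3, 7, 10],
    "7": [1, 3, 6, 7, 11],
    "8": [0, 3, 4, 8],
    "9": [1, 4, 5, 7, 9, 10],
    "10": [1, 2, 4, 5, 6, 7, 8],
    "11": [0, 2, 5, 7, 8, 10, 11],
}
NUM_LAYERS = 12  # num_hidden_layers in config.json
NUM_HEADS = 12   # num_attention_heads in config.json


def remaining_heads(pruned: dict, num_layers: int, num_heads: int) -> dict:
    """Hypothetical helper: heads left in each layer after pruning."""
    return {
        layer: num_heads - len(pruned.get(str(layer), []))
        for layer in range(num_layers)
    }


removed = sum(len(heads) for heads in PRUNED_HEADS.values())
total = NUM_LAYERS * NUM_HEADS
remaining = remaining_heads(PRUNED_HEADS, NUM_LAYERS, NUM_HEADS)

print(f"removed {removed} of {total} heads ({removed / total:.1%})")
for layer, count in remaining.items():
    print(f"layer {layer:2d}: {count} heads kept")
```

The per-layer counts give a plain-text view of the same distribution that the README's `pruning_info.js` plot is meant to show.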
model_card/density_info.js ADDED
@@ -0,0 +1,174 @@
1
+ (function() {
2
+ var fn = function() {
3
+
4
+ (function(root) {
5
+ function now() {
6
+ return new Date();
7
+ }
8
+
9
+ var force = false;
10
+
11
+ if (typeof root._bokeh_onload_callbacks === "undefined" || force === true) {
12
+ root._bokeh_onload_callbacks = [];
13
+ root._bokeh_is_loading = undefined;
14
+ }
15
+
16
+
17
+
18
+
19
+ var element = document.getElementById("fd51557a-ad63-4088-bc25-67d39b0c0b2c");
20
+ if (element == null) {
21
+ console.warn("Bokeh: autoload.js configured with elementid 'fd51557a-ad63-4088-bc25-67d39b0c0b2c' but no matching script tag was found.")
22
+ }
23
+
24
+
25
+ function run_callbacks() {
26
+ try {
27
+ root._bokeh_onload_callbacks.forEach(function(callback) {
28
+ if (callback != null)
29
+ callback();
30
+ });
31
+ } finally {
32
+ delete root._bokeh_onload_callbacks
33
+ }
34
+ console.debug("Bokeh: all callbacks have finished");
35
+ }
36
+
37
+ function load_libs(css_urls, js_urls, callback) {
38
+ if (css_urls == null) css_urls = [];
39
+ if (js_urls == null) js_urls = [];
40
+
41
+ root._bokeh_onload_callbacks.push(callback);
42
+ if (root._bokeh_is_loading > 0) {
43
+ console.debug("Bokeh: BokehJS is being loaded, scheduling callback at", now());
44
+ return null;
45
+ }
46
+ if (js_urls == null || js_urls.length === 0) {
47
+ run_callbacks();
48
+ return null;
49
+ }
50
+ console.debug("Bokeh: BokehJS not loaded, scheduling load and callback at", now());
51
+ root._bokeh_is_loading = css_urls.length + js_urls.length;
52
+
53
+ function on_load() {
54
+ root._bokeh_is_loading--;
55
+ if (root._bokeh_is_loading === 0) {
56
+ console.debug("Bokeh: all BokehJS libraries/stylesheets loaded");
57
+ run_callbacks()
58
+ }
59
+ }
60
+
61
+ function on_error() {
62
+ console.error("failed to load " + url);
63
+ }
64
+
65
+ for (var i = 0; i < css_urls.length; i++) {
66
+ var url = css_urls[i];
67
+ const element = document.createElement("link");
68
+ element.onload = on_load;
69
+ element.onerror = on_error;
70
+ element.rel = "stylesheet";
71
+ element.type = "text/css";
72
+ element.href = url;
73
+ console.debug("Bokeh: injecting link tag for BokehJS stylesheet: ", url);
74
+ document.body.appendChild(element);
75
+ }
76
+
77
+ const hashes = {"https://cdn.bokeh.org/bokeh/release/bokeh-2.2.3.min.js": "T2yuo9Oe71Cz/I4X9Ac5+gpEa5a8PpJCDlqKYO0CfAuEszu1JrXLl8YugMqYe3sM", "https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.2.3.min.js": "98GDGJ0kOMCUMUePhksaQ/GYgB3+NH9h996V88sh3aOiUNX3N+fLXAtry6xctSZ6", "https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.2.3.min.js": "89bArO+nlbP3sgakeHjCo1JYxYR5wufVgA3IbUvDY+K7w4zyxJqssu7wVnfeKCq8"};
78
+
79
+ for (var i = 0; i < js_urls.length; i++) {
80
+ var url = js_urls[i];
81
+ var element = document.createElement('script');
82
+ element.onload = on_load;
83
+ element.onerror = on_error;
84
+ element.async = false;
85
+ element.src = url;
86
+ if (url in hashes) {
87
+ element.crossOrigin = "anonymous";
88
+ element.integrity = "sha384-" + hashes[url];
89
+ }
90
+ console.debug("Bokeh: injecting script tag for BokehJS library: ", url);
91
+ document.head.appendChild(element);
92
+ }
93
+ };
94
+
95
+ function inject_raw_css(css) {
96
+ const element = document.createElement("style");
97
+ element.appendChild(document.createTextNode(css));
98
+ document.body.appendChild(element);
99
+ }
100
+
101
+
102
+ var js_urls = ["https://cdn.bokeh.org/bokeh/release/bokeh-2.2.3.min.js", "https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.2.3.min.js", "https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.2.3.min.js"];
103
+ var css_urls = [];
104
+
105
+
106
+ var inline_js = [
107
+ function(Bokeh) {
108
+ Bokeh.set_log_level("info");
109
+ },
110
+
111
+ function(Bokeh) {
112
+ (function() {
113
+ var fn = function() {
114
+ Bokeh.safely(function() {
115
+ (function(root) {
116
+ function embed_document(root) {
117
+
118
+ var docs_json = '{"832f21f0-0877-47a5-9ae3-3ca13f735411":{"roots":{"references":[{"attributes":{"axis_label":"Layer","formatter":{"id":"1148"},"minor_tick_line_color":null,"ticker":{"id":"1107"}},"id":"1106","type":"LinearAxis"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#20cb97"},"line_alpha":{"value":0.1},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1131","type":"VBar"},{"attributes":{},"id":"1156","type":"UnionRenderers"},{"attributes":{"data":{"density":["26.4%","31.4%","51.0%","51.9%","51.4%","34.0%","40.6%","39.4%","34.2%","27.1%","23.1%","19.3%"],"height":[0.155648,0.185344,0.301056,0.306176,0.303104,0.200704,0.239616,0.232448,0.201728,0.159744,0.136192,0.113664],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.key","1.attention.key","2.attention.key","3.attention.key","4.attention.key","5.attention.key","6.attention.key","7.attention.key","8.attention.key","9.attention.key","10.attention.key","11.attention.key"],"parameters":["0.16","0.19","0.30","0.31","0.30","0.20","0.24","0.23","0.20","0.16","0.14","0.11"],"url":["/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_2_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_5_attention
_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_attention_self_key.png"],"x":[0.25,1.25,2.25,3.25,4.25,5.25,6.25,7.25,8.25,9.25,10.25,11.25]},"selected":{"id":"1153"},"selection_policy":{"id":"1152"}},"id":"1122","type":"ColumnDataSource"},{"attributes":{"data":{"density":["30.0%","26.7%","39.8%","50.0%","42.5%","38.2%","34.9%","35.4%","29.0%","12.2%","13.2%","8.3%"],"height":[0.177152,0.157696,0.234496,0.294912,0.25088,0.22528,0.205824,0.208896,0.171008,0.07168,0.077824,0.049152],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.value","1.attention.value","2.attention.value","3.attention.value","4.attention.value","5.attention.value","6.attention.value","7.attention.value","8.attention.value","9.attention.value","10.attention.value","11.attention.value"],"parameters":["0.18","0.16","0.23","0.29","0.25","0.23","0.21","0.21","0.17","0.07","0.08","0.05"],"url":["/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-
hybrid-v1/raw/main/model_card/images/layer_2_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_5_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_attention_self_value.png"],"x":[0.41666666666666663,1.4166666666666665,2.416666666666667,3.416666666666667,4.416666666666666,5.416666666666666,6.416666666666666,7.416666666666666,8.416666666666668,9.416666666666668,10.416666666666668,11.416666666666668]},"selected":{"id":"1155"},"selection_policy":{"id":"1154"}},"id":"1128","type":"ColumnDataSource"},{"attributes":{"axis":{"id":"1110"},"dimension":1,"ticker":null},"id":"1113","type":"Grid"},{"attributes":{},"id":"1148","type":"BasicTickFormatter"},{"attributes":{},"id":"1098","type":"DataRange1d"},{"attributes":{"fill_color":{"value":"#ed5642"},"line_color":{"value":"#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1124","type":"VBar"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#ed5642"},"line_alpha":{"value":0.1},"line_color":{"value":"
#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1125","type":"VBar"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#aa69f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1137","type":"VBar"},{"attributes":{"label":{"value":"key"},"renderers":[{"id":"1126"}]},"id":"1142","type":"LegendItem"},{"attributes":{},"id":"1153","type":"Selection"},{"attributes":{"source":{"id":"1134"}},"id":"1139","type":"CDSView"},{"attributes":{},"id":"1152","type":"UnionRenderers"},{"attributes":{"data":{"density":["26.2%","31.2%","51.6%","47.9%","49.1%","30.4%","37.8%","35.9%","34.0%","25.7%","24.1%","19.8%"],"height":[0.154624,0.18432,0.304128,0.282624,0.289792,0.1792,0.223232,0.211968,0.200704,0.151552,0.142336,0.116736],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.query","1.attention.query","2.attention.query","3.attention.query","4.attention.query","5.attention.query","6.attention.query","7.attention.query","8.attention.query","9.attention.query","10.attention.query","11.attention.query"],"parameters":["0.15","0.18","0.30","0.28","0.29","0.18","0.22","0.21","0.20","0.15","0.14","0.12"],"url":["/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_2_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_attenti
on_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_5_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_attention_self_query.png"],"x":[0.08333333333333333,1.0833333333333333,2.0833333333333335,3.0833333333333335,4.083333333333333,5.083333333333333,6.083333333333333,7.083333333333333,8.083333333333334,9.083333333333334,10.083333333333334,11.083333333333334]},"selected":{"id":"1151"},"selection_policy":{"id":"1150"}},"id":"1116","type":"ColumnDataSource"},{"attributes":{"fill_color":{"value":"#6573f7"},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1118","type":"VBar"},{"attributes":{"label":{"value":"fully 
connected"},"renderers":[{"id":"1138"}]},"id":"1144","type":"LegendItem"},{"attributes":{},"id":"1151","type":"Selection"},{"attributes":{"source":{"id":"1122"}},"id":"1127","type":"CDSView"},{"attributes":{"label":{"value":"value"},"renderers":[{"id":"1132"}]},"id":"1143","type":"LegendItem"},{"attributes":{"data_source":{"id":"1122"},"glyph":{"id":"1124"},"hover_glyph":null,"muted_glyph":null,"name":"key","nonselection_glyph":{"id":"1125"},"selection_glyph":null,"view":{"id":"1127"}},"id":"1126","type":"GlyphRenderer"},{"attributes":{},"id":"1146","type":"BasicTickFormatter"},{"attributes":{},"id":"1102","type":"LinearScale"},{"attributes":{},"id":"1107","type":"BasicTicker"},{"attributes":{},"id":"1157","type":"Selection"},{"attributes":{"data_source":{"id":"1116"},"glyph":{"id":"1118"},"hover_glyph":null,"muted_glyph":null,"name":"query","nonselection_glyph":{"id":"1119"},"selection_glyph":null,"view":{"id":"1121"}},"id":"1120","type":"GlyphRenderer"},{"attributes":{"source":{"id":"1116"}},"id":"1121","type":"CDSView"},{"attributes":{"items":[{"id":"1141"},{"id":"1142"},{"id":"1143"},{"id":"1144"}],"location":[10,0],"orientation":"horizontal"},"id":"1140","type":"Legend"},{"attributes":{},"id":"1104","type":"LinearScale"},{"attributes":{"fill_color":{"value":"#aa69f7"},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1136","type":"VBar"},{"attributes":{"axis":{"id":"1106"},"grid_line_color":null,"ticker":null},"id":"1109","type":"Grid"},{"attributes":{},"id":"1154","type":"UnionRenderers"},{"attributes":{"label":{"value":"query"},"renderers":[{"id":"1120"}]},"id":"1141","type":"LegendItem"},{"attributes":{},"id":"1111","type":"BasicTicker"},{"attributes":{"data_source":{"id":"1128"},"glyph":{"id":"1130"},"hover_glyph":null,"muted_glyph":null,"name":"value","nonselection_glyph":{"id":"1131"},"selection_glyph":null,"view":{"id":"1133"}},"id":"1132","type":"GlyphRenderer"},{"attributes":{"data":{"density":["
31.8%","6.6%","6.6%","27.4%","9.2%","9.2%","42.0%","11.5%","11.5%","51.9%","12.0%","12.0%","44.3%","12.5%","12.5%","36.1%","10.9%","10.9%","36.3%","9.4%","9.4%","32.8%","6.8%","6.8%","26.9%","3.5%","3.5%","11.3%","1.8%","1.8%","13.2%","2.6%","2.6%","9.4%","3.3%","3.3%"],"height":[0.187392,0.155136,0.155136,0.161792,0.218112,0.218112,0.247808,0.271872,0.271872,0.306176,0.282624,0.282624,0.26112,0.294912,0.294912,0.212992,0.25728,0.25728,0.214016,0.221184,0.221184,0.193536,0.16128,0.16128,0.15872,0.083712,0.083712,0.06656,0.04224,0.04224,0.077824,0.060672,0.060672,0.055296,0.078336,0.078336],"img_height":["96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px"],"img_width":["96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px"],"name":["0.attention.output","0.intermediate","0.output","1.attention.output","1.intermediate","1.output","2.attention.output","2.intermediate","2.output","3.attention.output","3.intermediate","3.output","4.attention.output","4.intermediate","4.output","5.attention.output","5.intermediate","5.output","6.attention.output","6.intermediate","6.output","7.attention.output","7.intermediate","7.output","8.attention.output","8.intermediate","8.output","9.attention.output","9.intermediate","9.output","10.attention.output","10.intermediate","10.output","11.attention.output","11.intermediate","11.output"],"parameters":["0.19","0.16","0.16","0.16","0.22","0.22","0.25","0.27","0.27","0.31","0.28","0.28","0.26","0.29","0.29","0.21","0.26","0.26","0.21","0.22","0.22","0.19","0.16","0.16","0.16","0.08","0.08","0.07","0.04","0.04","0.08","0.06","0.06
","0.06","0.08","0.08"],"url":["/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_0_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_1_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_2_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_2_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_2_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_3_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_4_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_5_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybr
id-v1/raw/main/model_card/images/layer_5_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_5_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_6_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_7_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_8_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_9_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_10_output_dense.png","/madlag/bert-base-uncased-s
quadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.32-f86.6-d15-hybrid-v1/raw/main/model_card/images/layer_11_output_dense.png"],"x":[0.5833333333333334,0.75,0.9166666666666667,1.5833333333333333,1.75,1.9166666666666665,2.5833333333333335,2.75,2.916666666666667,3.5833333333333335,3.75,3.916666666666667,4.583333333333333,4.75,4.916666666666666,5.583333333333333,5.75,5.916666666666666,6.583333333333333,6.75,6.916666666666666,7.583333333333333,7.75,7.916666666666666,8.583333333333334,8.75,8.916666666666668,9.583333333333334,9.75,9.916666666666668,10.583333333333334,10.75,10.916666666666668,11.583333333333334,11.75,11.916666666666668]},"selected":{"id":"1157"},"selection_policy":{"id":"1156"}},"id":"1134","type":"ColumnDataSource"},{"attributes":{"text":"Transformer Layers"},"id":"1096","type":"Title"},{"attributes":{"above":[{"id":"1140"}],"below":[{"id":"1106"}],"center":[{"id":"1109"},{"id":"1113"}],"left":[{"id":"1110"}],"outline_line_color":null,"plot_height":300,"plot_width":505,"renderers":[{"id":"1120"},{"id":"1126"},{"id":"1132"},{"id":"1138"}],"title":{"id":"1096"},"toolbar":{"id":"1114"},"x_range":{"id":"1098"},"x_scale":{"id":"1102"},"y_range":{"id":"1100"},"y_scale":{"id":"1104"}},"id":"1095","subtype":"Figure","type":"Plot"},{"attributes":{},"id":"1150","type":"UnionRenderers"},{"attributes":{"source":{"id":"1128"}},"id":"1133","type":"CDSView"},{"attributes":{},"id":"1155","type":"Selection"},{"attributes":{"axis_label":"Parameters 
(M)","formatter":{"id":"1146"},"minor_tick_line_color":null,"ticker":{"id":"1111"}},"id":"1110","type":"LinearAxis"},{"attributes":{"active_drag":"auto","active_inspect":"auto","active_multi":null,"active_scroll":"auto","active_tap":"auto","tools":[{"id":"1094"}]},"id":"1114","type":"Toolbar"},{"attributes":{"start":0},"id":"1100","type":"DataRange1d"},{"attributes":{"fill_color":{"value":"#20cb97"},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1130","type":"VBar"},{"attributes":{"callback":null,"tooltips":"\\n &lt;div&gt;\\n &lt;div style=\\"margin-bottom:10px\\"&gt;\\n &lt;span style=\\"font-size: 15px;\\"&gt;&lt;b&gt;@name&lt;/b&gt;&lt;br/&gt;density=@density&lt;/span&gt;\\n &lt;/div&gt;\\n &lt;div&gt; \\n &lt;img\\n src=\\"@url\\" height=\\"@img_height\\" width=\\"@img_width\\" alt=\\"@url\\"\\n style=\\"float: left; margin: 0px 15px 15px 0px;\\"\\n border=\\"0\\"\\n /&gt;\\n &lt;/div&gt;\\n &lt;/div&gt;\\n "},"id":"1094","type":"HoverTool"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#6573f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1119","type":"VBar"},{"attributes":{"data_source":{"id":"1134"},"glyph":{"id":"1136"},"hover_glyph":null,"muted_glyph":null,"name":"fully connected","nonselection_glyph":{"id":"1137"},"selection_glyph":null,"view":{"id":"1139"}},"id":"1138","type":"GlyphRenderer"}],"root_ids":["1095"]},"title":"Bokeh Application","version":"2.2.3"}}';
+ var render_items = [{"docid":"832f21f0-0877-47a5-9ae3-3ca13f735411","root_ids":["1095"],"roots":{"1095":"fd51557a-ad63-4088-bc25-67d39b0c0b2c"}}];
+ root.Bokeh.embed.embed_items(docs_json, render_items);
+
+ }
+ if (root.Bokeh !== undefined) {
+ embed_document(root);
+ } else {
+ var attempts = 0;
+ var timer = setInterval(function(root) {
+ if (root.Bokeh !== undefined) {
+ clearInterval(timer);
+ embed_document(root);
+ } else {
+ attempts++;
+ if (attempts > 100) {
+ clearInterval(timer);
+ console.log("Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing");
+ }
+ }
+ }, 10, root)
+ }
+ })(window);
+ });
+ };
+ if (document.readyState != "loading") fn();
+ else document.addEventListener("DOMContentLoaded", fn);
+ })();
+ },
+ function(Bokeh) {
+
+
+ }
+ ];
+
+ function run_inline_js() {
+
+ for (var i = 0; i < inline_js.length; i++) {
+ inline_js[i].call(root, root.Bokeh);
+ }
+
+ }
+
+ if (root._bokeh_is_loading === 0) {
+ console.debug("Bokeh: BokehJS loaded, going straight to plotting");
+ run_inline_js();
+ } else {
+ load_libs(css_urls, js_urls, function() {
+ console.debug("Bokeh: BokehJS plotting callback run at", now());
+ run_inline_js();
+ });
+ }
+ }(window));
+ };
+ if (document.readyState != "loading") fn();
+ else document.addEventListener("DOMContentLoaded", fn);
+ })();
model_card/images/layer_0_attention_output_dense.png ADDED
model_card/images/layer_0_attention_self_key.png ADDED
model_card/images/layer_0_attention_self_query.png ADDED
model_card/images/layer_0_attention_self_value.png ADDED
model_card/images/layer_0_intermediate_dense.png ADDED
model_card/images/layer_0_output_dense.png ADDED
model_card/images/layer_10_attention_output_dense.png ADDED
model_card/images/layer_10_attention_self_key.png ADDED
model_card/images/layer_10_attention_self_query.png ADDED
model_card/images/layer_10_attention_self_value.png ADDED
model_card/images/layer_10_intermediate_dense.png ADDED
model_card/images/layer_10_output_dense.png ADDED
model_card/images/layer_11_attention_output_dense.png ADDED
model_card/images/layer_11_attention_self_key.png ADDED
model_card/images/layer_11_attention_self_query.png ADDED
model_card/images/layer_11_attention_self_value.png ADDED
model_card/images/layer_11_intermediate_dense.png ADDED
model_card/images/layer_11_output_dense.png ADDED
model_card/images/layer_1_attention_output_dense.png ADDED
model_card/images/layer_1_attention_self_key.png ADDED
model_card/images/layer_1_attention_self_query.png ADDED
model_card/images/layer_1_attention_self_value.png ADDED
model_card/images/layer_1_intermediate_dense.png ADDED
model_card/images/layer_1_output_dense.png ADDED
model_card/images/layer_2_attention_output_dense.png ADDED
model_card/images/layer_2_attention_self_key.png ADDED
model_card/images/layer_2_attention_self_query.png ADDED
model_card/images/layer_2_attention_self_value.png ADDED
model_card/images/layer_2_intermediate_dense.png ADDED
model_card/images/layer_2_output_dense.png ADDED
model_card/images/layer_3_attention_output_dense.png ADDED
model_card/images/layer_3_attention_self_key.png ADDED
model_card/images/layer_3_attention_self_query.png ADDED
model_card/images/layer_3_attention_self_value.png ADDED
model_card/images/layer_3_intermediate_dense.png ADDED
model_card/images/layer_3_output_dense.png ADDED
model_card/images/layer_4_attention_output_dense.png ADDED
model_card/images/layer_4_attention_self_key.png ADDED
model_card/images/layer_4_attention_self_query.png ADDED
model_card/images/layer_4_attention_self_value.png ADDED
model_card/images/layer_4_intermediate_dense.png ADDED
model_card/images/layer_4_output_dense.png ADDED
model_card/images/layer_5_attention_output_dense.png ADDED
model_card/images/layer_5_attention_self_key.png ADDED
model_card/images/layer_5_attention_self_query.png ADDED
model_card/images/layer_5_attention_self_value.png ADDED
model_card/images/layer_5_intermediate_dense.png ADDED