add experience
- README.md +183 -0
- runs.json +0 -0
- tensorboard/1657700383.6577654/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.1 +3 -0
- tensorboard/1657700383.6590903/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.2 +3 -0
- tensorboard/1657700383.6603022/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.3 +3 -0
- tensorboard/1657700383.6618648/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.4 +3 -0
- tensorboard/1657700383.6629946/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.5 +3 -0
- tensorboard/1657700383.66408/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.6 +3 -0
- tensorboard/1657700383.6652703/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.7 +3 -0
- tensorboard/1657700383.6664448/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.8 +3 -0
- tensorboard/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.0 +3 -0
README.md
ADDED
@@ -0,0 +1,183 @@
---
pipeline_tag: token-classification
datasets:
- conll2003
metrics:
- precision
- recall
- f1
- accuracy
tags:
- distilbert
---

**task**: `token-classification`
**Backend:** `sagemaker-training`
**Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
**Number of evaluation samples:** `All dataset`

Fixed parameters:
* **model_name_or_path**: `elastic/distilbert-base-uncased-finetuned-conll03-english`
* **dataset**:
  * **path**: `conll2003`
  * **eval_split**: `validation`
  * **data_keys**: `{'primary': 'tokens'}`
  * **ref_keys**: `['ner_tags']`
  * **calibration_split**: `train`
* **per_channel**: `False`
* **calibration**:
  * **method**: `minmax`
  * **num_calibration_samples**: `100`
* **framework**: `onnxruntime`
* **framework_args**:
  * **opset**: `11`
  * **optimization_level**: `1`
* **aware_training**: `False`
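
For readers unfamiliar with the `minmax` calibration method listed above, the following is a minimal, generic sketch of how min-max calibration can derive quantization parameters from a handful of calibration samples. It is not ONNX Runtime's implementation; the function names, the unsigned 8-bit range, and the toy samples are assumptions made for illustration.

```python
def minmax_qparams(samples, qmin=0, qmax=255):
    """Derive an asymmetric (scale, zero_point) pair from the observed min/max."""
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    # Fall back to a scale of 1.0 if every observed value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer in [qmin, qmax], clamping out-of-range values."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical activations recorded on two calibration samples.
calibration_samples = [[-1.0, 0.5], [0.0, 3.0]]
scale, zero_point = minmax_qparams(calibration_samples)
```

Because the range is taken from the extreme values seen during calibration, a few outliers can stretch the scale and reduce resolution for typical values.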

Benchmarked parameters:
* **quantization_approach**: `dynamic`, `static`
* **operators_to_quantize**: `['Add']`, `['Add', 'MatMul']`
* **node_exclusion**: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
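
The three benchmarked parameters each take two values, which gives the eight configurations reported in the tables. A minimal sketch of enumerating that grid follows; the `grid` and `configs` names are illustrative and not taken from the benchmarking tool.

```python
from itertools import product

# Benchmarked parameter values, copied from the list above.
grid = {
    "quantization_approach": ["dynamic", "static"],
    "operators_to_quantize": [["Add"], ["Add", "MatMul"]],
    "node_exclusion": [[], ["layernorm", "gelu", "residual", "gather", "softmax"]],
}

# Cartesian product of all values: one dict per benchmarked configuration.
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]

for config in configs:
    print(config)
```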

# Evaluation
## Non-time metrics
| quantization_approach | operators_to_quantize | node_exclusion | | precision (original) | precision (optimized) | | recall (original) | recall (optimized) | | f1 (original) | f1 (optimized) | | accuracy (original) | accuracy (optimized) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.934 | \| | 0.944 | 0.942 | \| | 0.940 | 0.938 | \| | 0.988 | 0.988 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 0.936 | 0.934 | \| | 0.944 | 0.942 | \| | 0.940 | 0.938 | \| | 0.988 | 0.988 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.936 | \| | 0.944 | 0.944 | \| | 0.940 | 0.940 | \| | 0.988 | 0.988 |
| `dynamic` | `['Add']` | `[]` | \| | 0.936 | 0.936 | \| | 0.944 | 0.944 | \| | 0.940 | 0.940 | \| | 0.988 | 0.988 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.904 | \| | 0.944 | 0.921 | \| | 0.940 | 0.912 | \| | 0.988 | 0.984 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 0.936 | 0.065 | \| | 0.944 | 0.243 | \| | 0.940 | 0.103 | \| | 0.988 | 0.357 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.909 | \| | 0.944 | 0.930 | \| | 0.940 | 0.919 | \| | 0.988 | 0.986 |
| `static` | `['Add']` | `[]` | \| | 0.936 | 0.050 | \| | 0.944 | 0.160 | \| | 0.940 | 0.076 | \| | 0.988 | 0.311 |
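
The README does not say how the f1 column is computed. The reported values are consistent with f1 being the harmonic mean of the overall precision and recall, which the sketch below checks for a few rows under that assumption. The `f1_score` helper is hypothetical, and its inputs are the rounded figures from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1) for selected rows of the table above.
rows = [
    (0.936, 0.944, 0.940),  # original model
    (0.904, 0.921, 0.912),  # static, ['Add', 'MatMul'], with node exclusion
    (0.065, 0.243, 0.103),  # static, ['Add', 'MatMul'], no node exclusion
    (0.050, 0.160, 0.076),  # static, ['Add'], no node exclusion
]

for precision, recall, reported in rows:
    print(f"computed={f1_score(precision, recall):.3f} reported={reported:.3f}")
```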

## Time metrics
Time benchmarks were run for 15 seconds per config.

Below, time metrics for batch size = 1, input length = 32.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 32.90 | 7.03 | \| | 30.40 | 142.20 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 48.27 | 7.68 | \| | 20.73 | 130.33 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 33.74 | 14.73 | \| | 29.67 | 67.93 |
| `dynamic` | `['Add']` | `[]` | \| | 33.49 | 14.17 | \| | 29.87 | 70.60 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 47.72 | 8.20 | \| | 21.00 | 121.93 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 47.87 | 10.58 | \| | 20.93 | 94.60 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 45.77 | 19.00 | \| | 21.87 | 52.67 |
| `static` | `['Add']` | `[]` | \| | 44.67 | 18.77 | \| | 22.40 | 53.33 |
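
To compare configurations it can help to reduce each row to a single speedup figure. The sketch below does this for the first row of the table above. It also checks an observation inferred from the numbers rather than stated in the README: throughput looks close to `1000 / latency_mean`, which would mean it counts forward passes per second. The helper names are illustrative.

```python
def speedup(latency_original_ms, latency_optimized_ms):
    """How many times faster the optimized model is than the original."""
    return latency_original_ms / latency_optimized_ms


def throughput_from_latency(latency_ms):
    """Forward passes per second implied by a mean latency in milliseconds."""
    return 1000.0 / latency_ms


# First row of the table above: dynamic, ['Add', 'MatMul'], with node exclusion.
original_ms, optimized_ms = 32.90, 7.03
reported_original_tp, reported_optimized_tp = 30.40, 142.20

print(f"speedup: {speedup(original_ms, optimized_ms):.2f}x")
print(f"implied throughput (original): {throughput_from_latency(original_ms):.2f}/s")
print(f"implied throughput (optimized): {throughput_from_latency(optimized_ms):.2f}/s")
```

A speedup below 1.0 marks a configuration where the quantized model was slower than the original, as happens in several `['Add']`-only rows at larger batch sizes.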

Below, time metrics for batch size = 1, input length = 64.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 59.15 | 13.60 | \| | 16.93 | 73.53 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 44.01 | 12.60 | \| | 22.73 | 79.40 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 60.50 | 29.87 | \| | 16.53 | 33.53 |
| `dynamic` | `['Add']` | `[]` | \| | 45.35 | 24.10 | \| | 22.07 | 41.53 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 59.98 | 16.08 | \| | 16.73 | 62.20 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 43.23 | 19.02 | \| | 23.20 | 52.60 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 43.15 | 32.96 | \| | 23.20 | 30.40 |
| `static` | `['Add']` | `[]` | \| | 44.01 | 31.68 | \| | 22.80 | 31.60 |

Below, time metrics for batch size = 1, input length = 128.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 55.20 | 25.72 | \| | 18.13 | 38.93 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 73.52 | 26.70 | \| | 13.67 | 37.47 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 71.60 | 53.26 | \| | 14.00 | 18.80 |
| `dynamic` | `['Add']` | `[]` | \| | 70.39 | 56.68 | \| | 14.27 | 17.67 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 71.34 | 31.75 | \| | 14.07 | 31.53 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 73.55 | 37.95 | \| | 13.60 | 26.40 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 70.28 | 62.70 | \| | 14.27 | 16.00 |
| `static` | `['Add']` | `[]` | \| | 63.86 | 61.64 | \| | 15.67 | 16.27 |

Below, time metrics for batch size = 4, input length = 32.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 70.41 | 22.67 | \| | 14.27 | 44.13 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 71.65 | 21.44 | \| | 14.00 | 46.67 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 71.72 | 55.16 | \| | 14.00 | 18.13 |
| `dynamic` | `['Add']` | `[]` | \| | 55.56 | 43.87 | \| | 18.00 | 22.80 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 55.45 | 27.83 | \| | 18.07 | 36.00 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 66.57 | 34.45 | \| | 15.07 | 29.07 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 55.23 | 59.31 | \| | 18.13 | 16.87 |
| `static` | `['Add']` | `[]` | \| | 58.80 | 66.03 | \| | 17.07 | 15.20 |

Below, time metrics for batch size = 4, input length = 64.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 117.71 | 43.93 | \| | 8.53 | 22.80 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 90.01 | 43.27 | \| | 11.13 | 23.13 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 94.34 | 107.02 | \| | 10.60 | 9.40 |
| `dynamic` | `['Add']` | `[]` | \| | 119.11 | 82.46 | \| | 8.40 | 12.13 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 120.57 | 54.70 | \| | 8.33 | 18.33 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 120.00 | 57.85 | \| | 8.40 | 17.33 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 119.57 | 92.50 | \| | 8.40 | 10.87 |
| `static` | `['Add']` | `[]` | \| | 117.35 | 102.09 | \| | 8.53 | 9.80 |

Below, time metrics for batch size = 4, input length = 128.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 220.69 | 94.33 | \| | 4.53 | 10.67 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 170.04 | 81.68 | \| | 5.93 | 12.27 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 188.59 | 171.79 | \| | 5.33 | 5.87 |
| `dynamic` | `['Add']` | `[]` | \| | 219.80 | 163.62 | \| | 4.60 | 6.13 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 220.25 | 94.05 | \| | 4.60 | 10.67 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 222.90 | 135.06 | \| | 4.53 | 7.47 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 177.41 | 211.89 | \| | 5.67 | 4.73 |
| `static` | `['Add']` | `[]` | \| | 168.30 | 201.88 | \| | 6.00 | 5.00 |

Below, time metrics for batch size = 8, input length = 32.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 106.46 | 42.35 | \| | 9.47 | 23.67 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 88.68 | 43.33 | \| | 11.33 | 23.13 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 91.32 | 92.08 | \| | 11.00 | 10.87 |
| `dynamic` | `['Add']` | `[]` | \| | 88.33 | 94.18 | \| | 11.33 | 10.67 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 107.47 | 44.74 | \| | 9.33 | 22.40 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 118.39 | 64.56 | \| | 8.47 | 15.53 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 87.05 | 111.36 | \| | 11.53 | 9.00 |
| `static` | `['Add']` | `[]` | \| | 116.96 | 98.82 | \| | 8.60 | 10.13 |

Below, time metrics for batch size = 8, input length = 64.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 165.67 | 87.71 | \| | 6.07 | 11.47 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 214.59 | 87.88 | \| | 4.67 | 11.40 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 216.06 | 163.75 | \| | 4.67 | 6.13 |
| `dynamic` | `['Add']` | `[]` | \| | 176.69 | 209.28 | \| | 5.67 | 4.80 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 215.12 | 86.90 | \| | 4.67 | 11.53 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 215.99 | 130.39 | \| | 4.67 | 7.73 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 213.87 | 224.50 | \| | 4.73 | 4.47 |
| `static` | `['Add']` | `[]` | \| | 211.16 | 193.01 | \| | 4.80 | 5.20 |

Below, time metrics for batch size = 8, input length = 128.

| quantization_approach | operators_to_quantize | node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| `dynamic` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 391.16 | 183.35 | \| | 2.60 | 5.47 |
| `dynamic` | `['Add', 'MatMul']` | `[]` | \| | 414.42 | 154.52 | \| | 2.47 | 6.53 |
| `dynamic` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 314.12 | 323.94 | \| | 3.20 | 3.13 |
| `dynamic` | `['Add']` | `[]` | \| | 408.15 | 325.03 | \| | 2.47 | 3.13 |
| `static` | `['Add', 'MatMul']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 337.57 | 205.59 | \| | 3.00 | 4.87 |
| `static` | `['Add', 'MatMul']` | `[]` | \| | 375.10 | 225.09 | \| | 2.67 | 4.47 |
| `static` | `['Add']` | `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 409.68 | 493.00 | \| | 2.47 | 2.07 |
| `static` | `['Add']` | `[]` | \| | 397.28 | 397.74 | \| | 2.53 | 2.53 |

runs.json
ADDED
The diff for this file is too large to render.
tensorboard/1657700383.6577654/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.1
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4430eeeccb2f1503945f3e4a1e40a022029d252eb77cc6c4e690e763d1557bc
size 836
tensorboard/1657700383.6590903/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.2
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b40ea66f8c6d33dd784a79a025a72a8f620cb4d842bbfda474deb78e02007ca6
size 784
tensorboard/1657700383.6603022/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.3
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:745489e88f543434bb54a0be055bc28ad4a2d07c21376e864f1b58f4eca32b8f
size 826
tensorboard/1657700383.6618648/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.4
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1ff58bef1b8870bedcb4d413155a4764462c6e894e8ea6a35d6da346c36ce0f6
size 774
tensorboard/1657700383.6629946/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.5
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b7bd8135a1bf222a30d4144f51924733cc81426e60130a535a756290fcf9885c
size 835
tensorboard/1657700383.66408/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.6
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d51d3537cec8ebb15b69bf587441d370a0169149ae2c6a8e931212404a766ada
size 783
tensorboard/1657700383.6652703/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.7
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7321af88466337a931544eecff51d5d249b20f8f9f8f7619cc16f954198c436f
size 825
tensorboard/1657700383.6664448/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.8
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c85bbb8846a885365609041dbaecad315572bd1a1e0a9ac23469369683e24aad
size 773
tensorboard/events.out.tfevents.1657700383.ip-10-2-195-98.ec2.internal.1.0
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:462aa7fbfe97b4102531f9adf20e8463c32f225ca927df0fc7717958dcbaec82
size 40
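
The tensorboard entries above are Git LFS pointer files, not the event files themselves: each holds a spec version, a SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of reading one such pointer follows. The `parse_lfs_pointer` helper is hypothetical and handles only the three keys shown in this commit.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its fields."""
    # Each non-empty line is "<key> <value>", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer content of the last tensorboard file in this commit.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:462aa7fbfe97b4102531f9adf20e8463c32f225ca927df0fc7717958dcbaec82
size 40
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```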