Update README.md
README.md (CHANGED):

````diff
@@ -37,12 +37,8 @@ The linear modules **albert.encoder.albert_layer_groups.0.albert_layers.0.ffn_ou
 
 ### Test result
 
-- Batch size = 8
-- [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge (Intel ICE Lake: 4 vCPUs, 8g Memory) instance.
-
 | |INT8|FP32|
 |---|:---:|:---:|
-| **Throughput (samples/sec)** |13.464|11.854|
 | **Accuracy (eval-accuracy)** |0.9255|0.9232|
 | **Model size (MB)** |25|44.6|
 
@@ -54,7 +50,3 @@ int8_model = OptimizedModel.from_pretrained(
     'Intel/albert-base-v2-sst2-int8-static',
 )
 ```
-
-Notes:
-- The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
````
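For context, the removed throughput row (13.464 vs 11.854 samples/sec) and the retained model-size row (25 vs 44.6 MB) together imply roughly a 14% speedup and a 44% smaller model. A quick sanity check of those ratios (plain Python arithmetic using the table's numbers, not part of the model card):

```python
# Values taken from the README's INT8/FP32 comparison table.
int8_throughput, fp32_throughput = 13.464, 11.854  # samples/sec
int8_size_mb, fp32_size_mb = 25, 44.6              # model size in MB

speedup = int8_throughput / fp32_throughput        # ~1.136x
size_reduction = 1 - int8_size_mb / fp32_size_mb   # ~0.439, i.e. ~44% smaller

print(f"speedup: {speedup:.3f}x, size reduction: {size_reduction:.1%}")
```

Note that, per the removed notes, the throughput advantage was only observed with the CPU fully occupied, so the speedup figure depends on the load conditions of the benchmark instance.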