SamLowe committed on
Commit
5b21a8d
1 Parent(s): 5dde3eb

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -24,7 +24,7 @@ This model is the ONNX version of [https://huggingface.co/SamLowe/roberta-base-g
 
 - that has identical accuracy/metrics to the original Transformers model
 - and has the same model size (499MB)
-- is faster than inference than normal Transformers, particularly for smaller batch sizes
+- is faster in inference than normal Transformers, particularly for smaller batch sizes
 - in my tests about 2x to 3x as fast for a batch size of 1 on a 8 core 11th gen i7 CPU using ONNXRuntime
 
 ### Quaantized (INT8) ONNX version
@@ -33,7 +33,7 @@ This model is the ONNX version of [https://huggingface.co/SamLowe/roberta-base-g
 
 - that is one quarter the size (125MB) of the full precision model (above)
 - but delivers almost all of the accuracy
-- is faster for inference
+- is faster in inference than both the full precision ONNX above, and the normal Transformers model
 - about 2x as fast for a batch size of 1 on an 8 core 11th gen i7 CPU using ONNXRuntime vs the full precision model above
 - which makes it circa 5x as fast as the full precision normal Transformers model (on the above mentioned CPU, for a batch of 1)
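The 2x-3x and circa-5x figures in the diff come from batch-size-1 latency comparisons on CPU. The commit does not show how they were measured; a minimal timing sketch under assumed methodology (median over repeated calls, after warmup) is below. `run_full` and `run_quantized` are hypothetical stand-ins for the two ONNXRuntime inference calls, not functions from the repository:

```python
import time

def median_latency_ms(run, warmup=3, iters=20):
    """Return the median wall-clock latency of run() in milliseconds."""
    # Warm up so one-off initialization cost does not skew the numbers.
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return samples[len(samples) // 2]  # median is robust to outliers

# Hypothetical usage with two ONNXRuntime sessions (not defined here):
# full_ms = median_latency_ms(lambda: run_full(batch_of_1))
# int8_ms = median_latency_ms(lambda: run_quantized(batch_of_1))
# print(f"speedup: {full_ms / int8_ms:.1f}x")
```

A median over warmed-up runs is one reasonable way to get stable single-sample latencies; the actual harness behind the README's numbers may differ.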