Update README.md

README.md (CHANGED)
@@ -24,7 +24,7 @@ This model is the ONNX version of [https://huggingface.co/SamLowe/roberta-base-g
 
 - that has identical accuracy/metrics to the original Transformers model
 - and has the same model size (499MB)
-- is faster
+- is faster in inference than normal Transformers, particularly for smaller batch sizes
 - in my tests about 2x to 3x as fast for a batch size of 1 on an 8 core 11th gen i7 CPU using ONNXRuntime
 
 ### Quantized (INT8) ONNX version
@@ -33,7 +33,7 @@ This model is the ONNX version of [https://huggingface.co/SamLowe/roberta-base-g
 
 - that is one quarter the size (125MB) of the full precision model (above)
 - but delivers almost all of the accuracy
-- is faster
+- is faster in inference than both the full precision ONNX above and the normal Transformers model
 - about 2x as fast for a batch size of 1 on an 8 core 11th gen i7 CPU using ONNXRuntime vs the full precision model above
 - which makes it circa 5x as fast as the full precision normal Transformers model (on the above mentioned CPU, for a batch of 1)
 
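As a minimal sketch of how the ONNX model described in the diff above might be used, assuming ONNXRuntime is installed and the exported model file is available locally (the file name `model.onnx` and the tokenizer name are assumptions, not confirmed by this commit). Since go_emotions is a multi-label task, raw logits are typically mapped to per-label probabilities with a sigmoid rather than a softmax; the runnable part below demonstrates only that post-processing step on mock logits.

```python
import numpy as np

def sigmoid(x):
    # Element-wise sigmoid: maps each logit to an independent
    # probability, as used for multi-label classification.
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical end-to-end usage (requires network/model files, so it is
# left as a comment; names here are assumptions):
#
#   import onnxruntime as ort
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")
#   sess = ort.InferenceSession("model.onnx")
#   enc = tok("I am so happy today!", return_tensors="np")
#   logits = sess.run(None, {k: v for k, v in enc.items()})[0]
#   probs = sigmoid(logits)

# Offline demonstration of the post-processing with mock logits
# for three hypothetical labels:
mock_logits = np.array([2.0, 0.0, -2.0])
probs = sigmoid(mock_logits)
print(np.round(probs, 3))
```

The sigmoid keeps each label's score independent, so several emotions can exceed a chosen threshold at once, which is the intended behavior for this multi-label model.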