Add/update the quantized ONNX model files and README.md for Transformers.js v3
#3 opened by whitphx (HF Staff)
Applied Quantizations
Based on decoder_with_past_model.onnx with slimming
↳ q4f16 (added)
Based on decoder_model.onnx with slimming
↳ q4f16 (added)
Based on encoder_model.onnx with slimming
↳ q4f16 (added)
Based on decoder_model_merged.onnx without slimming
↳ fp16 (replaced because it was invalid)
↳ q4f16 (added)
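
The list above can be turned into the set of file names a reviewer might expect to find in this PR. The sketch below is only illustrative: it assumes the quantized variants follow a `<base>_<dtype>.onnx` naming pattern, which this PR description does not state explicitly, and `quantizedFileName`, `applied`, and `expectedFiles` are hypothetical names introduced for the example.

```typescript
// Dtypes mentioned in this PR.
type Dtype = "fp16" | "q4f16";

// Hypothetical helper: derive the file name of a quantized variant from its
// base file, ASSUMING the "<base>_<dtype>.onnx" naming pattern.
function quantizedFileName(baseFile: string, dtype: Dtype): string {
  const stem = baseFile.replace(/\.onnx$/, "");
  return `${stem}_${dtype}.onnx`;
}

// Variants listed in the PR description, keyed by the base file they derive from.
const applied: { [baseFile: string]: Dtype[] } = {
  "decoder_with_past_model.onnx": ["q4f16"],
  "decoder_model.onnx": ["q4f16"],
  "encoder_model.onnx": ["q4f16"],
  "decoder_model_merged.onnx": ["fp16", "q4f16"],
};

// Flatten the mapping into a list of expected file names.
const expectedFiles: string[] = [];
for (const baseFile of Object.keys(applied)) {
  for (const dtype of applied[baseFile]) {
    expectedFiles.push(quantizedFileName(baseFile, dtype));
  }
}

console.log(expectedFiles.join("\n"));
```

Comparing this list against the files actually changed in the PR is a quick way to confirm that every listed quantization has a corresponding file; if the repository uses a different naming scheme, only `quantizedFileName` needs to change.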