Generate decoder_model_merged_fp16.onnx
#5, opened by whitphx (HF Staff)
Before automating the v3 migration, I did this conversion for https://huggingface.co/onnx-community/whisper-base/discussions/4 by hand as follows:
Find invalid model files
import os

import onnx

path = "/home/ubuntu/src/models/whisper-base/onnx"
for file in os.listdir(path):
    try:
        # full_check=True also runs shape inference on top of the structural checks
        onnx.checker.check_model(os.path.join(path, file), full_check=True)
    except onnx.checker.ValidationError:
        print("Invalid:", file)
-> Only decoder_model_merged_fp16.onnx is invalid.
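A reusable variant of the check above might look like the following. This is only a sketch, not part of the original conversion: the function name `find_invalid_models`, the `.onnx` suffix filter, and the injectable `check` callable are all assumptions added for illustration, and it catches any exception rather than only the ONNX validation error.

```python
import os


def find_invalid_models(folder, check=None):
    """Return (file name, error message) pairs for .onnx files that fail `check`.

    `check` is a hypothetical hook: any callable that raises on an invalid file.
    By default it wraps onnx.checker.check_model(..., full_check=True).
    """
    if check is None:
        import onnx  # imported lazily so a custom checker can be used without onnx

        def check(model_path):
            onnx.checker.check_model(model_path, full_check=True)

    invalid = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".onnx"):
            continue  # assumption: skip anything that is not a model file
        try:
            check(os.path.join(folder, name))
        except Exception as exc:  # broad on purpose; narrow this if preferred
            invalid.append((name, str(exc)))
    return invalid
```

Called as `find_invalid_models("/home/ubuntu/src/models/whisper-base/onnx")`, it would return the list of failing files instead of printing them, which is easier to reuse when many repositories have to be checked.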
Convert the fp16 file
$ mkdir temp-whisper-base
$ mkdir quantized-whisper-base
$ cp /home/ubuntu/src/models/whisper-base/onnx/decoder_model_merged.onnx ./temp-whisper-base/.
$ python -m scripts.quantize \
    --input_folder temp-whisper-base \
    --output_folder quantized-whisper-base \
    --modes fp16
Processing temp-whisper-base/decoder_model_merged.onnx:   0%|          | 0/1 [00:00<?, ?it/s]
/home/ubuntu/src/transformers.js/scripts/float16.py:73: UserWarning: the float32 number 5.960464477539063e-08 will be truncated to 1e-07
  warnings.warn(
/home/ubuntu/src/transformers.js/scripts/float16.py:92: UserWarning: the float32 number -5.960464477539063e-08 will be truncated to -1e-07
  warnings.warn(
/home/ubuntu/src/transformers.js/scripts/float16.py:85: UserWarning: the float32 number -3.4028234663852886e+38 will be truncated to -10000.0
  warnings.warn(
 - Quantizing to fp16: 100%|██████████| 1/1 [00:04<00:00,  4.33s/it]
Processing temp-whisper-base/decoder_model_merged.onnx: 100%|██████████| 1/1 [00:04<00:00,  4.33s/it]
So the base model is the existing decoder_model_merged.onnx; I don't know whether it has been slimmed by onnxslim.
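The UserWarning lines in the conversion log show float32 constants being clamped before the cast to fp16: magnitudes below 1e-07 are raised to ±1e-07, and magnitudes above 10000.0 are capped at ±10000.0 (-3.4028234663852886e+38 is the lowest finite float32 value, typically used as a masking constant). The sketch below reproduces that behaviour in plain Python. The function name and the two thresholds are assumptions read off the warnings, not taken from scripts/float16.py itself, which may handle more cases.

```python
def clamp_for_fp16(x, min_positive=1e-7, max_finite=1e4):
    """Clamp a float32 value the way the warnings in the log suggest.

    min_positive and max_finite are assumed defaults inferred from the log.
    """
    if 0 < x < min_positive:
        return min_positive       # tiny positive values are raised
    if -min_positive < x < 0:
        return -min_positive      # tiny negative values are lowered
    if x > max_finite:
        return max_finite         # huge positive values are capped
    if x < -max_finite:
        return -max_finite        # huge negative values are capped
    return x                      # everything else passes through unchanged


print(clamp_for_fp16(5.960464477539063e-08))    # 1e-07
print(clamp_for_fp16(-5.960464477539063e-08))   # -1e-07
print(clamp_for_fp16(-3.4028234663852886e+38))  # -10000.0
```

The three calls use the exact values from the warnings, so the printed results match the "will be truncated to" figures in the log.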
Check if the quantized file is valid
python -c "import onnx; onnx.checker.check_model('quantized-whisper-base/decoder_model_merged_fp16.onnx', full_check=True)"
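For the automation mentioned at the top, the manual steps (stage the fp32 file, run the quantize script, locate the fp16 output) could be wrapped roughly as follows. This is a hedged sketch: `regenerate_fp16` and its `dry_run` flag are hypothetical names, the command simply mirrors the invocation shown earlier, and the `_fp16` output suffix is inferred from the file name checked in the validation step.

```python
import os
import shutil
import subprocess
import sys


def regenerate_fp16(src_model, staging_dir, output_dir, dry_run=False):
    """Stage one fp32 model, run the fp16 conversion, return (command, output path)."""
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)
    shutil.copy(src_model, staging_dir)

    # Same flags as the manual run; this must be executed from the
    # transformers.js checkout so that `scripts.quantize` is importable.
    cmd = [
        sys.executable, "-m", "scripts.quantize",
        "--input_folder", staging_dir,
        "--output_folder", output_dir,
        "--modes", "fp16",
    ]
    if not dry_run:
        subprocess.run(cmd, check=True)

    # Assumption: the script names its output <stem>_fp16<ext>.
    stem, ext = os.path.splitext(os.path.basename(src_model))
    return cmd, os.path.join(output_dir, f"{stem}_fp16{ext}")
```

With `dry_run=True` the function only stages the file and returns the command it would run, which makes it easy to inspect before converting anything. The returned path can then be passed to `onnx.checker.check_model(..., full_check=True)` as in the validation step.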
whitphx changed pull request title from "Upload folder using huggingface_hub" to "Generate decoder_model_merged_fp16.onnx"
Xenova changed pull request status to merged