Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ✅ Based on `decoder_with_past_model.onnx` *with* slimming
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
### ✅ Based on `decoder_model.onnx` *with* slimming
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
### ✅ Based on `encoder_model.onnx` *with* slimming
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
### ❌ Based on `decoder_model_merged.onnx` *with* slimming
```
0%| | 0/1 [00:00<?, ?it/s]
Processing /var/folders/0t/802mlc4s6bdcbjp2lt8x9v_h0000gn/T/tmp0mhfuqak/decoder_model_merged.onnx: 0%| | 0/1 [00:00<?, ?it/s]
0%| | 0/2 [00:00<?, ?it/s][A
- Quantizing to fp16: 0%| | 0/2 [00:00<?, ?it/s][A/Users/whitphx/src/tjsmigration/transformers.js/scripts/float16.py:73: UserWarning: the float32 number 5.960464477539063e-08 will be truncated to 1e-07
warnings.warn(
/Users/whitphx/src/tjsmigration/transformers.js/scripts/float16.py:92: UserWarning: the float32 number -5.960464477539063e-08 will be truncated to -1e-07
warnings.warn(
/Users/whitphx/src/tjsmigration/transformers.js/scripts/float16.py:85: UserWarning: the float32 number -3.4028234663852886e+38 will be truncated to -10000.0
warnings.warn(
- Quantizing to fp16: 0%| | 0/2 [00:00<?, ?it/s]
Processing /var/folders/0t/802mlc4s6bdcbjp2lt8x9v_h0000gn/T/tmp0mhfuqak/decoder_model_merged.onnx: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
main()
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
quantize(input_folder, output_folder, quantization_args)
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/quantize.py", line 309, in quantize
quantize_fp16(
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/quantize.py", line 223, in quantize_fp16
check_and_save_model(model_fp16, save_path)
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/utils.py", line 29, in check_and_save_model
strict_check_model(model)
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/utils.py", line 21, in strict_check_model
raise e
File "/Users/whitphx/src/tjsmigration/transformers.js/scripts/utils.py", line 16, in strict_check_model
onnx.checker.check_model(model_or_path, full_check=True)
File "/Users/whitphx/.cache/uv/archive-v0/IJsPiE4p57ikf3MwkZL1A/lib/python3.12/site-packages/onnx/checker.py", line 179, in check_model
C.check_model(
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] Inference error(s): (op_type:If, node name: optimum::if): [ShapeInferenceError] Inference error(s): (op_type:Add, node name: /model/decoder/embed_positions/Add): [ShapeInferenceError] Inferred shape and existing shape differ in rank: (1) vs (0)
```
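The `UserWarning` lines in the log above report float32 values being clamped before the fp16 cast; they are separate from the `ShapeInferenceError` that actually aborts the run. Below is a minimal sketch of that clamping, assuming the bounds `1e-7` and `1e4` that can be read off the warnings. The function name `clampForFp16` is hypothetical, and this is not the code in `float16.py`.

```javascript
// Bounds inferred from the warnings in the log (assumption, not read from float16.py).
const MIN_POSITIVE = 1e-7; // magnitudes below this are raised to it
const MAX_FINITE = 1e4;    // magnitudes above this are lowered to it

// Clamp a single float32 value into the range kept for the fp16 cast.
function clampForFp16(x, minPositive = MIN_POSITIVE, maxFinite = MAX_FINITE) {
  if (x === 0 || Number.isNaN(x)) return x; // zero and NaN pass through unchanged
  const sign = Math.sign(x);
  const magnitude = Math.abs(x);
  if (magnitude < minPositive) return sign * minPositive;
  if (magnitude > maxFinite) return sign * maxFinite;
  return x;
}

// The three values reported in the log.
const clamped = [
  5.960464477539063e-08,
  -5.960464477539063e-08,
  -3.4028234663852886e+38,
].map((value) => clampForFp16(value));
```

The last value is the float32 minimum, which is commonly used as a "minus infinity" filler in attention masks, so clamping it to `-10000` is usually harmless.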
### ✅ Based on `decoder_model_merged.onnx` *without* slimming
↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
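The output names listed above follow one pattern: the base file stem, an underscore, the quantization name, then `.onnx`. A small sketch of that mapping is below. `quantizedFileName` is a hypothetical helper, and it deliberately covers only the two suffixes that appear in this list (`fp16` and `q4f16`), since other quantization modes may be named differently.

```javascript
// Suffixes seen in this change; other modes are intentionally not guessed.
const KNOWN_SUFFIXES = new Set(['fp16', 'q4f16']);

// Derive the quantized file name from a base ONNX file name.
function quantizedFileName(baseFileName, dtype) {
  if (!baseFileName.endsWith('.onnx')) {
    throw new Error(`Expected an .onnx file name, got: ${baseFileName}`);
  }
  if (!KNOWN_SUFFIXES.has(dtype)) {
    throw new Error(`Suffix for dtype "${dtype}" is not covered by this sketch`);
  }
  const stem = baseFileName.slice(0, -'.onnx'.length);
  return `${stem}_${dtype}.onnx`;
}

const examples = [
  quantizedFileName('encoder_model.onnx', 'q4f16'),
  quantizedFileName('decoder_model_merged.onnx', 'fp16'),
];
```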
@@ -5,4 +5,21 @@ library_name: transformers.js

 https://huggingface.co/openai/whisper-tiny with ONNX weights to be compatible with Transformers.js.

+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Transcribe audio from a URL.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const transcriber = await pipeline('automatic-speech-recognition', 'whitphx/test-transformersjs-whisper-tiny');
+const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
+const output = await transcriber(url);
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
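The README example in the hunk above loads the model with default settings. To try the `q4f16` files from this change, Transformers.js v3 accepts a `dtype` option, given either as one string or as a per-file map. The sketch below is hedged: the map keys are assumed to be the ONNX file stems listed in this PR, `transcribe` is a hypothetical wrapper, and whether `q4f16` runs on a particular backend is not stated here.

```javascript
const MODEL_ID = 'whitphx/test-transformersjs-whisper-tiny';

// Per-file dtype map; keys are assumed to match the ONNX file stems in this repo.
const pipelineOptions = {
  dtype: {
    encoder_model: 'q4f16',
    decoder_model_merged: 'q4f16',
  },
};

// Load the pipeline lazily and transcribe the audio at `url`.
async function transcribe(url) {
  // Dynamic import keeps this sketch loadable even before the package is installed.
  const { pipeline } = await import('@huggingface/transformers');
  const transcriber = await pipeline('automatic-speech-recognition', MODEL_ID, pipelineOptions);
  return transcriber(url);
}
```

Calling `transcribe(url)` with the `jfk.wav` URL from the README would download the selected weights on first use, so it needs network access.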
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:a1aac8aa5f4e9e835e343025edf5e4d1644ab6f629c0f8c21e121868e60b3f6c
+size 59593896
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:de3673d5f2ec22b93532ca46db8cf779186277f535c911aa4ad2031444aa255d
+size 46018786
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:da6a0fd70e98bd376114cc4dc7de61aa957a4bc53c6e8b0a3a3ea9a99efd791f
+size 45714297
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a548eb372c80f22855263153b568e83628eecf9e09a008fd41615f02d4c5daa0
+size 45045643
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:da3baf48377114c6b81ed0c98a21a25a261b411c2cd02dbb6fefefc777fd96dc
+size 6301379
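Each ONNX file in this change is stored as a Git LFS pointer with three `key value` lines: `version`, `oid`, and `size`. Below is a small sketch of reading one. `parseLfsPointer` is a hypothetical helper, and the sample is the last pointer shown above.

```javascript
// Parse a Git LFS pointer file into its fields, validating the basics.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue; // tolerate a trailing newline
    const space = line.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${line}`);
    fields[line.slice(0, space)] = line.slice(space + 1);
  }
  const [algorithm, digest] = (fields.oid ?? '').split(':');
  const valid =
    Boolean(fields.version) &&
    algorithm === 'sha256' &&
    /^[0-9a-f]{64}$/.test(digest ?? '') &&
    /^\d+$/.test(fields.size ?? '');
  if (!valid) throw new Error('Not a complete Git LFS pointer');
  return { version: fields.version, algorithm, digest, sizeBytes: Number(fields.size) };
}

const pointer = parseLfsPointer([
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:da3baf48377114c6b81ed0c98a21a25a261b411c2cd02dbb6fefefc777fd96dc',
  'size 6301379',
].join('\n'));
```

The `size` field is the byte count of the real file, so the pointers above can be compared directly to see how much smaller each quantized model is.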