Add ONNX and ORT models with quantization
- .gitattributes +4 -0
- README.md +85 -0
- README_ja.md +85 -0
- onnx_models/model.onnx +3 -0
- onnx_models/model_fp16.onnx +3 -0
- onnx_models/model_int8.onnx +3 -0
- onnx_models/model_opt.onnx +3 -0
- onnx_models/model_uint8.onnx +3 -0
- ort_models/model.ort +3 -0
- ort_models/model_fp16.ort +3 -0
- ort_models/model_int8.ort +3 -0
- ort_models/model_uint8.ort +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
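The four added rules share one shape: the tracked path followed by the LFS attributes and `-text`. As a minimal sketch of that pattern (the helper name `lfs_attribute_line` is hypothetical and not part of Git or of this commit, and it assumes paths without spaces), the lines could be generated like this:

```python
# Attributes that route a path through Git LFS and mark it as non-text.
LFS_ATTRIBUTES = "filter=lfs diff=lfs merge=lfs -text"


def lfs_attribute_line(path: str) -> str:
    """Return the .gitattributes rule that tracks `path` with Git LFS.

    Hypothetical helper for illustration; assumes `path` contains no spaces.
    """
    return f"{path} {LFS_ATTRIBUTES}"


# The four ORT files added to .gitattributes in this commit.
ort_files = [
    "ort_models/model.ort",
    "ort_models/model_fp16.ort",
    "ort_models/model_int8.ort",
    "ort_models/model_uint8.ort",
]

for path in ort_files:
    print(lfs_attribute_line(path))
```

In practice `git lfs track <path>` writes the same kind of rule, so a helper like this is only useful for scripted repository setup.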
README.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: mit
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models of [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5), with quantized variants
+
+[Click here for the Japanese README](README_ja.md)
+
+This repository contains the ONNX and ORT formats of the model [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5), along with quantized versions.
+
+## License
+This model is released under the MIT license. For details, please refer to the original model page: [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5).
+
+## Usage
+To use this model, install ONNX Runtime and perform inference as shown below.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('answerdotai/JaColBERTv2.5')
+
+# Prepare inputs
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the model paths
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models:
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please adhere to the license and usage conditions of the original model [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5).
+
+## Contribution
+If you find any issues or have improvements, please create an issue or submit a pull request.
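The README example passes every tokenizer output straight to `session.run`. ONNX Runtime rejects feed names that the graph does not declare, so if an exported model omits an input such as `token_type_ids` (an assumption; this commit does not list the graph inputs), the call would fail. A minimal sketch of a guard, using a hypothetical helper `select_feed` and stand-in values instead of real tokenizer arrays:

```python
from typing import Iterable, Mapping


def select_feed(inputs: Mapping[str, object], input_names: Iterable[str]) -> dict:
    """Keep only the entries of `inputs` whose names the model declares.

    Hypothetical helper for illustration; raises if a declared input is missing.
    """
    names = list(input_names)
    missing = [name for name in names if name not in inputs]
    if missing:
        raise KeyError(f"tokenizer did not produce: {missing}")
    return {name: inputs[name] for name in names}


# Stand-in values; in the README example these would be the tokenizer's NumPy arrays.
tokenized = {
    "input_ids": [[2, 10, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
# Assumed graph inputs, chosen only to show the filtering.
declared = ["input_ids", "attention_mask"]

feed = select_feed(tokenized, declared)
print(sorted(feed))
```

With a real session, the declared names would come from `[i.name for i in session.get_inputs()]`, and the result would replace `dict(inputs)` in the call to `session.run`.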
README_ja.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: mit
+tags:
+- onnx
+- ort
+---
+
+# ONNX, ORT, and quantized models of [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5)
+
+[Click here for the English README](README.md)
+
+This repository contains the original model [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5) converted to ONNX and ORT formats, plus quantized versions.
+
+## License
+The license for this model is "mit". For details, see the original model page ([answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5)).
+
+## Usage
+To use this model, install ONNX Runtime and run inference as follows.
+```python
+# Sample code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('answerdotai/JaColBERTv2.5')
+
+# Prepare the input
+text = 'Replace this with your input text.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the paths of the models to use
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT-format model
+]
+
+# Run inference for each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model's file extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT-format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Show the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Show the results (add further processing as needed)
+    print(outputs)
+```
+
+## Model contents
+This repository includes the following models.
+
+### ONNX models
+- `onnx_models/model.onnx`: Original ONNX model converted from [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT models
+- `ort_models/model.ort`: ORT model built from the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model built from the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model built from the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model built from the UINT8 quantized model
+
+## Notes
+Please comply with the license and terms of use of the original model [answerdotai/JaColBERTv2.5](https://huggingface.co/answerdotai/JaColBERTv2.5).
+
+## Contributing
+If you find problems or have improvements, please open an issue or send a pull request.
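The README example stops at printing raw outputs and leaves further processing to the reader. JaColBERTv2.5 is a ColBERT-style retriever, so that processing would typically be late-interaction (MaxSim) scoring between a query and a document. The sketch below assumes the first model output holds per-token embeddings of shape `(1, seq_len, dim)`, which this commit does not confirm. The function name `maxsim_score` and the toy vectors are illustrative only.

```python
import numpy as np


def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late interaction score.

    For each query token, take the cosine similarity of its best-matching
    document token, then sum those maxima over all query tokens.
    Inputs are 2-D arrays of shape (num_tokens, dim).
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=-1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=-1, keepdims=True)
    sim = q @ d.T  # shape: (query_len, doc_len)
    return float(sim.max(axis=1).sum())


# Toy 2-dimensional embeddings standing in for real model outputs.
query = np.array([[1.0, 0.0], [0.0, 2.0]])
doc = np.array([[3.0, 0.0], [1.0, 1.0], [0.0, -1.0]])

print(maxsim_score(query, doc))
```

Under the stated assumption, real inputs would be `outputs[0][0]` for the query and for the document, which drops the batch axis. This sketch does not mask padding or special tokens, which a production scorer would normally handle.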
onnx_models/model.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:768d40d897995a96fead4e1b940aac9e143930586cce41c65021a21b528dd693
+size 445045956
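Each model file in this commit is stored as a Git LFS pointer: a version line, a `sha256` object ID, and a size in bytes. A downloaded file can be checked against its pointer, as in this minimal sketch. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check reads the whole payload into memory, which is only suitable for illustration.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that `data` has the size and SHA-256 digest recorded in the pointer."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer for onnx_models/model.onnx, copied from this commit.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:768d40d897995a96fead4e1b940aac9e143930586cce41c65021a21b528dd693
size 445045956
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

For files of several hundred megabytes, hashing in chunks with `hashlib.sha256().update(...)` would avoid loading the full file at once.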
onnx_models/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7065b4478cc5f7a984d7aa0a38ddc6f61d3f1cb78024ad03d2cb4ab4b88f3e80
+size 222667519
onnx_models/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d2eb74e075dd6f8b7da587515680312fbacdf7cb3511d098101f7581f5b5e3d
+size 111918790
onnx_models/model_opt.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:06a615ecc763066615f4063fb03bd34f196d2cfe77422eb5c20c4f40e269cb11
+size 445020073
onnx_models/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4bfd183604101b977b1fb02da0e04c7ef19ec476a570f4d00185ebe20bc5a87a
+size 111918824
ort_models/model.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f97ee1ae391ece6a3d11437691145ba5193513a3f43405fbfef7255985f1f179
+size 445214680
ort_models/model_fp16.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89acdd8c8e6b1e4283690fb4086a60404bfa012a9e4f4effd90d90aca70f1bde
+size 223216064
ort_models/model_int8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:930b4b3d66be6926479cc35a561f7961e728bcb24f1cb24674e27f9ee1aebc39
+size 112083680
ort_models/model_uint8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e0f69f2e36dd754e9f1e7ab502877ef419c4b996104f1f27736074ef716a81a
+size 112083680
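The pointer sizes show how much each variant shrinks the file: the FP16 model is roughly half the size of the original export, and the 8-bit models are roughly a quarter. A small sketch of that arithmetic, using only the ONNX sizes listed in this commit:

```python
# Sizes in bytes, copied from the Git LFS pointers in this commit.
onnx_sizes = {
    "model.onnx": 445045956,
    "model_opt.onnx": 445020073,
    "model_fp16.onnx": 222667519,
    "model_int8.onnx": 111918790,
    "model_uint8.onnx": 111918824,
}

# Express every file as a fraction of the unoptimized export.
baseline = onnx_sizes["model.onnx"]
ratios = {name: size / baseline for name, size in onnx_sizes.items()}

for name, ratio in ratios.items():
    size_mb = onnx_sizes[name] / 1e6
    print(f"{name}: {size_mb:.1f} MB ({ratio:.1%} of model.onnx)")
```

The ratios describe file size only. They say nothing about accuracy or latency, which would need to be measured separately for each variant.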