Commit 594b7e0 by shuttie
1 parent: 77b6ac6

add quantize_config

Files changed (1)
  1. quantize_config.json +30 -0
quantize_config.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "per_channel": true,
+   "reduce_range": true,
+   "per_model_config": {
+     "model": {
+       "op_types": [
+         "Reshape",
+         "Unsqueeze",
+         "MatMul",
+         "Sqrt",
+         "Transpose",
+         "Erf",
+         "Concat",
+         "Div",
+         "Gather",
+         "ReduceMean",
+         "Pow",
+         "Softmax",
+         "Sub",
+         "Add",
+         "Constant",
+         "Mul",
+         "Slice",
+         "Cast",
+         "Shape"
+       ],
+       "weight_type": "QInt8"
+     }
+   }
+ }
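
The commit does not say how this file is consumed. As a rough sketch only, the fields above could be mapped onto the keyword arguments of ONNX Runtime's `quantize_dynamic`. The mapping, the helper names `build_quantize_kwargs` and `quantize`, and the model paths are assumptions for illustration, not something this repository confirms.

```python
import json

# The configuration added in this commit (op_types wrapped for brevity).
CONFIG_TEXT = """
{
  "per_channel": true,
  "reduce_range": true,
  "per_model_config": {
    "model": {
      "op_types": [
        "Reshape", "Unsqueeze", "MatMul", "Sqrt", "Transpose", "Erf",
        "Concat", "Div", "Gather", "ReduceMean", "Pow", "Softmax", "Sub",
        "Add", "Constant", "Mul", "Slice", "Cast", "Shape"
      ],
      "weight_type": "QInt8"
    }
  }
}
"""


def build_quantize_kwargs(config: dict, model_key: str = "model") -> dict:
    """Hypothetical helper: translate the JSON fields into keyword
    arguments for onnxruntime.quantization.quantize_dynamic.
    The field-to-argument mapping is an assumption."""
    per_model = config["per_model_config"][model_key]
    return {
        "per_channel": config["per_channel"],
        "reduce_range": config["reduce_range"],
        "op_types_to_quantize": per_model["op_types"],
        # Kept as a string here; resolved to a QuantType member in quantize().
        "weight_type": per_model["weight_type"],
    }


def quantize(model_in: str, model_out: str, config: dict) -> None:
    """Hypothetical wrapper: run dynamic quantization with the settings above."""
    # Imported lazily so build_quantize_kwargs works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    kwargs = build_quantize_kwargs(config)
    kwargs["weight_type"] = getattr(QuantType, kwargs["weight_type"])
    quantize_dynamic(model_in, model_out, **kwargs)


config = json.loads(CONFIG_TEXT)
kwargs = build_quantize_kwargs(config)
```

Calling `quantize("model.onnx", "model_quantized.onnx", config)` would then apply these settings; both file names are placeholders.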