bge-m3

This version of the bge-m3 model has been converted to run on the Axera NPU using w8a16 quantization.

This model has been optimized with the following LoRA:

Compatible with Pulsar2 version: 6.0-dirty

Convert tools links:

For those who are interested in model conversion, you can try to export axmodel through

Supported Platforms

| Chip  | Model           | Cost     | CMM size |
|-------|-----------------|----------|----------|
| AX650 | bge-m3_u16_npu3 | 188.7 ms | 847 MiB  |

How to use

Download all files from this repository to the device


(py312) root@ax650:~/beg-m3# tree
.
|-- README.md
|-- model
|   |-- bge-m3_full_b1_l512.onnx
|   |-- bge-m3_full_b1_l512.onnx.data
|   `-- bge-m3_u16_npu3.axmodel
|-- python
|   |-- axmodel_infer.py
|   `-- onnx_infer.py
|-- quant
|   |-- bge-m3.json
|   `-- calib_tokens_m3.tar.gz
`-- requirements.txt
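
Before running inference, it can help to confirm that the download is complete. The following is a minimal sketch that checks for the files shown in the listing above; the helper name `missing_files` and the `EXPECTED_FILES` list are illustrative and not part of this repository, and `README.md` is left out because it is not needed at run time.

```python
from pathlib import Path

# Relative paths copied from the `tree` listing above (README.md omitted).
EXPECTED_FILES = [
    "model/bge-m3_full_b1_l512.onnx",
    "model/bge-m3_full_b1_l512.onnx.data",
    "model/bge-m3_u16_npu3.axmodel",
    "python/axmodel_infer.py",
    "python/onnx_infer.py",
    "quant/bge-m3.json",
    "quant/calib_tokens_m3.tar.gz",
    "requirements.txt",
]


def missing_files(root):
    """Return the expected relative paths that are not regular files under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    # Check the current directory, assumed to be the repository root.
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are present.")
```

Run it from the directory that contains `model`, `python`, and `quant`.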

Inference

Inference on an AX650 host, such as the M4N-Dock (AXera-Pi Pro, 爱芯派Pro)

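Run with python3 axmodel_infer.py. The sample log below invokes the script as axmodel_infer_bgem3.py; use the file name present in your checkout.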

root@ax650:~/bge# python3 axmodel_infer_bgem3.py
[INFO] Available providers:  ['AxEngineExecutionProvider', 'AXCLRTExecutionProvider']
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.12.0s
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 6.0-dirty 71f24c74-dirty
[[0.60657114 0.3339174 ]
 [0.34422207 0.66171443]]
0.76533616
0.4494059
{'colbert': [0.7653361558914185, 0.4494059085845947, 0.4402206242084503, 0.7766788601875305], 'sparse': [0.18121449201226758, 0.007629240866828535, 0.0, 0.17698916647350543], 'dense': [0.6065711975097656, 0.33391737937927246, 0.3442220687866211, 0.661714494228363], 'sparse+dense': [0.4647856290105996, 0.22515466654179112, 0.22948137919108072, 0.5001393849767437], 'colbert+sparse+dense': [0.5850058397629272, 0.3148551633589126, 0.3137770771980286, 0.6107551750610585]}
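
To read the output: the 2x2 matrix is the dense similarity for each pair, and the same four values appear flattened under `'dense'`. The two scalars that follow match the first two `'colbert'` entries. The combined entries in the log are consistent with a weighted average of the per-mode scores using weights 0.4 (dense), 0.2 (sparse), and 0.4 (colbert). These weights are inferred from the printed numbers, not read from the script, so treat the sketch below as an assumption-laden check and not as the script's actual implementation.

```python
# Per-pair scores copied from the log above.
dense = [0.6065711975097656, 0.33391737937927246, 0.3442220687866211, 0.661714494228363]
sparse = [0.18121449201226758, 0.007629240866828535, 0.0, 0.17698916647350543]
colbert = [0.7653361558914185, 0.4494059085845947, 0.4402206242084503, 0.7766788601875305]

# Assumed weights, inferred from the log (not taken from the inference script).
W_DENSE, W_SPARSE, W_COLBERT = 0.4, 0.2, 0.4


def weighted_mean(scores, weights):
    """Weighted average of the per-mode scores for one sentence pair."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# 'sparse+dense': average over the dense and sparse modes only.
sparse_dense = [
    weighted_mean((d, s), (W_DENSE, W_SPARSE)) for d, s in zip(dense, sparse)
]

# 'colbert+sparse+dense': average over all three modes.
all_modes = [
    weighted_mean((d, s, c), (W_DENSE, W_SPARSE, W_COLBERT))
    for d, s, c in zip(dense, sparse, colbert)
]

print("sparse+dense:", sparse_dense)
print("colbert+sparse+dense:", all_modes)
```

Under these assumed weights, the computed values agree with the `'sparse+dense'` and `'colbert+sparse+dense'` lists in the log to about six decimal places.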
