bge-m3
This version of the bge-m3 model has been converted to run on the Axera NPU using w8a16 quantization (8-bit weights, 16-bit activations).
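As a rough illustration of what "w8a16" means, the sketch below quantizes a weight matrix to 8-bit integers and an activation vector to 16-bit integers, runs the matrix product in integer arithmetic, and rescales the result. It is a generic symmetric, per-tensor scheme chosen for simplicity; the scheme Pulsar2 actually applies (per-channel scales, unsigned ranges, calibration) is not described in this card, and the helper name `quantize_symmetric` is hypothetical.

```python
import numpy as np


def quantize_symmetric(x, num_bits):
    """Quantize x to signed integers using one scale for the whole tensor.

    Hypothetical helper for illustration only; not part of any Axera tool.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits, 32767 for 16 bits
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale


rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8)).astype(np.float32)      # stand-in layer weights
activations = rng.normal(size=(8,)).astype(np.float32)    # stand-in layer input

w_q, w_scale = quantize_symmetric(weights, 8)        # "w8"
a_q, a_scale = quantize_symmetric(activations, 16)   # "a16"

# Integer matrix product, then a single rescale back to floating point.
y_quant = (w_q @ a_q).astype(np.float64) * (w_scale * a_scale)
y_ref = weights @ activations

max_err = float(np.max(np.abs(y_quant - y_ref)))
print("max abs error vs float32:", max_err)
```

Keeping activations at 16 bits leaves most of the rounding error in the 8-bit weights, which is the usual motivation for a mixed-precision setting like this one.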
This model has been optimized with the following LoRA:
Compatible with Pulsar2 version: 6.0-dirty
Conversion tool links:
If you are interested in model conversion, you can export the axmodel yourself using the following resources:
- bge-m3, the original model repository
- Model Convert, which provides a detailed conversion guide
Supported Platforms
- AX650
- AX650N DEMO Board
- M4N-Dock(爱芯派Pro)
- AI Pyramid
- M.2 Accelerator card
| Chip | Model | Inference time | CMM size |
|---|---|---|---|
| AX650 | bge-m3_u16_npu3 | 188.7 ms | 847 MiB |
How to use
Download all files from this repository to the device. The directory layout should look like this:
(py312) root@ax650:~/beg-m3# tree
.
|-- README.md
|-- model
| |-- bge-m3_full_b1_l512.onnx
| |-- bge-m3_full_b1_l512.onnx.data
| `-- bge-m3_u16_npu3.axmodel
|-- python
| |-- axmodel_infer.py
| `-- onnx_infer.py
|-- quant
| |-- bge-m3.json
| `-- calib_tokens_m3.tar.gz
`-- requirements.txt
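One possible way to fetch the files listed above is the `huggingface_hub` Python package. This is a minimal sketch, assuming `huggingface_hub` is installed and the device (or a host you copy the files from) has network access. The function name `fetch_repo` and the target directory `bge-m3` are illustrative choices, not something this repository prescribes.

```python
def fetch_repo(repo_id="AXERA-TECH/bge-m3", local_dir="bge-m3"):
    """Download every file of the repository into local_dir and return its path.

    Hypothetical helper; the import is deferred so that defining the function
    does not require huggingface_hub to be installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `fetch_repo()` should leave the `model`, `python`, and `quant` directories under `bge-m3/`. Any other method that copies the full repository (for example `git clone` with Git LFS) serves the same purpose.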
Inference
Inference on an AX650 host, such as the M4N-Dock (爱芯派Pro)
Run the script with python3 axmodel_infer.py:
root@ax650:~/bge# python3 axmodel_infer_bgem3.py
[INFO] Available providers: ['AxEngineExecutionProvider', 'AXCLRTExecutionProvider']
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.12.0s
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 6.0-dirty 71f24c74-dirty
[[0.60657114 0.3339174 ]
[0.34422207 0.66171443]]
0.76533616
0.4494059
{'colbert': [0.7653361558914185, 0.4494059085845947, 0.4402206242084503, 0.7766788601875305], 'sparse': [0.18121449201226758, 0.007629240866828535, 0.0, 0.17698916647350543], 'dense': [0.6065711975097656, 0.33391737937927246, 0.3442220687866211, 0.661714494228363], 'sparse+dense': [0.4647856290105996, 0.22515466654179112, 0.22948137919108072, 0.5001393849767437], 'colbert+sparse+dense': [0.5850058397629272, 0.3148551633589126, 0.3137770771980286, 0.6107551750610585]}
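The combined entries in the dictionary above can be reproduced from the three individual score lists with a weighted average. The sketch below does that using weights of 0.4 (dense), 0.2 (sparse), and 0.4 (colbert). These weights are an assumption inferred from the printed numbers, since the inference script itself is not shown here; the helper `fuse` is likewise hypothetical.

```python
# Individual scores copied from the log output above (four sentence pairs).
scores = {
    "colbert": [0.7653361558914185, 0.4494059085845947,
                0.4402206242084503, 0.7766788601875305],
    "sparse": [0.18121449201226758, 0.007629240866828535,
               0.0, 0.17698916647350543],
    "dense": [0.6065711975097656, 0.33391737937927246,
              0.3442220687866211, 0.661714494228363],
}

# Assumed weights: inferred from the logged numbers, not read from the script.
weights = {"dense": 0.4, "sparse": 0.2, "colbert": 0.4}


def fuse(scores, weights, modes):
    """Weighted average of the selected score lists, normalized by total weight."""
    total = sum(weights[m] for m in modes)
    n = len(scores[modes[0]])
    return [sum(weights[m] * scores[m][i] for m in modes) / total
            for i in range(n)]


sparse_dense = fuse(scores, weights, ["sparse", "dense"])
all_modes = fuse(scores, weights, ["colbert", "sparse", "dense"])

print("sparse+dense:", sparse_dense)
print("colbert+sparse+dense:", all_modes)
```

Under these assumed weights the results agree with the logged `sparse+dense` and `colbert+sparse+dense` values to within floating-point rounding. The `dense` list is also the row-by-row flattening of the 2x2 similarity matrix printed just before the dictionary.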