ternary-weight-embedding

A ternary-weight text embedding model, obtained by fine-tuning xiaobu-embedding-v2 [1] on text from the nli-zh [2] and t2ranking [3] datasets. Every Linear layer's weights take values in {1, 0, -1}. Relative to the full-precision model, inference time drops to 0.37x (measured on an A800) and storage to 0.13x.
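The card does not state which ternarization scheme the model uses; a common approach for {-1, 0, 1} weights is absmean quantization (as popularized by BitNet b1.58), shown here as a minimal, hypothetical NumPy sketch:

```python
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, 1} with a per-tensor scale.

    Absmean scheme (an assumption; the card does not specify the
    scheme used by this model):
        scale = mean(|W|),  W_q = clip(round(W / scale), -1, 1)
    """
    scale = float(np.abs(w).mean()) + eps
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
w_q, scale = ternarize(w)
# w is approximated at inference time as scale * w_q;
# the ternary W_q is what enables 2-bit storage and fast matmul kernels.
```

Because each weight needs only ~1.58 bits of information (log2 of 3 states), storing 2 bits per weight plus a few float scales is consistent with the ~0.13x storage figure above.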

To use the model, install BitBLAS; see [4] for the list of supported GPUs.

pip install bitblas

The first run may take some time.

Testing with Sentence-Transformers:

pip install -U sentence-transformers mteb

import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('malenia1/ternary-weight-embedding', trust_remote_code=True)
print(model)
tasks = mteb.get_tasks(tasks=["OnlineShopping"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results")
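Beyond MTEB evaluation, the model can be used like any Sentence-Transformers embedding model for similarity search. The ranking step is plain cosine similarity over the encoded vectors; the sketch below uses placeholder vectors standing in for `model.encode(...)` output (the sentences and vectors are illustrative, not from the model):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# With the real model this would be:
#   emb = model.encode([query, doc_a, doc_b])
# Placeholder embeddings are used here so the snippet runs standalone.
emb = np.array([[1.0, 0.0],   # query
                [0.9, 0.1],   # doc_a (similar to query)
                [0.0, 1.0]])  # doc_b (dissimilar)
scores = cosine_sim(emb[:1], emb[1:])
best = int(np.argmax(scores))  # index of the most similar document
```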

References

  1. https://huggingface.co/lier007/xiaobu-embedding-v2
  2. https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/sampled_data
  3. https://huggingface.co/datasets/sentence-transformers/t2ranking/tree/main/triplet
  4. https://github.com/microsoft/BitBLAS