Hello, how do I use this model?
#2 by Luoaho - opened
How do I start the bge-m3-q8_0.gguf model to get text embeddings?
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md
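As a quick sketch of what that README describes: start the server with `--embedding` enabled, then POST to the OpenAI-compatible `/v1/embeddings` endpoint. The model path, host, and port below are placeholders, not values from this thread:

```shell
# Start llama.cpp's server with embedding support enabled
# (adjust the model path and port to your setup)
./server -m bge-m3-q8_0.gguf --embedding -c 8192 --host 0.0.0.0 --port 8000

# Request an embedding for a piece of text
curl http://127.0.0.1:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "hello world", "encoding_format": "float"}'
```

The response is a JSON object whose `data[0].embedding` field holds the vector.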
Hello,

I start the model with:

```shell
server.exe -m D:/HuggingFace/vonjack-bge-m3-gguf/bge-m3-f16.gguf --embedding -c 8192 --host 0.0.0.0 --port 8000
```

Test code:
```python
import requests
import numpy as np
from sentence_transformers import util as st_util

def get_embedding(text):
    url = "http://127.0.0.1:8000/v1/embeddings"
    payload = {
        "input": text,
        "model": "GPT-4",  # placeholder; the server uses the model it was launched with
        "encoding_format": "float"
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer no-key"
    }
    response = requests.post(url, json=payload, headers=headers)
    return response.json()

if __name__ == "__main__":
    a = get_embedding("中国")
    a_embedding = a["data"][0]["embedding"]
    print(len(a_embedding))
    b = get_embedding("中华人民共和国")
    b_embedding = b["data"][0]["embedding"]
    temp = st_util.cos_sim(np.array(a_embedding), np.array(b_embedding))
    print(temp)
    print(np.array(a_embedding) @ np.array(b_embedding))
```
The output is:

```
1024
tensor([[0.5069]], dtype=torch.float64)
0.5069107368813353
```
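As a side note on the last two printed values: for L2-normalized embeddings, cosine similarity reduces to a plain dot product, which is why `st_util.cos_sim` and `@` agree above. A minimal NumPy-only sketch (the two toy vectors below stand in for real 1024-dim embeddings and are not model output):

```python
import numpy as np

def cos_sim(a, b):
    # cosine similarity: dot product divided by the product of the norms
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy unit-norm vectors standing in for two embeddings
a = np.array([0.6, 0.8])
b = np.array([0.8, 0.6])
print(cos_sim(a, b))  # ≈ 0.96

# for normalized vectors the dot product gives the same number
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
print(float(a_n @ b_n))  # ≈ 0.96
```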
The similarity I get is 0.5069, which is very different from the 0.999 you got. Could you please take a look? Many thanks!!
The llama.cpp build I'm using is: llamacpp-b2430-bin-win-avx2-x64