---
license: cc-by-4.0
base_model: BAAI/bge-m3
language: ["vi"]
library_name: sentence-transformers
pipeline_tag: sentence-similarity
inference: false
---

# `BAAI/bge-m3` in GGUF format

original: https://huggingface.co/BAAI/bge-m3

quantization:

```bash
REL=b3827 # llama.cpp release tag; can change to a later release

# download the prebuilt Linux binaries and the matching source tree
wget https://github.com/ggerganov/llama.cpp/releases/download/$REL/llama-$REL-bin-ubuntu-x64.zip --content-disposition --continue &> /dev/null
wget https://github.com/ggerganov/llama.cpp/archive/refs/tags/$REL.zip --content-disposition --continue &> /dev/null
unzip -q llama-$REL-bin-ubuntu-x64.zip
unzip -q llama.cpp-$REL.zip
mv llama.cpp-$REL/* .
rm -r llama.cpp-$REL/ llama-$REL-bin-ubuntu-x64.zip llama.cpp-$REL.zip
pip install -q -r requirements.txt

# fetch the original model and convert it to an f32 GGUF
rm -rf models/tmp/
git clone --depth=1 --single-branch https://huggingface.co/BAAI/bge-m3 models/tmp
python convert_hf_to_gguf.py models/tmp/ --outfile model-f32.gguf --outtype f32

# derive the other precisions from the f32 file
build/bin/llama-quantize model-f32.gguf model-f16.gguf f16 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-bf16.gguf bf16 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q8_0.gguf q8_0 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q6_k.gguf q6_k 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q5_k_m.gguf q5_k_m 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q5_k_s.gguf q5_k_s 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q4_k_m.gguf q4_k_m 2> /dev/null
build/bin/llama-quantize model-f32.gguf model-q4_k_s.gguf q4_k_s 2> /dev/null

# collect the artifacts and upload them to the Hub
rm -rf models/yolo/
mkdir -p models/yolo
mv model-*.gguf models/yolo/
touch models/yolo/README.md
huggingface-cli upload bge-m3-gguf models/yolo .
```
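A quick sanity check on the converted files: every valid GGUF file begins with the 4-byte magic `GGUF`. A minimal sketch of such a check (the filename in the usage note is illustrative):

```python
# every GGUF file starts with these 4 magic bytes
GGUF_MAGIC = b"GGUF"

def is_gguf(path):
    """Return True if the file at `path` begins with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC
```

e.g. `is_gguf("model-q5_k_m.gguf")` should return `True` for each file produced above.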
usage:

```bash
# one-shot embedding of a single prompt ("She laughed and chatted all day long")
build/bin/llama-embedding -m model-q5_k_m.gguf -p "Cô ấy cười nói suốt cả ngày" --embd-output-format array 2> /dev/null
# OR serve embeddings over HTTP
build/bin/llama-server --embedding -c 8192 -m model-q5_k_m.gguf
```
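bge-m3 produces 1024-dimensional dense vectors, and sentence similarity is typically scored as the cosine similarity of two such embeddings. A minimal sketch with toy vectors (real vectors would come from `llama-embedding` or the server above):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy 4-dimensional vectors for illustration; real bge-m3 vectors are 1024-dim
print(cosine_similarity([0.1, 0.3, -0.2, 0.4], [0.1, 0.25, -0.1, 0.5]))
```

Scores close to 1.0 indicate near-identical meaning; unrelated sentences score much lower.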