Does not work /:

#3
by erikpro007 - opened

Using it with the newest version of llama.cpp or with LM Studio, the model fails to load.

OpenBMB org

Our changes haven't been merged into the official llama.cpp yet.
Support is available now at https://github.com/OpenBMB/llama.cpp
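
If you just want the GGUF files locally before building the fork, here is a minimal sketch using huggingface-cli; the repo id and exact filenames are assumptions, so check the Files tab of this repo:

```bash
# Download the int4 model and the vision projector.
# Repo id and filenames are assumptions -- adjust them to what the Files tab actually lists.
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5-gguf \
  ggml-model-Q4_K_M.gguf --local-dir ./MiniCPM-Llama3-V-2_5
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5-gguf \
  mmproj-model-f16.gguf --local-dir ./MiniCPM-Llama3-V-2_5
```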

Not working with LM Studio Q4 version

```json
{
  "cause": "(Exit code: -1073740791). Unknown error. Try a different model and/or config.",
  "suggestion": "",
  "data": {
    "memory": {
      "ram_capacity": "63.74 GB",
      "ram_unused": "48.25 GB"
    },
    "gpu": {
      "type": "Nvidia CUDA",
      "vram_recommended_capacity": "8.00 GB",
      "vram_unused": "6.93 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.23",
      "downloadsDir": "C:\\Users\\username\\.cache\\lm-studio\\models"
    },
    "model": {}
  },
  "title": "Error loading model."
}
```


![image.png](https://cdn-uploads.huggingface.co/production/uploads/66188da9cfc431c5205269c9/Ys-fYa56EPi2dzGlBYWhg.png)
OpenBMB org

> Not working with LM Studio Q4 version

Our code has not been merged into the official llama.cpp yet.
Please use our fork (https://github.com/OpenBMB/llama.cpp) for now to run MiniCPM-V 2.5.

Not working with LM Studio

OpenBMB org

Our code has not been merged into the official llama.cpp yet.
Please use our fork (https://github.com/OpenBMB/llama.cpp) for now to run MiniCPM-V 2.5.
@joedong

Yes, it does work:

git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git llama.cpp-minicpm
cd llama.cpp-minicpm/
export LLAMA_CUDA=1 # if you have an NVIDIA GPU
make minicpmv-cli -j$(nproc)

If you read https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv, you can, for example:

run the f16 version:
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/model-8B-F16.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

run the quantized int4 version:
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

or run in interactive mode:
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i
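
If you only have the f16 model file, a Q4_K_M GGUF can be produced with llama.cpp's standard quantize tool. A minimal sketch, assuming the fork still ships the usual quantize make target (the mmproj projector stays f16):

```bash
# Build the quantize tool from the same checkout.
make quantize -j$(nproc)

# Convert the f16 model to Q4_K_M; the mmproj-model-f16.gguf projector is used as-is.
./quantize ../MiniCPM-Llama3-V-2_5/model/model-8B-F16.gguf \
           ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf Q4_K_M
```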

I tried the Q4_K_M with temp set to 0.1 and it worked perfectly:
~/whatever/llama.cpp-minicpm/minicpmv-cli -m ~/whatever/ggml-model-Q4_K_M.gguf --mmproj ~/whatever/mmproj-model-f16.gguf -c 4096 --temp 0.1 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image /path/to/image.png -p "describe image"
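
For repeated use you can wrap that exact command in a small shell function; the `describe` name and the placeholder paths below are just illustrative:

```bash
# Hypothetical convenience wrapper around minicpmv-cli; adjust the placeholder paths.
describe() {
  ~/whatever/llama.cpp-minicpm/minicpmv-cli \
    -m ~/whatever/ggml-model-Q4_K_M.gguf \
    --mmproj ~/whatever/mmproj-model-f16.gguf \
    -c 4096 --temp 0.1 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 \
    --image "$1" -p "${2:-describe image}"
}

# usage: describe /path/to/image.png "What is in the image?"
```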

Today the latest version of LM Studio, 0.2.24, was released, and I'm still facing the same issue. I was hoping the new llama.cpp commit mentioned in the release notes would have resolved it. LM Studio is a very popular tool, so please see if you can make it work ASAP. Thanks in advance.

```json
{
  "cause": "(Exit code: 0). Some model operation failed. Try a different model and/or config.",
  "suggestion": "",
  "data": {
    "memory": {
      "ram_capacity": "63.74 GB",
      "ram_unused": "44.32 GB"
    },
    "gpu": {
      "gpu_names": [
        "NVIDIA GeForce RTX 3080 Laptop GPU"
      ],
      "vram_recommended_capacity": "8.00 GB",
      "vram_unused": "6.93 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.24",
      "downloadsDir": "C:\\Users\\username\\.cache\\lm-studio\\models"
    },
    "model": {}
  },
  "title": "Error loading model."
}
```

The problem is that OpenBMB forked both llama.cpp and ollama, and the model may work in those forks; however, the mainline versions of those programs have no support as of now. Maybe the authors of the forks should open pull requests against llama.cpp and ollama instead of maintaining their own forks, to make this more widely available.

Did you run bf16.gguf? I got an error!

> Yes, it does work: (see the build and run commands above)

No, only the mmproj-model-f16.gguf
