Walter-Klaus committed 499e994 (parent: 94203ae): Update README.md

Files changed (1): README.md (+2, -0)
README.md CHANGED
@@ -13,6 +13,8 @@ tags:
 ---
 
 # Walter-Klaus/Llama-3.1-Minitron-4B-Chat-Q4_K_M-GGUF
+Please note: as of now (Sept 4, 2024), the modified Minitron model architecture is not yet supported by llama.cpp or by any tool based on it, e.g. GPT4All, KoboldCpp, LM Studio, or llama-cli.
+
 This model was converted to GGUF format from [`rasyosef/Llama-3.1-Minitron-4B-Chat`](https://huggingface.co/rasyosef/Llama-3.1-Minitron-4B-Chat) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/rasyosef/Llama-3.1-Minitron-4B-Chat) for more details on the model.