nullt3r committed on
Commit
b5ca81f
1 Parent(s): 44bd237

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED

@@ -13,7 +13,7 @@ pipeline_tag: text-generation
 ---
 
 # nullt3r/Meta-Llama-3-8B-Instruct-64k-PoSE-Q8_0-GGUF
-**This is a 64k context size model (by Wing Lian) and performs well when used with LM Studio and the standard LLaMA 3 profile. However, I've found there is an issue in ollama, where it generates tokens continuously and never stops.**
+This model uses PoSE to extend Llama's context length from 8k to 64k (https://huggingface.co/winglian/Llama-3-8b-64k-PoSE). It performs well when used with LM Studio and the standard LLaMA 3 profile. However, there is a notable issue with ollama: it generates tokens continuously and never stops.
 
 This model was converted to GGUF format from [`Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE`](https://huggingface.co/Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE) for more details on the model.
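For the non-stopping generation in ollama mentioned above, a commonly suggested workaround (a sketch only, not verified against this model) is to declare Llama 3's end-of-turn markers as stop tokens in a custom Modelfile:

```
# Hypothetical Modelfile; the GGUF filename is assumed — adjust to the downloaded file.
FROM ./meta-llama-3-8b-instruct-64k-pose-q8_0.gguf

# Llama 3's end-of-turn markers; without them, generation may run on indefinitely.
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
```

The model can then be registered and run with `ollama create llama3-64k-pose -f Modelfile` followed by `ollama run llama3-64k-pose`.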
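Since the model is distributed as a GGUF quant, it can be tried locally with llama.cpp. A minimal sketch — the GGUF filename below is an assumption inferred from the repo name, so check the repo's file list for the actual name:

```shell
# Download the Q8_0 quant from this repo (filename is assumed; verify it in the repo).
huggingface-cli download nullt3r/Meta-Llama-3-8B-Instruct-64k-PoSE-Q8_0-GGUF \
  meta-llama-3-8b-instruct-64k-pose-q8_0.gguf --local-dir .

# Start an interactive chat with llama.cpp's CLI, requesting the extended 64k context window.
llama-cli -m meta-llama-3-8b-instruct-64k-pose-q8_0.gguf -c 65536 -cnv \
  -p "You are a helpful assistant."
```

Note that `-c 65536` asks for the full 64k context, which at Q8_0 requires substantial RAM for the KV cache; a smaller `-c` value works if memory is tight.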