nullt3r committed on
Commit
b5ca81f
1 Parent(s): 44bd237

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED

@@ -13,7 +13,7 @@ pipeline_tag: text-generation
 ---
 
 # nullt3r/Meta-Llama-3-8B-Instruct-64k-PoSE-Q8_0-GGUF
-**This is a 64k context size model (by Wing Lian) and performs well when used with LM Studio and the standard LLaMA 3 profile. However, I've found there is an issue in ollama, where it generates tokens continuously and never stops.**
+This model uses PoSE to extend Llama's context length from 8k to 64k (https://huggingface.co/winglian/Llama-3-8b-64k-PoSE). It performs well when used with LM Studio and the standard LLaMA 3 profile. However, there is a notable issue with ollama: it generates tokens continuously and never stops.
 
 This model was converted to GGUF format from [`Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE`](https://huggingface.co/Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/Azma-AI/Meta-Llama-3-8B-Instruct-64k-PoSE) for more details on the model.
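For the non-stopping generation in ollama mentioned above, a commonly suggested workaround (a sketch only, not verified against this model) is to declare Llama 3's end-of-turn markers as stop tokens in a custom Modelfile:

```
# Hypothetical Modelfile; the GGUF filename is assumed — adjust to the downloaded file.
FROM ./meta-llama-3-8b-instruct-64k-pose-q8_0.gguf

# Llama 3's end-of-turn markers; without them, generation may run on indefinitely.
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
```

The model can then be registered and run with `ollama create llama3-64k-pose -f Modelfile` followed by `ollama run llama3-64k-pose`.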
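Since the model is distributed as a GGUF quant, it can be tried locally with llama.cpp. A minimal sketch — the GGUF filename below is an assumption inferred from the repo name, so check the repo's file list for the actual name:

```shell
# Download the Q8_0 quant from this repo (filename is assumed; verify it in the repo).
huggingface-cli download nullt3r/Meta-Llama-3-8B-Instruct-64k-PoSE-Q8_0-GGUF \
  meta-llama-3-8b-instruct-64k-pose-q8_0.gguf --local-dir .

# Start an interactive chat with llama.cpp's CLI, requesting the extended 64k context window.
llama-cli -m meta-llama-3-8b-instruct-64k-pose-q8_0.gguf -c 65536 -cnv \
  -p "You are a helpful assistant."
```

Note that `-c 65536` asks for the full 64k context, which at Q8_0 requires substantial RAM for the KV cache; a smaller `-c` value works if memory is tight.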