brucethemoose committed
Commit 7ea62e3
1 parent: f74565f

Update README.md

Files changed (1)
  1. README.md +1 -3
README.md CHANGED
@@ -30,9 +30,7 @@ Being a Yi model, run a lower temperature with 0.05 or higher MinP, a little rep
 
 24GB GPUs can efficiently run Yi-34B-200K models at **40K-90K context** with exllamav2, and performant UIs like [exui](https://github.com/turboderp/exui). I go into more detail in this [post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/). 16GB GPUs can still run the high context with aggressive quantization.
 
-I recommend exl2 quantizations profiled on data similar to the desired task. It is especially sensitive to the quantization data at low bpw. I've uploaded my own fiction-oriented quantizations here: https://huggingface.co/collections/brucethemoose/most-recent-merge-65742644ca03b6c514afa204
-
-Lonestriker has also uploaded more general purpose quantizations here: https://huggingface.co/models?sort=trending&search=LoneStriker+Yi-34B-200K-DARE-megamerge-v8
+Lonestriker has also uploaded general purpose quantizations here: https://huggingface.co/models?sort=trending&search=LoneStriker+Yi-34B-200K-DARE-megamerge-v8
 
 Additionally, TheBloke has uploaded experimental GGUFs using llama.cpp's new imatrix quantization feature, profiled on VMware open-instruct: https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF
 
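
For readers of the diff: the README paragraph above describes running an exl2 quant of this model at 40K-90K context on a 24GB GPU with exllamav2. A minimal sketch of that setup follows, assuming exllamav2's Python API from around the time of this commit; the model directory is a placeholder, the 65536-token context is one point in the README's 40K-90K range, and the sampler values follow the README's "lower temperature, 0.05 or higher MinP, a little repetition penalty" advice rather than anything specified in this commit.

```python
# Sketch: load a Yi-34B-200K exl2 quant at long context with exllamav2.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Config,
    ExLlamaV2Cache_8bit,  # 8-bit KV cache stretches how much context fits in VRAM
    ExLlamaV2Tokenizer,
)
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Yi-34B-200K-DARE-megamerge-v8-exl2"  # placeholder path
config.prepare()                 # read config.json from the model dir
config.max_seq_len = 65536       # one point in the README's 40K-90K range

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)
model.load_autosplit(cache)      # fill available VRAM as the weights load

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8                 # "run a lower temperature"
settings.min_p = 0.05                      # "0.05 or higher MinP"
settings.token_repetition_penalty = 1.05   # "a little repetition penalty"

print(generator.generate_simple("Once upon a time", settings, num_tokens=200))
```

The linked Reddit post covers the same idea in more detail (including the cache-quantization trade-off that makes 75K context fit on 24GB); this sketch is only meant to make the README paragraph concrete, not to reproduce that setup exactly.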