Kooten/Aurora-Nights-70B-v1.0-IQ2-GGUF


Kooten committed on Jan 19

Commit 93fc079 • 1 Parent(s): c2a6351

Update README.md

Files changed (1)

README.md +6 -2

@@ -11,9 +11,13 @@ IQ2-GGUF quants of [sophosympatheia/Aurora-Nights-70B-v1.0](https://huggingface.
 Unlike regular GGUF quants this uses important matrix similar to Quip# to keep the quant from degrading too much even at 2bpw allowing you to run larger models on less powerful machines.
-***NOTE:*** As of uploading these this llamacpp can run these quants but i am unsure what guis like oobabooga / koboldcpp can run them.
-[More info](https://github.com/ggerganov/llama.cpp/pull/4897)
+***NOTE:*** Currently you will need experimental branches of Koboldcpp or Ooba for this to work.
+- Nexesenex have compiled Windows binaries [HERE](https://github.com/Nexesenex/kobold.cpp/releases/tag/v1.55.1_b1842)
+- [llamacpp_0.2.29 branch](https://github.com/oobabooga/text-generation-webui/tree/llamacpp_0.2.29) of Ooba also works
+[More info about IQ2](https://github.com/ggerganov/llama.cpp/pull/4897)
 # Models