adamo1139 committed
Commit 9536b40
1 Parent(s): ea1a3cb

Update README.md

Files changed (1): README.md +19 -0
README.md CHANGED
@@ -23,6 +23,25 @@ I had to lower max_position_embeddings in config.json and model_max_length for
  My first attempt had max_position_embeddings set to 16384 and model_max_length set to 200000. This allowed fine-tuning to finish, but that model was broken after applying the LoRA and merging it. \
  This attempt had both max_position_embeddings and model_max_length set to 4096, which worked perfectly fine.
 
+ ## Quants!
+
+ Huge thank you to LoneStriker and TheBloke for providing quantized versions.
+
+ EXL2 \
+ 3bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-3.0bpw-h6-exl2 \
+ 4bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-4.0bpw-h6-exl2 \
+ 4.65bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-4.65bpw-h6-exl2 \
+ 5bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-5.0bpw-h6-exl2 \
+ 6bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-6.0bpw-h6-exl2 \
+ 8bpw - https://huggingface.co/LoneStriker/Yi-34B-200K-AEZAKMI-v2-8.0bpw-h8-exl2
+
+ GGUF - https://huggingface.co/TheBloke/Yi-34B-200K-AEZAKMI-v2-GGUF
+
+ GPTQ - https://huggingface.co/TheBloke/Yi-34B-200K-AEZAKMI-v2-GPTQ
+
+ AWQ - https://huggingface.co/TheBloke/Yi-34B-200K-AEZAKMI-v2-AWQ
+
+
  ## Prompt Format
 
  I recommend using ChatML format, as this was the format used during fine-tuning. \