HiroseKoichi
/

Llama-Salad-4x8B

Text Generation

nsfw

Not-For-All-Audiences

text-generation-inference

Mixture of Experts

Inference Endpoints

Model card Files Files and versions Community

HiroseKoichi commited on May 23

Commit

d5f0e65

•

1 Parent(s): b51d891

Update README.md

Files changed (1) hide show

README.md +3 -0

README.md CHANGED Viewed

@@ -17,6 +17,9 @@ While role-play was the main focus of this merge, its base capabilities weren't
 Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model so far. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
 # Details
 - **License**: [llama3](https://llama.meta.com/llama3/license/)
 - **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)

 Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model so far. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
+# Quantization Formats
+- **GGUF**: https://huggingface.co/HiroseKoichi/Llama-Salad-4x8B-GGUF
 # Details
 - **License**: [llama3](https://llama.meta.com/llama3/license/)
 - **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)