HiroseKoichi committed
Commit 19a2c6c
Parent: 86ab434

Update README.md

Files changed (1):
1. README.md (+1 -1)
README.md CHANGED
@@ -15,7 +15,7 @@ Combining the two models has kept both of their strengths but none of their weak
 
 While role-play was the main focus of this merge, its base capabilities weren't affected at all, so swapping models for other tasks isn't needed unless you require a bigger model. Actually, with the addition of Tess-2.0-Llama-3-8B, I did find a small overall improvement. There isn't any particular reason Llama3-OpenBioLLM-8B is in the merge; I needed a 4th model, and it seemed like a decent fine-tune. Upon testing Llama3-OpenBioLLM-8B after the fact, I've come to the conclusion that it's actually quite bad, and if I do make a V2, it will be removed.
 
-Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model so far. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
+Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
 
 # Quantization Formats
 - **GGUF**: https://huggingface.co/HiroseKoichi/Llama-Salad-4x8B-GGUF
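
For reference, below is a minimal sketch of pulling one of the GGUF quants from the repo linked above and running it with llama-cpp-python. The quant filename is an assumption for illustration; check the GGUF repo's file list for the names that are actually published.

```python
# Minimal sketch: download a quant from the GGUF repo and run one chat turn.
# NOTE: the filename below is hypothetical -- check the repo's file list
# for the quants that are actually available.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="HiroseKoichi/Llama-Salad-4x8B-GGUF",
    filename="Llama-Salad-4x8B.Q4_K_M.gguf",  # hypothetical quant name
)

# Llama-3-based models support an 8k context window.
llm = Llama(model_path=model_path, n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```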