HiroseKoichi committed
Commit 19a2c6c
Parent: 86ab434

Update README.md

Files changed (1):
1. README.md (+1 -1)
README.md CHANGED
@@ -15,7 +15,7 @@ Combining the two models has kept both of their strengths but none of their weak
 
 While role-play was the main focus of this merge, its base capabilities weren't affected at all, so swapping models for other tasks isn't needed unless you require a bigger model. Actually, with the addition of Tess-2.0-Llama-3-8B, I did find a small overall improvement. There isn't any particular reason Llama3-OpenBioLLM-8B is in the merge; I needed a 4th model, and it seemed like a decent fine-tune. Upon testing Llama3-OpenBioLLM-8B after the fact, I've come to the conclusion that it's actually quite bad, and if I do make a V2, it will be removed.
 
-Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model so far. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
+Unfortunately, I can't compare it with 70B models because they're too slow on my machine, but this is the best sub-70B model I have used so far; I haven't felt the need to regenerate any responses, which hasn't happened with any other model. This is my first attempt at any kind of merge, and I want to share what I've learned, but this section is already longer than I wanted, so I've decided to place the rest at the bottom of the page.
 
 # Quantization Formats
 - **GGUF**: https://huggingface.co/HiroseKoichi/Llama-Salad-4x8B-GGUF
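
For reference, below is a minimal sketch of pulling one of the GGUF quants from the repo linked above and running it with llama-cpp-python. The quant filename is an assumption for illustration; check the GGUF repo's file list for the names that are actually published.

```python
# Minimal sketch: download a quant from the GGUF repo and run one chat turn.
# NOTE: the filename below is hypothetical -- check the repo's file list
# for the quants that are actually available.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="HiroseKoichi/Llama-Salad-4x8B-GGUF",
    filename="Llama-Salad-4x8B.Q4_K_M.gguf",  # hypothetical quant name
)

# Llama-3-based models support an 8k context window.
llm = Llama(model_path=model_path, n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```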