A frankenMoE that not only uses far better methodology and a deeper fundamental understanding of SMoE, but is completely focused on intellectual roleplay. It may have a bit of a redundancy issue; to combat this, I've implemented models with extremely high variety scores on Ayumi's benchmark. If you still run into it, try to keep things fresh with the model by introducing new concepts often, or through [drμgs](https://github.com/EGjoni/DRUGS). (no, not that kind)

The models that were implemented are as follows:

- [mlabonne/Beagle14-7B](https://huggingface.co/mlabonne/Beagle14-7B) - base
- [fblgit/una-cybertron-7b-v3-OMA](https://huggingface.co/fblgit/una-cybertron-7b-v3-OMA) - expert #1
- [rwitz/go-bruins-v2](https://huggingface.co/rwitz/go-bruins-v2) - expert #2
- [mlabonne/Beagle14-7B](https://huggingface.co/mlabonne/Beagle14-7B) - expert #3
- [mlabonne/Beagle14-7B](https://huggingface.co/mlabonne/Beagle14-7B) - expert #4

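FrankenMoEs like this are typically assembled with mergekit's MoE script, which copies the listed models in as experts and initializes a router from prompt hidden states. A hypothetical config sketch only, assuming the standard `mergekit-moe` schema; the `positive_prompts` values are illustrative stand-ins, not the actual gating prompts used for this model:

```yaml
# Hypothetical mergekit-moe config for a 4-expert frankenMoE.
base_model: mlabonne/Beagle14-7B
gate_mode: hidden          # route using hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: fblgit/una-cybertron-7b-v3-OMA
    positive_prompts: ["reason step by step"]        # illustrative
  - source_model: rwitz/go-bruins-v2
    positive_prompts: ["continue the conversation"]  # illustrative
  - source_model: mlabonne/Beagle14-7B
    positive_prompts: ["write a story"]              # illustrative
  - source_model: mlabonne/Beagle14-7B
    positive_prompts: ["roleplay as a character"]    # illustrative
```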
## Provided files

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ----- |
| [Q2_K Tiny](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q2_k.gguf) | Q2_K | 2 | 7.87 GB | 9.87 GB | smallest, significant quality loss - not recommended for most purposes |
| [Q3_K_M](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q3_k_m.gguf) | Q3_K_M | 3 | 10.28 GB | 12.28 GB | very small, high quality loss |
| [Q4_0](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q4_0.gguf) | Q4_0 | 4 | 13.3 GB | 15.3 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| [Q4_K_M](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q4_k_m.gguf) | Q4_K_M | 4 | 13.32 GB | 15.32 GB | medium, balanced quality - recommended |
| [Q5_0](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q5_0.gguf) | Q5_0 | 5 | 16.24 GB | 18.24 GB | legacy; large, balanced quality |
| [Q5_K_M](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q5_k_m.gguf) | Q5_K_M | 5 | ~16.24 GB | ~18.24 GB | large, balanced quality - recommended |
| [Q6 XL](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q6_k.gguf) | Q6_K | 6 | 19.35 GB | 21.35 GB | very large, extremely minor degradation |
| [Q8 XXL](https://huggingface.co/Kquant03/FrankenDPO-4x7B-GGUF/blob/main/ggml-model-q8_0.gguf) | Q8_0 | 8 | 25.1 GB | 27.1 GB | very large, extremely minor degradation - not recommended |
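Every "Max RAM required" figure in the table is the file size plus roughly 2 GB of headroom for context and buffers, the usual convention on these GGUF cards. A quick sanity check of that rule of thumb (the 2 GB constant is this table's convention, not a llama.cpp guarantee):

```python
def max_ram_gb(file_size_gb: float, overhead_gb: float = 2.0) -> float:
    """Estimate RAM needed to load a GGUF fully into memory:
    file size plus a fixed overhead allowance, per this table's convention."""
    return round(file_size_gb + overhead_gb, 2)

# Check against the Q2_K and Q6_K rows above.
print(max_ram_gb(7.87))   # Q2_K  -> 9.87
print(max_ram_gb(19.35))  # Q6_K  -> 21.35
```

Actual headroom scales with your chosen context length, so treat these as minimums.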
# "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
### (from the MistralAI papers...click the quoted question above to navigate to it directly.)
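The core mechanism the linked post describes is sparse top-k gating: a learned router scores every expert for each token, and only the top two experts actually run, with their outputs blended by softmax-normalized router scores. A toy stdlib-only sketch of that routing step (the logits here are stand-ins; in a real SMoE they come from a trained linear layer):

```python
import math

def top2_route(router_logits: list[float]) -> list[tuple[int, float]]:
    """Mixtral-style top-2 gating: keep the two highest-scoring experts
    and softmax-normalize their logits into routing weights."""
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    exps = [math.exp(router_logits[i]) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

# Router logits for 4 experts: experts 2 and 0 win, weights sum to 1.
print(top2_route([1.2, -0.3, 2.0, 0.1]))
```

Only the selected experts' feed-forward blocks execute, which is why a 4-expert model runs much faster than its total parameter count suggests.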