Update README.md
README.md
CHANGED
@@ -72,14 +72,16 @@ I'd recommend Mistral v2v3 prompting format:
 
 I'm running the following sampler settings but this is an RC and they may not be optimal.
 
-- **Temperature:**
-- **Min-P:** 0.
+- **Temperature:** 1
+- **Min-P:** 0.1
 - **Rep Pen:** 1.08
 - **Rep Pen Range:** 1536
 - **XTC:** 0.1/0.15
 
 If you get completely incoherent responses, feel free to use these as a starting point.
 
+**High temperature settings (above 1) tend to create less coherent responses**.
+
 # Training Strategy
 
 I started with a finetune of Mistral Small 22B which had been trained on the Gutenberg dataset: [nbeerbower/Mistral-Small-Gutenberg-Doppel-22B](https://huggingface.co/nbeerbower/Mistral-Small-Gutenberg-Doppel-22B).
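
The note about temperatures above 1 can be illustrated with a small sketch of how temperature and Min-P interact. This is not the model's or any backend's actual sampler: the logits are made up, the function names are hypothetical, and it assumes temperature is applied before the Min-P filter (some backends apply it afterwards, in which case the surviving set would not change with temperature).

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.1):
    """Zero out tokens below min_p times the top probability, then renormalize."""
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Hypothetical logits for a five-token vocabulary (illustrative values only).
logits = [4.0, 3.0, 1.0, 0.0, -1.0]

# Assumption: temperature is applied first, then Min-P with the README's 0.1.
for temperature in (1.0, 2.0):
    probs = min_p_filter(softmax_with_temperature(logits, temperature), min_p=0.1)
    survivors = sum(1 for p in probs if p > 0)
    print(f"temperature={temperature}: {survivors} candidate tokens remain")
```

With these made-up logits, two tokens survive the filter at temperature 1 and four at temperature 2. A flatter distribution lets more low-ranked tokens through, which is one plausible reason higher temperatures read as less coherent.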