**NOTE**: See [creative-writing-control-vectors-v2.1](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1) for the current main control-vector repo.

- *08/08/24 - Added `WizardLM-2-8x22B`, `c4ai-command-r-v01` and `gemma-2-27b-it`.*
- *09/08/24 - Added `miqu-1-70b`.*

## Details

The control-vectors in this repo were created as an experiment by increasing the triplets in `system_messages_outlook_extended.json` by 4x (click to expand):

</details>

So now each model's cross-covariance matrix is the result of `120,000` hidden state samples, and thus for the largest models (with `hidden_dim = 12288`) uses close to 10 samples per element.
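
As a sketch of this arithmetic (the shapes below are tiny stand-ins so the example runs quickly, and the exact way the repo computes its matrices may differ):

```python
import numpy as np

# Tiny stand-in shapes; the repo's real numbers are
# n_samples = 120_000 and hidden_dim = 12_288.
n_samples, hidden_dim = 500, 16

rng = np.random.default_rng(0)
pos = rng.standard_normal((n_samples, hidden_dim))  # hidden states, "positive" side
neg = rng.standard_normal((n_samples, hidden_dim))  # hidden states, "negative" side

# Cross-covariance of the centred hidden states.
cross_cov = (pos - pos.mean(0)).T @ (neg - neg.mean(0)) / (n_samples - 1)
print(cross_cov.shape)  # (16, 16)

# At the real sizes this works out to:
print(120_000 / 12_288)  # 9.765625 samples per dimension
```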

## Regularisation

I also include 3 different values for the `--regularisation_factor` option:

- [regularisation_factor = 1.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%201.0)
- [regularisation_factor = 0.5](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%200.5)

Try to use the largest `regularisation_factor` that has the desired effect, as this has the least chance of damaging the models' outputs.
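
Purely as an illustration of why a larger factor is the gentler choice, here is one generic ridge-style scheme (an assumption for illustration only, not necessarily what the training code does): adding a factor-scaled ridge to a covariance matrix improves its conditioning, which tames the directions extracted from it.

```python
import numpy as np

def regularise(cov: np.ndarray, factor: float) -> np.ndarray:
    # Add a factor-scaled multiple of the average variance to the diagonal;
    # larger factors give a better-conditioned matrix and hence tamer directions.
    ridge = factor * np.trace(cov) / cov.shape[0]
    return cov + ridge * np.eye(cov.shape[0])

# A nearly-singular 2x2 covariance: eigenvalues 3.9 and 0.1.
cov = np.array([[2.0, 1.9], [1.9, 2.0]])
for f in (1.0, 0.5):
    print(f, np.linalg.cond(regularise(cov, f)))
```

The condition number drops as the factor grows, which matches the advice above: stronger regularisation perturbs the extraction more, but distorts the model's outputs less.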

## Prompting format for `Mistral-Large-Instruct-2407`, `WizardLM-2-8x22B` and `miqu-1-70b`

I have found by testing that these models seem to work much better for creative writing if you use the following 'Vicuna' prompt template:
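
A minimal single-turn prompt in this style can be assembled like this (a sketch using the standard Vicuna convention; the exact `chat_template` shipped in the repo's `tokenizer_config.json` may differ):

```python
# Standard Vicuna system preamble (an assumption; swap in your own if needed).
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def vicuna_prompt(user_message: str, system: str = VICUNA_SYSTEM) -> str:
    # Single-turn Vicuna convention: system preamble, then USER/ASSISTANT tags.
    return f"{system} USER: {user_message} ASSISTANT:"

print(vicuna_prompt("Write the opening line of a gothic novel."))
```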

**NOTE**: I still used the default prompt templates for the other 3 models (`c4ai-command-r-plus`, `c4ai-command-r-v01` and `gemma-2-27b-it`).