jukofyork committed
Commit 7dd096f
1 Parent(s): 69d01ef

Update README.md

Files changed (1): README.md +7 -4
README.md CHANGED
@@ -7,6 +7,9 @@ tags:
 
 **NOTE**: See [creative-writing-control-vectors-v2.1](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1) for the current main control-vector repo.
 
+ - *08/08/24 - Added `'WizardLM-2-8x22B'`, `'c4ai-command-r-v01'` and `'gemma-2-27b-it'`.*
+ - *09/08/24 - Added `'miqu-1-70b'`.*
+
 ## Details
 
 The control-vectors in this repo were created as an experiment by increasing the triplets in `system_messages_outlook_extended.json` by 4x (click to expand):
@@ -146,11 +149,11 @@ The control-vectors in this repo were created as an experiment by increasing the
 
 </details>
 
- This means each models' cross-covariance matrix will be the result of `120,000` hidden state samples and this in turn will mean each uses ~10x the hidden state dimension for the largest models with `hidden_dim = 12288`.
+ So now each model's cross-covariance matrix is estimated from `120,000` hidden-state samples, which for the largest models (with `hidden_dim = 12288`) works out to roughly 10 samples per hidden-state dimension (`120000 / 12288 ≈ 9.8`).
 
 ## Regularisation
 
- I also include 3 different values for the `--regularisation_factor` option; `1.0` (the default), `0.5` and `0.0`:
+ I also include 3 different values for the `--regularisation_factor` option:
 
 - [regularisation_factor = 1.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%201.0)
 - [regularisation_factor = 0.5](https://huggingface.co/jukofyork/creative-writing-control-vectors-v2.1.2-EXPERIMENTAL/tree/main/regularisation_factor%20%3D%200.5)
@@ -158,7 +161,7 @@ I also include 3 different values for the `--regularisation_factor` option; `1.0
 
 Try to use the largest `regularisation_factor` that has the desired effect - this has the least chance of damaging the models' outputs.
 
- ## Prompting format for `Mistral-Large-Instruct-2407`, `WizardLM-2-8x22B` and `miqu-1-70b`:
+ ## Prompting format for `'Mistral-Large-Instruct-2407'`, `'WizardLM-2-8x22B'` and `'miqu-1-70b'`:
 
 I have found by testing that these models seem to work much better for creative writing if you use the following 'Vicuna' prompt template:
 
@@ -175,4 +178,4 @@ so I altered the 'Jinja2' `chat_template` in the `tokenizer_config.json` for `Mi
 }
 ```
 
- **NOTE**: I still used the default prompt templates for the other 3 models (`c4ai-command-r-plus`, `c4ai-command-r-v01` and `gemma-2-27b-it`).
+ **NOTE**: I still used the default prompt templates for the other 3 models (`'c4ai-command-r-plus'`, `'c4ai-command-r-v01'` and `'gemma-2-27b-it'`).
 
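
For anyone trying out the `regularisation_factor` variants referenced above: GGUF control vectors can be applied at inference time with llama.cpp, which accepts `--control-vector` and `--control-vector-scaled` flags. The sketch below is illustrative only - the model and control-vector filenames are placeholders rather than actual files from this repo - and the prompt uses the 'Vicuna' `USER:`/`ASSISTANT:` format the README recommends for `Mistral-Large-Instruct-2407`, `WizardLM-2-8x22B` and `miqu-1-70b`:

```sh
# Illustrative llama.cpp invocation (recent build; filenames are placeholders,
# substitute the GGUF model and control-vector files you actually downloaded):
./llama-cli \
  -m miqu-1-70b.Q5_K_M.gguf \
  --control-vector-scaled creative-writing-control-vector.gguf 0.5 \
  -p "USER: Write the opening paragraph of a gothic short story. ASSISTANT:"
```

`--control-vector` applies a vector at its stored strength, while `--control-vector-scaled` takes an explicit multiplier, so weaker steering can be trialled first; the flags should also be repeatable to stack several vectors.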