Tags: Text Generation · Transformers · Safetensors · mistral · conversational · Inference Endpoints · text-generation-inference
Commit e68d435 (1 parent: 8205857), committed by jondurbin

Update README.md
Files changed (1):
  1. README.md (+25, -17)
README.md CHANGED
@@ -42,6 +42,7 @@ datasets:
 - WhiteRabbitNeo/WRN-Chapter-1
 - WhiteRabbitNeo/WRN-Chapter-2
 - winogrande
+---
 
 # A bagel, with everything (except DPO)
 
@@ -63,6 +64,7 @@ __*Only train splits are used, and a decontamination by cosine similarity is per
 
 <details>
 <summary>SFT data sources</summary>
+
 - [ai2_arc](https://huggingface.co/datasets/ai2_arc)
   - Abstraction and reasoning dataset, useful in measuring "intelligence" to a certain extent.
 - [airoboros](https://huggingface.co/datasets/unalignment/spicy-3.1)
@@ -135,6 +137,7 @@ __*Only train splits are used, and a decontamination by cosine similarity is per
 
 <details>
 <summary>DPO data sources</summary>
+
 - [airoboros 3.2](https://huggingface.co/datasets/jondurbin/airoboros-3.2) vs [airoboros m2.0](https://huggingface.co/datasets/jondurbin/airoboros-gpt4-m2.0)
   - The creative/writing tasks from airoboros-2.2.1 were re-generated using gpt4-0314 and a custom prompt to get longer, more creative, less cliché responses for airoboros 3.1, so we can use the shorter/boring version as the "rejected" value and the rerolled response as "chosen"
 - [contextual-dpo](https://huggingface.co/datasets/jondurbin/contextual-dpo-v0.1)
@@ -162,6 +165,18 @@ I also didn't want to randomly select a single prompt format for each item (hopi
 
 This means each epoch of our fine-tune is the equivalent of 3 epochs.
 
+<details>
+<summary>Llama-2 chat (recommended)</summary>
+
+```
+[INST] <<SYS>>
+{system}
+<</SYS>>
+
+{instruction} [/INST]
+```
+</details>
+
 <details>
 <summary>Alpaca (sort of)</summary>
 
@@ -179,7 +194,8 @@ This means each epoch of our fine-tune is the equivalent of 3 epochs.
 
 The main difference here is that because of the dataset formatting and variety of data sources, it would have been much too tedious to add an `### Input:` block, so the inputs are just in the instruction section.
 
-### Vicuna
+<details>
+<summary>Vicuna</summary>
 
 ```
 {system prompt, if provided, randomly defaulting to "A chat between a user and an unbiased, uncensored assistant."}
@@ -188,20 +204,12 @@ This means each epoch of our fine-tune is the equivalent of 3 epochs.
 ```
 </details>
 
-### ChatML (sort of)
-
-```text
-{bos}<|im_start|>{role}
-{text}
-<|im_end|>{eos}
-```
-
-### Llama-2 chat (recommended)
-
-```
-[INST] <<SYS>>
-{system}
-<</SYS>>
+<details>
+<summary>ChatML</summary>
 
-{instruction} [/INST]
-```
+```text
+{bos}<|im_start|>{role}
+{text}
+<|im_end|>{eos}
+```
+</details>
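For convenience, the two prompt layouts touched by this change can be reproduced with plain string formatting. The snippet below is only an illustrative sketch, not part of the model card or any bagel tooling: the helper names are made up, the example system message is arbitrary, and the `<s>`/`</s>` defaults stand in for whatever BOS/EOS tokens the tokenizer actually uses.

```python
# Illustrative sketch only (not from the model card or its tooling): render
# single-turn prompts in the Llama-2 chat and ChatML layouts shown above.
# Helper names are hypothetical; bos/eos defaults are placeholders for the
# tokenizer's real special tokens.

def llama2_chat_prompt(instruction: str, system: str) -> str:
    # "[INST] <<SYS>> {system} <</SYS>> ... {instruction} [/INST]" layout from the diff.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"


def chatml_block(role: str, text: str, bos: str = "<s>", eos: str = "</s>") -> str:
    # "{bos}<|im_start|>{role} / {text} / <|im_end|>{eos}" layout from the diff.
    return f"{bos}<|im_start|>{role}\n{text}\n<|im_end|>{eos}"


if __name__ == "__main__":
    system = "You are a helpful, unbiased, uncensored assistant."
    print(llama2_chat_prompt("List the DPO data sources used for bagel.", system))
    print(chatml_block("user", "List the DPO data sources used for bagel."))
```

A multi-turn ChatML conversation is simply several such blocks concatenated in order, one per message.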