kawine committed on
Commit ee94256
1 Parent(s): 0504830

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -38,11 +38,11 @@ There is a larger variant called [SteamSHP-XL](https://huggingface.co/stanfordnl
  The input text should be of the format:

  ```
- POST: { the context, such as the 'history' column in SHP }

- RESPONSE A: { first possible continuation }

- RESPONSE B: { second possible continuation }

  Which response is better? RESPONSE
  ```
@@ -74,9 +74,9 @@ When trying to cram an example into 512 tokens, we recommend truncating the cont
  If you want to use SteamSHP-Large as a reward model -- to get a score for a single response -- then you need to structure the input such that RESPONSE A is what you want to score and RESPONSE B is just an empty input:

  ```
- POST: { the context, such as the 'history' column in SHP }

- RESPONSE A: { continuation }

  RESPONSE B: .

 
  The input text should be of the format:

  ```
+ POST: { the context, such as the 'history' column in SHP (not containing any newlines \n) }

+ RESPONSE A: { first possible continuation (not containing any newlines \n) }

+ RESPONSE B: { second possible continuation (not containing any newlines \n) }

  Which response is better? RESPONSE
  ```
 
  If you want to use SteamSHP-Large as a reward model -- to get a score for a single response -- then you need to structure the input such that RESPONSE A is what you want to score and RESPONSE B is just an empty input:

  ```
+ POST: { the context, such as the 'history' column in SHP (not containing any newlines \n) }

+ RESPONSE A: { continuation (not containing any newlines \n) }

  RESPONSE B: .

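
For reference, the newline constraint introduced by this change could be enforced when building the input string. The following is only a minimal sketch, not code from the README: the helper names `flatten`, `comparison_input`, and `reward_input` are hypothetical, and separating the fields with a blank line (`\n\n`) is an assumption read off the template above. The diff is cut off after `RESPONSE B: .`, so ending the reward-model input with the same question as the comparison format is also an assumption.

```python
def flatten(text: str) -> str:
    """Replace newlines (and any other whitespace runs) with single spaces."""
    return " ".join(text.split())


def comparison_input(post: str, response_a: str, response_b: str) -> str:
    """Build the pairwise input; every field is flattened to one line.

    Assumption: fields are separated by a blank line, as in the template.
    """
    return (
        f"POST: {flatten(post)}\n\n"
        f"RESPONSE A: {flatten(response_a)}\n\n"
        f"RESPONSE B: {flatten(response_b)}\n\n"
        "Which response is better? RESPONSE"
    )


def reward_input(post: str, response: str) -> str:
    """Build the single-response input, with '.' standing in for RESPONSE B.

    Assumption: the closing question matches the comparison format; the
    diff above ends before that line is shown.
    """
    return comparison_input(post, response, ".")


if __name__ == "__main__":
    # Newlines inside the fields are collapsed before the template is filled.
    print(comparison_input("My cake\nsank. Why?", "Oven too cool.", "Too much\nbaking powder."))
```

Collapsing whitespace rather than rejecting inputs keeps the only newlines in the string at the field boundaries; whether a space is the best replacement for a removed newline is a judgment call.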