stanfordnlp/SteamSHP-flan-t5-large

kawine committed on Feb 25, 2023

Commit ee94256 (1 parent: 0504830)

Update README.md

Files changed (1)

README.md +5 -5

@@ -38,11 +38,11 @@ There is a larger variant called [SteamSHP-XL](https://huggingface.co/stanfordnl
 The input text should be of the format:
 ```
-POST: { the context, such as the 'history' column in SHP }
-RESPONSE A: { first possible continuation }
-RESPONSE B: { second possible continuation }
+POST: { the context, such as the 'history' column in SHP (not containing any newlines \n) }
+RESPONSE A: { first possible continuation (not containing any newlines \n) }
+RESPONSE B: { second possible continuation (not containing any newlines \n) }
 Which response is better? RESPONSE
 ```
@@ -74,9 +74,9 @@ When trying to cram an example into 512 tokens, we recommend truncating the cont
 If you want to use SteamSHP-Large as a reward model -- to get a score for a single response -- then you need to structure the input such that RESPONSE A is what you want to score and RESPONSE B is just an empty input:
 ```
-POST: { the context, such as the 'history' column in SHP }
-RESPONSE A: { continuation }
+POST: { the context, such as the 'history' column in SHP (not containing any newlines \n) }
+RESPONSE A: { continuation (not containing any newlines \n) }
 RESPONSE B: .
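
The newline constraint this commit adds to the README can be enforced when building the input string. The following is a minimal sketch, not the authors' code. The helper names `flatten`, `comparison_prompt`, and `reward_prompt` are hypothetical. The field separator `sep` is an assumption taken from the diff as displayed. The trailing `Which response is better? RESPONSE` line in the reward-model case is also an assumption, since the hunk is cut off after `RESPONSE B: .`.

```python
def flatten(text: str) -> str:
    """Collapse every whitespace run, including newlines, into one space."""
    return " ".join(text.split())


def comparison_prompt(post: str, response_a: str, response_b: str,
                      sep: str = "\n") -> str:
    """Build the two-response input described in the README diff.

    `sep` is an assumption: the diff shows one field per line, so a single
    newline is used here. Adjust it if the model expects something else.
    """
    fields = [
        f"POST: {flatten(post)}",
        f"RESPONSE A: {flatten(response_a)}",
        f"RESPONSE B: {flatten(response_b)}",
        "Which response is better? RESPONSE",
    ]
    return sep.join(fields)


def reward_prompt(post: str, response: str, sep: str = "\n") -> str:
    """Score a single response: RESPONSE B is the empty input '.'.

    Reusing the trailing question line from the comparison format is an
    assumption; the diff hunk ends before showing it.
    """
    return comparison_prompt(post, response, ".", sep)


# Example usage with made-up text that contains embedded newlines.
print(comparison_prompt(
    "How do I store limes?\nI have 45 pounds.",
    "Juice them,\nthen freeze.",
    "Make marmalade.",
))
```

The only newlines left in the result are the ones separating the fields, so text inside a field cannot be mistaken for a field boundary.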