natolambert committed on
Commit
3888820
1 Parent(s): da55df7

Create README.md

Files changed (1)
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ # Model for testing RM scripts
+ This model is GPT-2 base (~100M parameters) with an untrained value head appended.
+ Use it for debugging RLHF setups (a smaller version could be made too).
+ Since the value head is untrained, the predictions should be somewhat random.
+
+ Load the model as follows:
+ ```
+ from transformers import AutoModelForSequenceClassification
+ rm = AutoModelForSequenceClassification.from_pretrained("natolambert/gpt2-dummy-rm")
+ ```
+ or as a pipeline:
+ ```
+ from transformers import pipeline
+ reward_pipe = pipeline(
+     "text-classification",
+     model="natolambert/gpt2-dummy-rm",
+     # revision=args.model_revision,
+     # model_kwargs={"load_in_8bit": True, "device_map": {"": current_device}, "torch_dtype": torch.float16},
+ )
+ texts = ["Example response to score."]  # placeholder input; replace with your own strings
+ reward_pipeline_kwargs = {}  # optional call-time arguments for the pipeline
+ pipe_outputs = reward_pipe(texts, **reward_pipeline_kwargs)
+ ```
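The pipeline call above returns one result per input text, each a dict with `label` and `score` keys (or a list of such dicts when all labels are requested). A minimal sketch of turning those results into plain floats for an RLHF loop follows. It assumes the first label's score is the one wanted as the reward; `extract_rewards` is a hypothetical helper, not part of this repo, and the mock outputs below use made-up values so the snippet runs without downloading the model.

```python
from typing import Dict, List, Sequence, Union

# One pipeline result is a dict; when all labels are returned it is a list of dicts.
PipeOutput = Union[Dict[str, object], List[Dict[str, object]]]


def extract_rewards(pipe_outputs: Sequence[PipeOutput]) -> List[float]:
    """Return one float per input text from text-classification pipeline outputs."""
    rewards = []
    for output in pipe_outputs:
        # Nested form: take the first label's score (an assumed convention).
        first = output[0] if isinstance(output, list) else output
        rewards.append(float(first["score"]))
    return rewards


# Mock outputs in the two shapes the pipeline can return (values are made up).
flat = [{"label": "LABEL_0", "score": 0.61}, {"label": "LABEL_0", "score": 0.47}]
nested = [[{"label": "LABEL_0", "score": -0.3}], [{"label": "LABEL_0", "score": 1.2}]]

print(extract_rewards(flat))
print(extract_rewards(nested))
```

With the real model, `extract_rewards(pipe_outputs)` would be called on the pipeline result. If raw logits are preferred over sigmoid or softmax scores, setting `reward_pipeline_kwargs = {"function_to_apply": "none"}` should request them from the text-classification pipeline.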
+
+
+
+
+