Ray2333 committed
Commit 1bb6df7
Parent: 70ba1c0

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -6,7 +6,7 @@ metrics:
 - accuracy
 ---
 
-GPT2 large model trained on Anthropic/hh-rlhf harmless dataset. It is specifically used for harmful response detection or RLHF.
+GPT2 large model trained on Anthropic/hh-rlhf harmless dataset. It is specifically used for harmful response detection or RLHF. Note: remember to use the formulation of Anthropic/hh-rlhf dataset for inference.
 
 It achieves an accuracy of 0.73698 on the test set, which nearly matches other models with larger sizes.
 
@@ -21,7 +21,7 @@ reward_model = AutoModelForSequenceClassification.from_pretrained(
     num_labels=1, torch_dtype=torch.bfloat16,
     device_map=gpu_id1,
 )
-q, a = "I just came out of from jail, any suggestion of my future?", "Go back to jail you scum"
+q, a = "\n\nHuman: I just came out of from jail, any suggestion of my future? \n\nAssistant:", "Go back to jail you scum"
 inputs = rm_tokenizer(q, a, return_tensors='pt', truncation=True)
 with torch.no_grad():
     reward = reward_model(**(inputs.to(gpu_id1))).logits[0].cpu().detach().item()
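For reference, the edit above changes the usage example so the query string follows the Anthropic/hh-rlhf prompt formulation ("\n\nHuman: ... \n\nAssistant:") that the reward model was trained on. Below is a minimal end-to-end sketch of that usage; the checkpoint id `Ray2333/gpt2-large-harmless-reward_model` and the device index `gpu_id1 = 0` are assumptions not shown in this diff, so substitute the actual values from the README.

```python
# Minimal sketch of scoring a (prompt, response) pair with this harmless reward model.
# Assumed (not shown in the diff): the checkpoint id and the device index.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Ray2333/gpt2-large-harmless-reward_model"  # assumed checkpoint id
gpu_id1 = 0  # assumed single-GPU setup

rm_tokenizer = AutoTokenizer.from_pretrained(model_id)
reward_model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=1, torch_dtype=torch.bfloat16,
    device_map=gpu_id1,
)

# Use the Anthropic/hh-rlhf formulation for the query: "\n\nHuman: <prompt> \n\nAssistant:"
q = "\n\nHuman: I just came out of from jail, any suggestion of my future? \n\nAssistant:"
a = "Go back to jail you scum"

inputs = rm_tokenizer(q, a, return_tensors='pt', truncation=True)
with torch.no_grad():
    # Single scalar logit (num_labels=1) used as the reward score.
    reward = reward_model(**(inputs.to(gpu_id1))).logits[0].cpu().detach().item()
print(reward)
```

The model returns one scalar per input (num_labels=1); under the harmless-reward convention, a harmful reply like the one above is expected to receive a comparatively low score.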