keyfan committed on
Commit b250fa3 · 1 Parent(s): aab5177

Upload 4 files

Update model file

README.md CHANGED
@@ -35,4 +35,25 @@ This is an attempt to replicate the RLHF pipeline
  ### Reinforcement Learning
 
  For RL we used the code of [trlx](https://github.com/CarperAI/trlx) and prompts from
- - [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+ - [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+
+ ### Example
+
+ We used Vicuna v1.1 template for model training
+
+ ```
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ checkpoint = "keyfan/bloomz-rlhf"
+
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+ model = AutoModelForCausalLM.from_pretrained(checkpoint).cuda()
+
+ template = ("A chat between a curious human and an artificial intelligence assistant. "
+ "The assistant gives helpful, detailed, and polite answers to the human's questions. "
+ "USER: {}\nASSISTANT:")
+ question = template.format("Who was the president of the United States in 1955?")
+ inputs = tokenizer.encode(question, return_tensors="pt").cuda()
+ outputs = model.generate(inputs, do_sample=True, top_p=0.8, max_new_tokens=512)
+ print(tokenizer.decode(outputs[0]))
+ ```
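
The README example added in this hunk decodes the full generated sequence, so the printed text includes the prompt as well as the answer. A minimal sketch, assuming only the assistant's reply is wanted: `build_prompt` and `extract_reply` are hypothetical helper names that are not part of the repository, the template string is copied from the hunk, and the `</s>` default for the end-of-sequence token is an assumption (in practice `tokenizer.eos_token` could be passed instead).

```python
# Hypothetical helpers; the template text is copied from the README hunk.
TEMPLATE = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions. "
    "USER: {}\nASSISTANT:"
)


def build_prompt(question: str) -> str:
    """Fill the Vicuna v1.1 style template with a single user question."""
    return TEMPLATE.format(question)


def extract_reply(decoded: str, eos_token: str = "</s>") -> str:
    """Return the text after the first 'ASSISTANT:' marker, without the EOS token.

    Assumes the user question itself does not contain 'ASSISTANT:'.
    Returns an empty string if the marker is missing.
    """
    _, _, reply = decoded.partition("ASSISTANT:")
    return reply.replace(eos_token, "").strip()


if __name__ == "__main__":
    prompt = build_prompt("Who was the president of the United States in 1955?")
    # Stand-in for tokenizer.decode(outputs[0]); no model is loaded here.
    decoded = prompt + " (model reply would appear here)</s>"
    print(extract_reply(decoded))
```

With the real model, `decoded` would be the string returned by `tokenizer.decode(outputs[0])` in the README example.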
pytorch_model-00001-of-00002.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ab06ec23c2147f43f8f5f01c3e62c9306734b328c6b2d1530e22f4062660150e
+ oid sha256:a9bce24cc1e1b2bc1d4d7a38399ce5666cb955ecc8fe1a106ea883de58c99f1c
  size 10848957472
pytorch_model-00002-of-00002.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:39acceaa7a6e1ad8a8d69901cf62f6e2bc479e8933755a441d178b37ee8539cd
+ oid sha256:5fb6f7ea9c92823246b6eaedc807a60539caf1a43010d6d07a6e4a47c01dbe34
  size 6284244953
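
Both weight shards keep the same `size` while their `oid` changes. In a Git LFS pointer the `oid` is the SHA-256 of the file contents, so the weights were replaced by files of identical length. A minimal sketch, assuming a shard has already been downloaded locally, of checking a file against its pointer; `parse_lfs_pointer`, `sha256_of_file`, and `matches_pointer` are hypothetical helper names, not part of any library.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte shards need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check the byte size first (cheap), then the SHA-256 digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    return sha256_of_file(path) == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5fb6f7ea9c92823246b6eaedc807a60539caf1a43010d6d07a6e4a47c01dbe34
size 6284244953
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algo"], info["size"])
```

Calling `matches_pointer("pytorch_model-00002-of-00002.bin", parse_lfs_pointer(POINTER))` would then report whether a local copy corresponds to the new revision; the local path is an assumption.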