Upload 4 files

Update model file

Files changed (3)

README.md CHANGED

@@ -35,4 +35,25 @@ This is an attempt to replicate the RLHF pipeline
 ### Reinforcement Learning
   For RL we used the code of [trlx](https://github.com/CarperAI/trlx) and prompts from
-  - [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+  - [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+### Example
+  We used the Vicuna v1.1 template for model training:
+  ```python
+  from transformers import AutoModelForCausalLM, AutoTokenizer
+  checkpoint = "keyfan/bloomz-rlhf"
+  tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+  # .cuda() assumes a CUDA-capable GPU is available.
+  model = AutoModelForCausalLM.from_pretrained(checkpoint).cuda()
+  # Prompt template used in training; {} is replaced by the user's question.
+  template = ("A chat between a curious human and an artificial intelligence assistant. "
+              "The assistant gives helpful, detailed, and polite answers to the human's questions. "
+              "USER: {}\nASSISTANT:")
+  question = template.format("Who was the president of the United States in 1955?")
+  inputs = tokenizer.encode(question, return_tensors="pt").cuda()
+  # Nucleus sampling with top_p=0.8, generating at most 512 new tokens.
+  outputs = model.generate(inputs, do_sample=True, top_p=0.8, max_new_tokens=512)
+  # The decoded text contains the prompt followed by the model's answer.
+  print(tokenizer.decode(outputs[0]))
+  ```
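
Note on the README example above: `tokenizer.decode(outputs[0])` returns the prompt together with the generated text, so callers usually want only the part after `ASSISTANT:`. The following is a minimal sketch of that post-processing, not part of this commit. The helper names `build_prompt` and `extract_answer` are hypothetical, and the `</s>` end-of-sequence string is an assumption about the tokenizer that should be checked against `tokenizer.eos_token`.

```python
# Template text copied from the README example; everything else is illustrative.
TEMPLATE = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions. "
    "USER: {}\nASSISTANT:"
)
MARKER = "ASSISTANT:"


def build_prompt(question: str) -> str:
    """Fill the training template with a single user question (hypothetical helper)."""
    return TEMPLATE.format(question.strip())


def extract_answer(decoded: str, eos_token: str = "</s>") -> str:
    """Return the text after the first 'ASSISTANT:' marker (hypothetical helper).

    Assumes the question itself does not contain the marker and that
    eos_token matches the tokenizer's end-of-sequence string.
    """
    if MARKER not in decoded:
        # No marker found: return the text unchanged apart from whitespace.
        return decoded.strip()
    _, _, answer = decoded.partition(MARKER)
    return answer.replace(eos_token, "").strip()
```

With the README example, this would be used as `extract_answer(tokenizer.decode(outputs[0]))`. Passing `skip_special_tokens=True` to `decode` is another way to drop the end-of-sequence token.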

pytorch_model-00001-of-00002.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ab06ec23c2147f43f8f5f01c3e62c9306734b328c6b2d1530e22f4062660150e
+oid sha256:a9bce24cc1e1b2bc1d4d7a38399ce5666cb955ecc8fe1a106ea883de58c99f1c
 size 10848957472

pytorch_model-00002-of-00002.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:39acceaa7a6e1ad8a8d69901cf62f6e2bc479e8933755a441d178b37ee8539cd
+oid sha256:5fb6f7ea9c92823246b6eaedc807a60539caf1a43010d6d07a6e4a47c01dbe34
 size 6284244953
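
Note on the pointer files above: both weight shards keep the same size but get a new `oid`, so a previously downloaded shard can be checked against the updated pointer by hashing it. The following is a minimal sketch under that assumption, not something this commit ships. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(" ")
        if sep:
            fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>', e.g. 'sha256:ab06...'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's byte size and digest against a parsed pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte shards are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

A shard downloaded before this commit would be expected to fail the check against the new pointer, which signals that it should be fetched again.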