Upload 4 files
Update model file
- README.md +22 -1
- pytorch_model-00001-of-00002.bin +1 -1
- pytorch_model-00002-of-00002.bin +1 -1
README.md
CHANGED
@@ -35,4 +35,25 @@ This is an attempt to replicate the RLHF pipeline
 ### Reinforcement Learning
 
 For RL we used the code of [trlx](https://github.com/CarperAI/trlx) and prompts from
-- [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+- [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data/tree/main)
+
+### Example
+
+We used Vicuna v1.1 template for model training
+
+```
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+checkpoint = "keyfan/bloomz-rlhf"
+
+tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+model = AutoModelForCausalLM.from_pretrained(checkpoint).cuda()
+
+template = ("A chat between a curious human and an artificial intelligence assistant. "
+            "The assistant gives helpful, detailed, and polite answers to the human's questions. "
+            "USER: {}\nASSISTANT:")
+question = template.format("Who was the president of the United States in 1955?")
+inputs = tokenizer.encode(question, return_tensors="pt").cuda()
+outputs = model.generate(inputs, do_sample=True, top_p=0.8, max_new_tokens=512)
+print(tokenizer.decode(outputs[0]))
+```
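The README example above prints the whole decoded sequence, prompt included. As a minimal sketch, the template could be wrapped in two small helpers so that only the assistant's reply is returned. The names `build_prompt` and `extract_reply` are hypothetical and are not part of the model repository or of `transformers`; the default end-of-sequence string `</s>` is an assumption, and passing `tokenizer.eos_token` is the safer choice.

```python
# Hypothetical helpers for illustration; not part of keyfan/bloomz-rlhf or transformers.

# Vicuna v1.1 style template, copied from the README example.
TEMPLATE = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions. "
    "USER: {}\nASSISTANT:"
)


def build_prompt(question: str) -> str:
    """Fill the template with a single user question."""
    return TEMPLATE.format(question)


def extract_reply(decoded: str, eos_token: str = "</s>") -> str:
    """Return the text after the last 'ASSISTANT:' marker, cut at the EOS token.

    The default eos_token is an assumption; pass tokenizer.eos_token to be sure.
    """
    reply = decoded.rsplit("ASSISTANT:", 1)[-1]
    reply = reply.split(eos_token, 1)[0]
    return reply.strip()
```

With the README code, this would be used as `extract_reply(tokenizer.decode(outputs[0]), tokenizer.eos_token)` after encoding `build_prompt(...)` instead of the hand-formatted string.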
pytorch_model-00001-of-00002.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a9bce24cc1e1b2bc1d4d7a38399ce5666cb955ecc8fe1a106ea883de58c99f1c
 size 10848957472
pytorch_model-00002-of-00002.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:5fb6f7ea9c92823246b6eaedc807a60539caf1a43010d6d07a6e4a47c01dbe34
 size 6284244953
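The two `.bin` entries in this commit are Git LFS pointer files: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. The following sketch shows one way a downloaded shard could be checked against its pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the pointer contains only the three keys shown in this diff.

```python
import hashlib

# Hypothetical helpers for illustration; not part of git-lfs or huggingface_hub.


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer ('key value' per line) into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file at `path` and compare its size and hash with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte shards are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("pytorch_model-00002-of-00002.bin", parse_lfs_pointer(pointer_text))` would return `True` only if the local file has the size and SHA-256 recorded in the pointer.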