ShawLiu committed on
Commit
f7d01eb
1 Parent(s): 9b8c1f5

Create README.md

  1. README.md +89 -0
README.md ADDED
@@ -0,0 +1,89 @@
+ ---
+ language:
+ - en
+ tags:
+ - bpo
+ - llama
+ - thudm
+ inference: false
+ ---
+
+ <h1>Black-Box Prompt Optimization: Aligning Large Language Models without Model Training</h1>
+
+ - **Repository:** https://github.com/thu-coai/BPO
+ - **Paper:** https://arxiv.org/abs/2311.04155
+ - **Data:** https://huggingface.co/datasets/THUDM/BPO
+
+ # Black-box Prompt Optimization (BPO)
+ BPO is a black-box alignment technique. Unlike training-based methods such as PPO or DPO, it does not update the target LLM: it trains only a plug-and-play prompt optimizer and aligns the LLM by rewriting user inputs. It can therefore be applied to a variety of open-source or API-based LLMs.
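
To make the plug-and-play idea concrete, here is a minimal sketch of the two-step flow described above: rewrite the user prompt, then send the rewritten prompt to an unchanged target LLM. The names `bpo_generate`, `toy_optimizer`, and `toy_llm` are hypothetical and not part of the BPO repository, and the fallback to the original prompt is an added assumption rather than documented behavior.

```python
from typing import Callable


def bpo_generate(
    user_prompt: str,
    optimize_prompt: Callable[[str], str],
    target_llm: Callable[[str], str],
) -> str:
    """Rewrite the user prompt, then query the unchanged target LLM with it."""
    optimized = optimize_prompt(user_prompt)
    # Assumption: fall back to the original prompt if the optimizer
    # returns an empty or whitespace-only string.
    if not optimized.strip():
        optimized = user_prompt
    return target_llm(optimized)


# Stand-ins for illustration only. A real setup would call the BPO model
# and an open-source or API-based LLM here.
def toy_optimizer(prompt: str) -> str:
    return prompt + " Please answer in detail."


def toy_llm(prompt: str) -> str:
    return f"<response to: {prompt}>"


print(bpo_generate("Tell me about Harry Potter", toy_optimizer, toy_llm))
```

Because the target LLM is only ever called as a function of the prompt text, the same wrapper would apply to a local model or a remote API without touching model weights.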
+
+ ## Model Details
+
+ ### Data
+ The prompt optimization model is trained on prompt optimization pairs that encode human preference features. Detailed information on the dataset can be found [here](https://huggingface.co/datasets/CCCCCC/BPO).
+
+ ### Backbone Model
+ The prompt preference optimizer is built on `Llama-2-7b-chat-hf`.
+
+ ### Language
+ English
+
+ ### Performance
+
+
+ | Model A | Model B | A win (%) | Tie (%) | B win (%) |
+ |-------------|-------------|----|----|----|
+ | gpt-3.5-turbo + BPO | gpt-3.5-turbo | **60.0** | 8.7 | 31.3 |
+ | claude-2 + BPO | claude-2 | **57.5** | 5.0 | 37.5 |
+ | llama-2-13b-chat + BPO | llama-2-70b-chat | **61.3** | 0.0 | 38.7 |
+ | vicuna-13b + BPO | vicuna-13b + PPO | **52.5** | 3.7 | 43.7 |
+ | vicuna-13b + BPO | vicuna-13b + DPO | **53.8** | 2.5 | 43.7 |
+ | vicuna-13b + DPO + BPO | vicuna-13b + DPO | **60.0** | 2.5 | 37.5 |
+
+ ## Intended Use
+
+ ### Prompt Template
+ We use the following prompt template:
+ ```
+ [INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{user prompt} [/INST]
+ ```
+
+ ### Inference code
+ Example inference code:
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_path = 'Your-Model-Path'
+ device = "cuda:0"  # the model and the inputs must live on the same device
+
+ prompt_template = "[INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{} [/INST]"
+
+ model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+ text = 'Tell me about Harry Potter'
+
+ # Wrap the user prompt in the optimizer's instruction template.
+ prompt = prompt_template.format(text)
+ model_inputs = tokenizer(prompt, return_tensors="pt").to(device)
+ # Sample an optimized prompt (nucleus sampling, no beam search).
+ output = model.generate(**model_inputs, max_new_tokens=1024, do_sample=True, top_p=0.9, temperature=0.6, num_beams=1)
+ # The decoded text echoes the input, so keep only what follows [/INST].
+ resp = tokenizer.decode(output[0], skip_special_tokens=True).split('[/INST]')[1].strip()
+
+ print(resp)
+ ```
+ See our [GitHub repo](https://github.com/thu-coai/BPO/blob/main/src/infer_example.py) for more detailed usage (e.g., more aggressive optimization).
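
The template filling and output parsing from the inference example can be factored into two small, model-free helpers, which makes them easy to unit-test. This is only a sketch: `build_optimizer_input` and `extract_optimized_prompt` are hypothetical names, not functions from the BPO repository, and the parsing assumes the user prompt itself does not contain the `[/INST]` marker.

```python
PROMPT_TEMPLATE = (
    "[INST] You are an expert prompt engineer. Please help me improve this "
    "prompt to get a more helpful and harmless response:\n{} [/INST]"
)
END_OF_INSTRUCTION = "[/INST]"


def build_optimizer_input(user_prompt: str) -> str:
    """Wrap a raw user prompt in the optimizer's instruction template."""
    return PROMPT_TEMPLATE.format(user_prompt)


def extract_optimized_prompt(decoded: str) -> str:
    """Return the text that follows the first [/INST] marker."""
    # partition() splits only at the first marker, so a marker that appears
    # later in the model's output does not truncate the result.
    _, marker, tail = decoded.partition(END_OF_INSTRUCTION)
    if not marker:
        raise ValueError("decoded text does not contain the [/INST] marker")
    return tail.strip()


model_input = build_optimizer_input("Tell me about Harry Potter")
# Simulated decoder output: the echoed input followed by a rewritten prompt.
decoded = model_input + " Tell me about the plot of Harry Potter. "
print(extract_optimized_prompt(decoded))
```

Raising an error when the marker is missing, instead of indexing into a split result, gives a clearer failure if the decoded text was truncated or the template changed.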
+
+
+ ### Other Known Limitations
+ - Task coverage is limited: we used only open-source data, which yielded about 14k optimized prompts. This cannot cover the full range of user queries, so the current model may not perform well on every prompt.
+ - Long-context tasks and mathematical problems make up only a small share of the training data, so the prompt optimizer underperforms on them.
+
+ ## Citation
+ If you find our model useful in your work, please cite it with:
+ ```
+ @article{cheng2023black,
+ title={Black-Box Prompt Optimization: Aligning Large Language Models without Model Training},
+ author={Cheng, Jiale and Liu, Xiao and Zheng, Kehan and Ke, Pei and Wang, Hongning and Dong, Yuxiao and Tang, Jie and Huang, Minlie},
+ journal={arXiv preprint arXiv:2311.04155},
+ year={2023}
+ }
+ ```