TheBloke committed on
Commit
8b50f4f
1 Parent(s): 313d9f0

Update README.md
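
Seven of the eight changed lines in this commit replace a relative `imgs/...` image source with an absolute URL under `https://raw.githubusercontent.com/nlpxucan/WizardLM/main/`. A minimal sketch of that rewrite is below. It is an illustration only: the function name `absolutize_image_paths` and the regex are assumptions, and nothing here indicates the author applied the change with a script.

```python
import re

# Base URL copied from the "+" lines of this diff.
RAW_BASE = "https://raw.githubusercontent.com/nlpxucan/WizardLM/main/"

# Matches src="imgs/..." only, so sources that are already absolute are skipped.
_REL_IMG = re.compile(r'src="(imgs/[^"]+)"')


def absolutize_image_paths(markdown: str, base: str = RAW_BASE) -> str:
    """Hypothetical helper: prefix relative imgs/ sources with `base`."""
    return _REL_IMG.sub(lambda m: f'src="{base}{m.group(1)}"', markdown)


if __name__ == "__main__":
    before = '<a ><img src="imgs/win.png" alt="WizardLM"></a>'
    print(absolutize_image_paths(before))
```

Because the pattern requires the path to start with `imgs/`, running the helper a second time leaves already-rewritten lines unchanged.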

  1. README.md +8 -8
README.md CHANGED
@@ -53,7 +53,7 @@ I have quantised the GGML files in this repo with the latest version. Therefore
  I use the following command line; adjust for your tastes and needs:
  
  ```
- ./main -t 12 -m WizardLM-13B-1.0.v3.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.
+ ./main -t 12 -m WizardLM-13B-1.0.ggmlv3.q5_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.
  ### Instruction:
  Write a story about llamas
  ### Response:"
@@ -83,7 +83,7 @@ So if you're able and willing to contribute, it'd be most gratefully received an
  Empowering Large Pre-Trained Language Models to Follow Complex Instructions
  
  <p align="center" width="100%">
- <a ><img src="imgs/WizardLM.png" alt="WizardLM" style="width: 20%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/WizardLM.png" alt="WizardLM" style="width: 20%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  [![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/tatsu-lab/stanford_alpaca/blob/main/LICENSE)
@@ -108,7 +108,7 @@ At present, our core contributors are preparing the **33B** version and we expec
  
  We adopt the automatic evaluation framework based on GPT-4 proposed by FastChat to assess the performance of chatbot models. As shown in the following figure, WizardLM-13B achieved better results than Vicuna-13b.
  <p align="center" width="100%">
- <a ><img src="imgs/WizarLM13b-GPT4.png" alt="WizardLM" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/WizarLM13b-GPT4.png" alt="WizardLM" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  ### WizardLM-13B performance on different skills.
@@ -116,7 +116,7 @@ We adopt the automatic evaluation framework based on GPT-4 proposed by FastChat
  The following figure compares WizardLM-13B and ChatGPT’s skill on Evol-Instruct testset. The result indicates that WizardLM-13B achieves 89.1% of ChatGPT’s performance on average, with almost 100% (or more than) capacity on 10 skills, and more than 90% capacity on 22 skills.
  
  <p align="center" width="100%">
- <a ><img src="imgs/evol-testset_skills-13b.png" alt="WizardLM" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/evol-testset_skills-13b.png" alt="WizardLM" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  ## Call for Feedbacks
@@ -135,11 +135,11 @@ We just sample some cases to demonstrate the performance of WizardLM and ChatGPT
  [Evol-Instruct](https://github.com/nlpxucan/evol-instruct) is a novel method using LLMs instead of humans to automatically mass-produce open-domain instructions of various difficulty levels and skills range, to improve the performance of LLMs.
  
  <p align="center" width="100%">
- <a ><img src="imgs/git_overall.png" alt="WizardLM" style="width: 86%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/git_overall.png" alt="WizardLM" style="width: 86%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  <p align="center" width="100%">
- <a ><img src="imgs/git_running.png" alt="WizardLM" style="width: 86%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/git_running.png" alt="WizardLM" style="width: 86%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  ## Contents
@@ -254,12 +254,12 @@ To evaluate Wizard, we conduct human evaluation on the inputs from our human ins
  
  WizardLM achieved significantly better results than Alpaca and Vicuna-7b.
  <p align="center" width="60%">
- <a ><img src="imgs/win.png" alt="WizardLM" style="width: 60%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/win.png" alt="WizardLM" style="width: 60%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  In the high-difficulty section of our test set (difficulty level >= 8), WizardLM even outperforms ChatGPT, with a win rate 7.9% larger than Chatgpt (42.9% vs. 35.0%). This indicates that our method can significantly improve the ability of large language models to handle complex instructions.
  <p align="center" width="60%">
- <a ><img src="imgs/windiff.png" alt="WizardLM" style="width: 60%; min-width: 300px; display: block; margin: auto;"></a>
+ <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/imgs/windiff.png" alt="WizardLM" style="width: 60%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
  
  ### Citation