imone committed on
Commit
59ec46c
•
1 Parent(s): 7270a6d

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -33,7 +33,7 @@
33
  <br> 🤖 Outperforms <span style="font-weight: bold;">ChatGPT</span> (March) and <span style="font-weight: bold;">Grok-1</span> on most benchmarks 🤖
34
  <br> 🚀<span style="font-size: 1em; font-family: 'Helvetica'; color: black; font-weight: bold;">15</span>-point improvement in Coding Performance over <span style="font-size: 0.9em;
35
  font-family: 'Helvetica'; color: black; font-weight: bold;">OpenChat-3.5🚀</span>
36
- <br><span style="font-size: 1em; font-family: 'Helvetica'; color: #3c72db; font-weight: bold;">New Features</span>
37
  <br> 💡 2 Modes: Coding + Generalist, Mathematical Reasoning 💡
38
  <br> 🧑‍⚖️ Experimental support for Evaluator and Feedback capabilities 🧑‍⚖️
39
  </span>
@@ -41,7 +41,7 @@
41
  </div>
42
 
43
  <div style="display: flex; justify-content: center; align-items: center">
44
- <img src="https://github.com/alpayariyak/openchat/blob/master/assets/1210bench.png?raw=true" style="width: 100%; border-radius: 1em">">
45
  </div>
46
 
47
  <div>
@@ -174,6 +174,7 @@ Score 5: {orig_score5_description}
174
  | OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
175
  | Zephyr-β^ | 7B | 34.6 | 7.34 | 22.0 | 40.6 | 39.0 | 40.8 | 39.8 | 5.1 | 16.0 |
176
  | Mistral | 7B | - | 6.84 | 30.5 | 39.0 | 38.0 | - | 60.1 | 52.2 | - |
 
177
  <details>
178
  <summary>Evaluation Details(click to expand)</summary>
179
  *: ChatGPT (March) results are from [GPT-4 Technical Report](https://arxiv.org/abs/2303.08774), [Chain-of-Thought Hub](https://github.com/FranxYao/chain-of-thought-hub), and our evaluation. Please note that ChatGPT is not a fixed baseline and evolves rapidly over time.
@@ -188,7 +189,6 @@ All models are evaluated in chat mode (e.g. with the respective conversation tem
188
  <h3>HumanEval+</h3>
189
  </div>
190
 
191
-
192
  | Model | Size | HumanEval+ pass@1 |
193
  |-----------------------------|----------|------------|
194
  | ChatGPT (December 12, 2023) | - | 64.6 |
 
33
  <br> 🤖 Outperforms <span style="font-weight: bold;">ChatGPT</span> (March) and <span style="font-weight: bold;">Grok-1</span> on most benchmarks 🤖
34
  <br> 🚀<span style="font-size: 1em; font-family: 'Helvetica'; color: black; font-weight: bold;">15</span>-point improvement in Coding Performance over <span style="font-size: 0.9em;
35
  font-family: 'Helvetica'; color: black; font-weight: bold;">OpenChat-3.5🚀</span>
36
+ <br><br><span style="font-size: 1em; font-family: 'Helvetica'; color: #3c72db; font-weight: bold;">New Features</span>
37
  <br> 💡 2 Modes: Coding + Generalist, Mathematical Reasoning 💡
38
  <br> 🧑‍⚖️ Experimental support for Evaluator and Feedback capabilities 🧑‍⚖️
39
  </span>
 
41
  </div>
42
 
43
  <div style="display: flex; justify-content: center; align-items: center">
44
+ <img src="https://github.com/alpayariyak/openchat/blob/master/assets/1210bench.png?raw=true" style="width: 100%; border-radius: 1em">
45
  </div>
46
 
47
  <div>
 
174
  | OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
175
  | Zephyr-β^ | 7B | 34.6 | 7.34 | 22.0 | 40.6 | 39.0 | 40.8 | 39.8 | 5.1 | 16.0 |
176
  | Mistral | 7B | - | 6.84 | 30.5 | 39.0 | 38.0 | - | 60.1 | 52.2 | - |
177
+
178
  <details>
179
  <summary>Evaluation Details(click to expand)</summary>
180
  *: ChatGPT (March) results are from [GPT-4 Technical Report](https://arxiv.org/abs/2303.08774), [Chain-of-Thought Hub](https://github.com/FranxYao/chain-of-thought-hub), and our evaluation. Please note that ChatGPT is not a fixed baseline and evolves rapidly over time.
 
189
  <h3>HumanEval+</h3>
190
  </div>
191
 
 
192
  | Model | Size | HumanEval+ pass@1 |
193
  |-----------------------------|----------|------------|
194
  | ChatGPT (December 12, 2023) | - | 64.6 |