bindureddy committed on
Commit
b8d8e09
1 Parent(s): dad07fd

Update README.md

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -81,22 +81,23 @@ We evaluated checkpoint 1000 ([abacusai/Liberated-Qwen1.5-72B-c1000](https://hug
                                       score
 model                           turn
 Liberated-Qwen-1.5-72b-ckpt1000 1     8.45000
-Smaug-72B-v0.1                  1     8.21250
+Qwen1.5-72B-Chat                1     8.44375
+
 
 ########## Second turn ##########
                                       score
 model                           turn
+Qwen1.5-72B-Chat                2     8.23750
 Liberated-Qwen-1.5-72b-ckpt1000 2     7.65000
-Smaug-72B-v0.1                  2     7.20625
+
 
 ########## Average ##########
                                  score
 model
+Qwen1.5-72B-Chat                 8.340625
 Liberated-Qwen-1.5-72b-ckpt1000  8.050000
-Smaug-72B-v0.1                   7.709375
-```
 
-Smaug has a higher leaderboard average score, but it appears that this new dataset does significantly help with instruction following.
+```
 
 The model does preserve good performance on MMLU = 77.13.
 
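As a quick sanity check on the numbers in this diff, the sketch below recomputes each "Average" row as the unweighted mean of the model's first-turn and second-turn scores. That weighting is an assumption (it holds if both turns cover the same number of questions), and the names `turn_scores`, `reported_average`, and `mean_of_turns` are hypothetical and appear nowhere in the README.

```python
import math

# Per-turn scores copied from the first-turn and second-turn tables in the diff.
turn_scores = {
    "Liberated-Qwen-1.5-72b-ckpt1000": (8.45000, 7.65000),
    "Qwen1.5-72B-Chat": (8.44375, 8.23750),
    "Smaug-72B-v0.1": (8.21250, 7.20625),
}

# Values copied from the "Average" tables (old and new sides of the diff).
reported_average = {
    "Liberated-Qwen-1.5-72b-ckpt1000": 8.050000,
    "Qwen1.5-72B-Chat": 8.340625,
    "Smaug-72B-v0.1": 7.709375,
}


def mean_of_turns(first: float, second: float) -> float:
    """Unweighted mean of two per-turn scores (assumes equal weight per turn)."""
    return (first + second) / 2


averages = {model: mean_of_turns(*scores) for model, scores in turn_scores.items()}

for model, avg in averages.items():
    # Compare with a tolerance because the inputs are binary floats.
    matches = math.isclose(avg, reported_average[model], abs_tol=1e-6)
    print(f"{model}: computed {avg:.6f}, matches reported value: {matches}")
```

Under that assumption, all three reported averages agree with the per-turn scores, for both the removed Smaug rows and the added Qwen1.5-72B-Chat rows.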