siddartha-abacus committed on
Commit
0add2cb
1 Parent(s): 15f01ae

Update README.md

Files changed (1)
  1. README.md +28 -2
README.md CHANGED
@@ -74,7 +74,33 @@ Please generate a Advanced Dungeons & Dragons 2nd Edition character sheet for a
 
  ## Evals
 
- TBD
+ We evaluated checkpoint 1000 ([abacusai/Liberated-Qwen1.5-72B-c1000](https://huggingface.co/abacusai/Liberated-Qwen1.5-72B-c1000)) from this training run on MT Bench:
+
+ ```
+ ########## First turn ##########
+                                     score
+ model                           turn
+ Liberated-Qwen-1.5-72b-ckpt1000 1     8.45000
+ Smaug-72B-v0.1                  1     8.21250
+
+ ########## Second turn ##########
+                                     score
+ model                           turn
+ Liberated-Qwen-1.5-72b-ckpt1000 2     7.65000
+ Smaug-72B-v0.1                  2     7.20625
+
+ ########## Average ##########
+                                     score
+ model
+ Liberated-Qwen-1.5-72b-ckpt1000  8.050000
+ Smaug-72B-v0.1                   7.709375
+ ```
+
+ Smaug has a higher leaderboard average score, but this new dataset appears to significantly improve instruction following.
+
+ The model also preserves good performance on MMLU, with a score of 77.13.
 
  ## Future Plans
- This model will be released on the whole Qwen-1.5 series.
+ This model will be released on the whole Qwen-1.5 series.
+
+ Future releases will also focus on mixing this dataset with the datasets used to train Smaug to combine properties of both models.
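
The "Average" rows in the MT Bench table added by this commit can be checked against the per-turn scores. The sketch below is only a sanity check, not the evaluation script used for the run: it assumes both turns cover the same number of questions, so the overall average equals the plain mean of the two turn scores. The names `turn_scores` and `average_score` are hypothetical and introduced for illustration.

```python
# Per-turn MT Bench scores copied from the table in this commit.
turn_scores = {
    "Liberated-Qwen-1.5-72b-ckpt1000": [8.45000, 7.65000],
    "Smaug-72B-v0.1": [8.21250, 7.20625],
}


def average_score(scores):
    """Return the unweighted mean of the per-turn scores.

    Assumption: each turn has the same number of questions, so the
    mean of turn means equals the mean over all judged answers.
    """
    return sum(scores) / len(scores)


averages = {model: average_score(s) for model, s in turn_scores.items()}

for model, avg in averages.items():
    # Six decimals, to line up with the "Average" section of the table.
    print(f"{model}: {avg:.6f}")
```

Under that assumption the computed means agree with the reported averages of 8.050000 and 7.709375.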