Adding Evaluation Results

#3
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -13,4 +13,17 @@ This is an experimental mixed model containing a parameter-wise 50/50 blend (wei
  This improves on earlier model mixing techniques by only applying the merge to the layers containing tensors of the same dimensions.
  By selectively skipping merge operations on the input and output layers, we are now able to merge models with different vocab sizes (i.e. added tokens) so long as the hidden layers have identical sizes.
 
- All feedback and comments can be directed to Concedo on the KoboldAI discord.
+ All feedback and comments can be directed to Concedo on the KoboldAI discord.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_concedo__Vicuzard-30B-Uncensored)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 53.76 |
+ | ARC (25-shot) | 62.97 |
+ | HellaSwag (10-shot) | 83.68 |
+ | MMLU (5-shot) | 58.16 |
+ | TruthfulQA (0-shot) | 52.27 |
+ | Winogrande (5-shot) | 77.11 |
+ | GSM8K (5-shot) | 15.39 |
+ | DROP (3-shot) | 26.76 |
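
Review note: the diff does not say how the `Avg.` row is derived. A quick sanity check, assuming it is the unweighted mean of the seven benchmark scores in the table, can be sketched in Python. The names `scores` and `mean_score` are illustrative and not part of the PR.

```python
# Scores copied from the table added in this diff.
scores = {
    "ARC (25-shot)": 62.97,
    "HellaSwag (10-shot)": 83.68,
    "MMLU (5-shot)": 58.16,
    "TruthfulQA (0-shot)": 52.27,
    "Winogrande (5-shot)": 77.11,
    "GSM8K (5-shot)": 15.39,
    "DROP (3-shot)": 26.76,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard weights every benchmark equally.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Recomputed average: {average}")
```

Under that assumption the recomputed value agrees with the reported `Avg.` of 53.76, so the table is internally consistent.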