weezywitasneezy committed
Commit 6242373
Parent: 3504971

Update README.md

Files changed (1):
1. README.md (+16, -11)
README.md CHANGED
@@ -116,10 +116,26 @@ model-index:
 
 # BenchmarkEngineering-F2-7B-slerp
 
+This merge seeks to further improve on the original BenchmarkEngineering-7B-slerp by integrating WestLake-7B-v2. It boosts the Winogrande score, but at the cost of the other benchmarks.
+
 BenchmarkEngineering-F2-7B-slerp is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [weezywitasneezy/BenchmarkEngineering-7B-slerp](https://huggingface.co/weezywitasneezy/BenchmarkEngineering-7B-slerp)
 * [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_weezywitasneezy__BenchmarkEngineering-F2-7B-slerp)
+
+| Metric                            | Value |
+|-----------------------------------|------:|
+| Avg.                              | 75.77 |
+| AI2 Reasoning Challenge (25-shot) | 73.46 |
+| HellaSwag (10-shot)               | 88.88 |
+| MMLU (5-shot)                     | 64.50 |
+| TruthfulQA (0-shot)               | 72.37 |
+| Winogrande (5-shot)               | 86.11 |
+| GSM8k (5-shot)                    | 69.29 |
+
 ## 🧩 Configuration
 
 ```yaml
@@ -165,16 +181,5 @@ pipeline = transformers.pipeline(
 outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
 print(outputs[0]["generated_text"])
 ```
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_weezywitasneezy__BenchmarkEngineering-F2-7B-slerp)
 
-| Metric                            | Value |
-|-----------------------------------|------:|
-| Avg.                              | 75.77 |
-| AI2 Reasoning Challenge (25-shot) | 73.46 |
-| HellaSwag (10-shot)               | 88.88 |
-| MMLU (5-shot)                     | 64.50 |
-| TruthfulQA (0-shot)               | 72.37 |
-| Winogrande (5-shot)               | 86.11 |
-| GSM8k (5-shot)                    | 69.29 |
 
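Only the tail of the README's usage snippet is visible in the diff context above (the hunk header shows it sits inside a `pipeline = transformers.pipeline(` call). For reference, here is a self-contained sketch following the standard LazyMergekit usage template that those visible lines match; the prompt string is a placeholder:

```python
# Sketch of the full usage snippet whose last lines appear in the diff above.
# Follows the standard LazyMergekit README template; the prompt is a placeholder.
from transformers import AutoTokenizer
import transformers
import torch

model = "weezywitasneezy/BenchmarkEngineering-F2-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Text-generation pipeline in half precision, sharded across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# These two lines are the ones visible in the diff.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

Loading in float16 with `device_map="auto"` should let the 7B model fit on a single 16 GB+ GPU.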
 
 
 
 
 
 
 
 
 
 
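As background on the method named in the model card: slerp (spherical linear interpolation) blends two weight tensors along the arc between their directions rather than along a straight line, so intermediate points stay at a sensible scale even when the endpoints differ in angle. A minimal NumPy sketch of the idea (illustrative only, not mergekit's actual implementation):

```python
# Illustrative slerp between two flattened weight tensors; not mergekit's code.
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between v0 and v1 with factor t in [0, 1]."""
    v0_dir = v0 / (np.linalg.norm(v0) + eps)
    v1_dir = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_dir, v1_dir), -1.0, 1.0)
    theta = np.arccos(dot)          # angle between the two weight directions
    if theta < eps:                 # nearly colinear: plain lerp is fine
        return (1 - t) * v0 + t * v1
    # Standard slerp weights; t=0 returns v0, t=1 returns v1.
    w0 = np.sin((1 - t) * theta) / np.sin(theta)
    w1 = np.sin(t * theta) / np.sin(theta)
    return w0 * v0 + w1 * v1
```

In mergekit slerp configs, the factor t is typically graded across layers and parameter groups (for example attention versus MLP tensors); the truncated `yaml` block under 🧩 Configuration is where those settings live for this model.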