leaderboard-pr-bot committed on
Commit 6983707
1 Parent(s): 2adbdae

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -10,7 +10,6 @@ tags:
  - finetune
  - chatml
  base_model: Qwen/Qwen2-72B-Instruct
- model_name: MaziyarPanahi/calme-2.1-qwen2-72b
  license_name: tongyi-qianwen
  license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
  pipeline_tag: text-generation
@@ -210,3 +209,17 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b"
  # Ethical Considerations
  
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.1-qwen2-72b)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |43.61|
+ |IFEval (0-Shot)    |81.63|
+ |BBH (3-Shot)       |57.33|
+ |MATH Lvl 5 (4-Shot)|36.03|
+ |GPQA (0-shot)      |17.45|
+ |MuSR (0-shot)      |20.15|
+ |MMLU-PRO (5-shot)  |49.05|
+
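
The `Avg.` row in the added table is consistent with an unweighted mean of the six benchmark scores. The PR does not say how the average is computed, so the following is only a minimal sketch of that check, assuming a plain arithmetic mean rounded to two decimals. The names `scores` and `mean_score` are illustrative and are not part of the PR or of any leaderboard tooling.

```python
# Scores copied from the table added to README.md in this PR.
scores = {
    "IFEval (0-Shot)": 81.63,
    "BBH (3-Shot)": 57.33,
    "MATH Lvl 5 (4-Shot)": 36.03,
    "GPQA (0-shot)": 17.45,
    "MuSR (0-shot)": 20.15,
    "MMLU-PRO (5-shot)": 49.05,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard average is a simple mean of the six
    benchmark values; the PR itself does not confirm this.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average:.2f}")
```

Under that assumption the computed value can be compared against the `Avg.` row (43.61) when reviewing the PR.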