leaderboard-pr-bot committed
Commit dda322f
Parent(s): 76e5b9c

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1): README.md (+21, -8)
README.md CHANGED
@@ -1,20 +1,20 @@
 ---
+language:
+- en
 license: other
-license_name: gemma-terms-of-use
-license_link: https://ai.google.dev/gemma/terms
 library_name: transformers
-base_model: google/gemma-2b
 tags:
 - trl
 - orpo
 - generated_from_trainer
+base_model: google/gemma-2b
+datasets:
+- alvarobartt/dpo-mix-7k-simplified
+license_name: gemma-terms-of-use
+license_link: https://ai.google.dev/gemma/terms
 model-index:
 - name: gemma-2b-orpo
   results: []
-datasets:
-- alvarobartt/dpo-mix-7k-simplified
-language:
-- en
 ---
 
 <img src="./assets/gemma-2b-orpo.png" width="450"></img>
@@ -77,4 +77,17 @@ The model was trained using HF TRL.
 - Transformers 4.39.1
 - Pytorch 2.2.0+cu121
 - Datasets 2.18.0
-- Tokenizers 0.15.2
+- Tokenizers 0.15.2
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_anakin87__gemma-2b-orpo)
+
+| Metric |Value|
+|---------------------------------|----:|
+|Avg. |47.35|
+|AI2 Reasoning Challenge (25-Shot)|49.15|
+|HellaSwag (10-Shot) |73.72|
+|MMLU (5-Shot) |38.52|
+|TruthfulQA (0-shot) |44.53|
+|Winogrande (5-shot) |64.33|
+|GSM8k (5-shot) |13.87|
+
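For reference, the Avg. row in the added table is the plain arithmetic mean of the six benchmark scores. A minimal sketch verifying this (the dictionary is just the table above restated; it is not part of the PR itself):

```python
# Benchmark scores from the evaluation-results table added in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 49.15,
    "HellaSwag (10-Shot)": 73.72,
    "MMLU (5-Shot)": 38.52,
    "TruthfulQA (0-shot)": 44.53,
    "Winogrande (5-shot)": 64.33,
    "GSM8k (5-shot)": 13.87,
}

# The leaderboard's "Avg." is the unweighted mean of the six benchmarks,
# rounded to two decimal places.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 47.35
```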