10nates committed on
Commit
09a4df6
1 Parent(s): f0e0995

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +127 -6
README.md CHANGED
@@ -1,22 +1,130 @@
1
  ---
2
- license: apache-2.0
3
  language:
4
  - en
5
  datasets:
6
  - togethercomputer/RedPajama-Data-1T
7
  - OpenAssistant/oasst1
8
  - databricks/databricks-dolly-15k
9
  widget:
10
- - text: "<human>: Write an email to my friends inviting them to come to my home on Friday for a dinner party, bring their own food to share.\n<bot>:"
11
- example_title: "Email Writing"
12
- - text: "<human>: Create a list of things to do in San Francisco\n<bot>:"
13
- example_title: "Brainstorming"
14
  inference:
15
  parameters:
16
  temperature: 0.7
17
  top_p: 0.7
18
  top_k: 50
19
  max_new_tokens: 128
20
  ---
21
 
22
  # RedPajama-INCITE-Chat-3B-v1
@@ -208,4 +316,17 @@ Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/data
208
 
209
  ## Community
210
 
211
- Join us on [Together Discord](https://discord.gg/6ZVDU8tTD4)
1
  ---
2
  language:
3
  - en
4
+ license: apache-2.0
5
  datasets:
6
  - togethercomputer/RedPajama-Data-1T
7
  - OpenAssistant/oasst1
8
  - databricks/databricks-dolly-15k
9
  widget:
10
+ - text: '<human>: Write an email to my friends inviting them to come to my home on
11
+ Friday for a dinner party, bring their own food to share.
12
+
13
+ <bot>:'
14
+ example_title: Email Writing
15
+ - text: '<human>: Create a list of things to do in San Francisco
16
+
17
+ <bot>:'
18
+ example_title: Brainstorming
19
  inference:
20
  parameters:
21
  temperature: 0.7
22
  top_p: 0.7
23
  top_k: 50
24
  max_new_tokens: 128
25
+ model-index:
26
+ - name: RedPajama-INCITE-Chat-3B-v1
27
+ results:
28
+ - task:
29
+ type: text-generation
30
+ name: Text Generation
31
+ dataset:
32
+ name: AI2 Reasoning Challenge (25-Shot)
33
+ type: ai2_arc
34
+ config: ARC-Challenge
35
+ split: test
36
+ args:
37
+ num_few_shot: 25
38
+ metrics:
39
+ - type: acc_norm
40
+ value: 42.83
41
+ name: normalized accuracy
42
+ source:
43
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
44
+ name: Open LLM Leaderboard
45
+ - task:
46
+ type: text-generation
47
+ name: Text Generation
48
+ dataset:
49
+ name: HellaSwag (10-Shot)
50
+ type: hellaswag
51
+ split: validation
52
+ args:
53
+ num_few_shot: 10
54
+ metrics:
55
+ - type: acc_norm
56
+ value: 67.62
57
+ name: normalized accuracy
58
+ source:
59
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
60
+ name: Open LLM Leaderboard
61
+ - task:
62
+ type: text-generation
63
+ name: Text Generation
64
+ dataset:
65
+ name: MMLU (5-Shot)
66
+ type: cais/mmlu
67
+ config: all
68
+ split: test
69
+ args:
70
+ num_few_shot: 5
71
+ metrics:
72
+ - type: acc
73
+ value: 26.23
74
+ name: accuracy
75
+ source:
76
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
77
+ name: Open LLM Leaderboard
78
+ - task:
79
+ type: text-generation
80
+ name: Text Generation
81
+ dataset:
82
+ name: TruthfulQA (0-shot)
83
+ type: truthful_qa
84
+ config: multiple_choice
85
+ split: validation
86
+ args:
87
+ num_few_shot: 0
88
+ metrics:
89
+ - type: mc2
90
+ value: 34.44
91
+ source:
92
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
93
+ name: Open LLM Leaderboard
94
+ - task:
95
+ type: text-generation
96
+ name: Text Generation
97
+ dataset:
98
+ name: Winogrande (5-shot)
99
+ type: winogrande
100
+ config: winogrande_xl
101
+ split: validation
102
+ args:
103
+ num_few_shot: 5
104
+ metrics:
105
+ - type: acc
106
+ value: 65.51
107
+ name: accuracy
108
+ source:
109
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
110
+ name: Open LLM Leaderboard
111
+ - task:
112
+ type: text-generation
113
+ name: Text Generation
114
+ dataset:
115
+ name: GSM8k (5-shot)
116
+ type: gsm8k
117
+ config: main
118
+ split: test
119
+ args:
120
+ num_few_shot: 5
121
+ metrics:
122
+ - type: acc
123
+ value: 0.53
124
+ name: accuracy
125
+ source:
126
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=togethercomputer/RedPajama-INCITE-Chat-3B-v1
127
+ name: Open LLM Leaderboard
128
  ---
129
 
130
  # RedPajama-INCITE-Chat-3B-v1
 
316
 
317
  ## Community
318
 
319
+ Join us on [Together Discord](https://discord.gg/6ZVDU8tTD4)
320
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
321
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_togethercomputer__RedPajama-INCITE-Chat-3B-v1)
322
+
323
+ | Metric |Value|
324
+ |---------------------------------|----:|
325
+ |Avg. |39.53|
326
+ |AI2 Reasoning Challenge (25-Shot)|42.83|
327
+ |HellaSwag (10-Shot) |67.62|
328
+ |MMLU (5-Shot) |26.23|
329
+ |TruthfulQA (0-shot) |34.44|
330
+ |Winogrande (5-shot) |65.51|
331
+ |GSM8k (5-shot) | 0.53|
332
+
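
As a quick sanity check on the table added by this PR, the `Avg.` row appears to be the unweighted mean of the six benchmark scores. The PR does not state the formula, so the sketch below treats that as an assumption; the helper name `leaderboard_average` is hypothetical and is not part of any leaderboard tooling.

```python
# Scores copied from the evaluation table added in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 42.83,
    "HellaSwag (10-Shot)": 67.62,
    "MMLU (5-Shot)": 26.23,
    "TruthfulQA (0-shot)": 34.44,
    "Winogrande (5-shot)": 65.51,
    "GSM8k (5-shot)": 0.53,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean rounded to two decimals.

    Assumption: the table's "Avg." row is computed this way; the PR
    itself does not document the formula.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")
```

Under that assumption the result matches the 39.53 reported in the `Avg.` row, which suggests the six per-benchmark values in the `model-index` metadata and the table are consistent with each other.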