chrisociepa committed
Commit d2ae294
1 Parent(s): 276bc97

Update README.md

Files changed (1): README.md +22 -7
README.md CHANGED
@@ -21,6 +21,15 @@ and more precisely, the HPC center: ACK Cyfronet AGH. The creation and training
 enabling the use of cutting-edge technology and computational resources essential for large-scale machine learning processes. As a result, the model exhibits an exceptional ability to understand and process the Polish language,
 providing accurate responses and performing a variety of linguistic tasks with high precision.
 
 ## Model
 
 Bielik-11B-v2 has been trained with [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) using different parallelization techniques.
@@ -83,12 +92,13 @@ Generated output:
 
 ## Evaluation
 
-
-Models have been evaluated on [Open PL LLM Leaderboard](https://huggingface.co/spaces/speakleash/open_pl_llm_leaderboard) 5-shot. The benchmark evaluates models in NLP tasks like sentiment analysis, categorization, text classification but does not test chatting skills. Average column is an average score among all tasks normalized by baseline scores.
 
 ### Open PL LLM Leaderboard
 
-| Model | Parameters | Average |
 |------------------------|------------|---------|
 | Qwen2-72B | 72 | 65.76 |
 | Meta-Llama-3-70B | 70 | 60.87 |
@@ -114,10 +124,10 @@ Other Polish models listed include Qra-13b and Qra-7b, scoring 33.71 and 16.09 r
 
 Additionally, the Bielik-11B-v2 was initialized from the weights of Mistral-7B-v0.2, which itself scored 37.20, further demonstrating the effective enhancements incorporated into the Bielik-11B-v2 model.
 
-
-
 ### Open LLM Leaderboard
 
 | Model | AVG | arc_challenge | hellaswag | truthfulqa_mc2 | mmlu | winogrande | gsm8k |
 |-------------------------|-------|---------------|-----------|----------------|-------|------------|-------|
 | **Bielik-11B-v2** | **65.87** | 60.58 | 79.84 | 46.13 | 63.06 | 77.82 | 67.78 |
@@ -149,7 +159,7 @@ The model is licensed under Apache 2.0, which allows for commercial use.
 Please cite this model using the following format:
 
 ```
-@misc{Bielik7Bv01,
 title = {Bielik-11B-v2 model card},
 author = {Ociepa, Krzysztof and Flis, Łukasz and Wróbel, Krzysztof and Gwoździej, Adrian and {SpeakLeash Team} and {Cyfronet Team}},
 year = {2024},
@@ -157,6 +167,11 @@ Please cite this model using the following format:
 note = {Accessed: 2024-08-28},
 urldate = {2024-08-28}
 }
 ```
 
 ## Responsible for training the model
@@ -181,4 +196,4 @@ Members of the ACK Cyfronet AGH team providing valuable support and expertise:
 
 ## Contact Us
 
-If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our [Discord SpeakLeash](https://discord.com/invite/TunEeCTw).
 
 enabling the use of cutting-edge technology and computational resources essential for large-scale machine learning processes. As a result, the model exhibits an exceptional ability to understand and process the Polish language,
 providing accurate responses and performing a variety of linguistic tasks with high precision.
 
+⚠️ This is a base model intended for further fine-tuning across most use cases. If you're looking for a model ready for chatting or following instructions out of the box, please use [Bielik-11B-v2.2-Instruct](https://huggingface.co/speakleash/Bielik-11B-v2.2-Instruct).
+
+🎥 Demo: https://chat.bielik.ai
+
+🗣️ Chat Arena<span style="color:red;">*</span>: https://arena.speakleash.org.pl/
+
+<span style="color:red;">*</span>Chat Arena is a platform for testing and comparing different AI language models, allowing users to evaluate their performance and quality.
+
 ## Model
 
 Bielik-11B-v2 has been trained with [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) using different parallelization techniques.
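The column-wise tensor parallelism that Megatron-LM applies to linear layers can be sketched in miniature; this is an illustrative toy in pure Python, not Bielik's actual training code, with made-up matrices and a simulated set of "ranks":

```python
# Illustrative sketch of Megatron-LM-style column tensor parallelism:
# a linear layer's weight matrix is split into column shards, one per
# hypothetical rank; each rank computes its slice of the output, and the
# slices are concatenated to reproduce the unsharded result.

def matmul(x, w):
    """Plain matrix multiply for matrices stored as lists of rows."""
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
        for row in x
    ]

def split_columns(w, parts):
    """Split matrix w into `parts` equal column shards."""
    width = len(w[0]) // parts
    return [
        [row[i * width:(i + 1) * width] for row in w] for i in range(parts)
    ]

x = [[1.0, 2.0], [3.0, 4.0]]                       # activations (2x2)
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 2.0]]   # full weights (2x4)

shards = split_columns(w, 2)                   # two tensor-parallel "ranks"
partials = [matmul(x, s) for s in shards]      # each rank's local matmul
parallel = [sum(rows, []) for rows in zip(*partials)]  # concat output slices

assert parallel == matmul(x, w)                # sharded == unsharded
```

In real Megatron-LM training the shards live on different GPUs and the concatenation is a collective communication step; the algebra is the same.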
 
 
 ## Evaluation
 
+Models have been evaluated on two leaderboards: [Open PL LLM Leaderboard](https://huggingface.co/spaces/speakleash/open_pl_llm_leaderboard) and [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). The Open PL LLM Leaderboard uses a 5-shot evaluation and focuses on NLP tasks in Polish, while the Open LLM Leaderboard evaluates models on various English-language tasks.
 
 ### Open PL LLM Leaderboard
 
+The benchmark evaluates models on NLP tasks such as sentiment analysis, categorization, and text classification, but does not test chatting skills. The Average column is the average score across all tasks, normalized by baseline scores.
+
+| Model | Parameters (B) | Average |
 |------------------------|------------|---------|
 | Qwen2-72B | 72 | 65.76 |
 | Meta-Llama-3-70B | 70 | 60.87 |
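A baseline-normalized average of the kind described above can be sketched as follows. The leaderboard's exact formula is not given in this card; the rescaling used here (baseline maps to 0, a perfect score to 100) and all the numbers are assumptions for illustration:

```python
# Hypothetical sketch of "average score among all tasks normalized by
# baseline scores". Assumed rescaling: 100 * (score - baseline) / (100 - baseline).

def normalized_average(scores, baselines):
    """Mean of per-task scores rescaled so each task's baseline maps to 0."""
    normalized = [
        100.0 * (s - b) / (100.0 - b) for s, b in zip(scores, baselines)
    ]
    return sum(normalized) / len(normalized)

# Toy example with made-up per-task accuracies and random-guess baselines.
scores = [72.0, 65.0, 81.0]      # model accuracy per task (%)
baselines = [50.0, 25.0, 33.3]   # random-guess baseline per task (%)
avg = normalized_average(scores, baselines)   # ≈ 56.28
```

Normalizing this way keeps tasks with different random-guess floors (e.g. binary vs. multi-class) comparable before averaging.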
 
 
 Additionally, the Bielik-11B-v2 was initialized from the weights of Mistral-7B-v0.2, which itself scored 37.20, further demonstrating the effective enhancements incorporated into the Bielik-11B-v2 model.
 
 ### Open LLM Leaderboard
 
+The Open LLM Leaderboard evaluates models on various English-language tasks, providing insights into the model's performance across different linguistic challenges.
+
 | Model | AVG | arc_challenge | hellaswag | truthfulqa_mc2 | mmlu | winogrande | gsm8k |
 |-------------------------|-------|---------------|-----------|----------------|-------|------------|-------|
 | **Bielik-11B-v2** | **65.87** | 60.58 | 79.84 | 46.13 | 63.06 | 77.82 | 67.78 |
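The AVG column can be cross-checked against the per-task scores in the table; this quick check assumes AVG is the plain arithmetic mean of the six reported benchmarks:

```python
# Cross-check (assumption: the leaderboard AVG is the unweighted mean of
# the six task scores) of Bielik-11B-v2's row from the table above.
bielik_scores = {
    "arc_challenge": 60.58,
    "hellaswag": 79.84,
    "truthfulqa_mc2": 46.13,
    "mmlu": 63.06,
    "winogrande": 77.82,
    "gsm8k": 67.78,
}
avg = sum(bielik_scores.values()) / len(bielik_scores)
assert round(avg, 2) == 65.87   # matches the reported AVG
```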
 
 Please cite this model using the following format:
 
 ```
+@misc{Bielik11Bv2b,
 title = {Bielik-11B-v2 model card},
 author = {Ociepa, Krzysztof and Flis, Łukasz and Wróbel, Krzysztof and Gwoździej, Adrian and {SpeakLeash Team} and {Cyfronet Team}},
 year = {2024},
 
 note = {Accessed: 2024-08-28},
 urldate = {2024-08-28}
 }
+@unpublished{Bielik11Bv2a,
+author = {Ociepa, Krzysztof and Flis, Łukasz and Kinas, Remigiusz and Gwoździej, Adrian and Wróbel, Krzysztof},
+title = {Bielik: A Family of Large Language Models for the Polish Language – Development, Insights, and Evaluation},
+year = {2024},
+}
 ```
 
 ## Responsible for training the model
 
 
 ## Contact Us
 
+If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our [Discord SpeakLeash](https://discord.com/invite/TunEeCTw).