enabling the use of cutting-edge technology and computational resources essential for large-scale machine learning processes. As a result, the model exhibits an exceptional ability to understand and process the Polish language, providing accurate responses and performing a variety of linguistic tasks with high precision.

⚠️ This is a base model intended for further fine-tuning in most use cases. If you are looking for a model ready for chatting or following instructions out of the box, please use [Bielik-11B-v2.2-Instruct](https://huggingface.co/speakleash/Bielik-11B-v2.2-Instruct).

🎥 Demo: https://chat.bielik.ai

🗣️ Chat Arena<span style="color:red;">*</span>: https://arena.speakleash.org.pl/

<span style="color:red;">*</span>Chat Arena is a platform for testing and comparing different AI language models, allowing users to evaluate their performance and quality.
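For readers who want to try the base model directly, below is a minimal sketch of loading it for plain text completion with the Hugging Face `transformers` library. The repository id, dtype, and prompt are illustrative assumptions, not part of this card.

```python
# A minimal sketch, assuming the standard transformers text-generation API and
# the hypothetical repo id "speakleash/Bielik-11B-v2" for this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "speakleash/Bielik-11B-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware is available
    device_map="auto",
)

# A base model continues text; it is not tuned to follow chat-style prompts.
prompt = "Najwyższym szczytem w Polsce jest"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```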
## Model

Bielik-11B-v2 has been trained with [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) using different parallelization techniques.
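As a rough intuition for one such technique, the toy below illustrates tensor (column) parallelism on a single weight matrix in plain PyTorch. It is a conceptual sketch with hypothetical sizes, not Megatron-LM code.

```python
# Conceptual toy of tensor (column) parallelism: each "device" holds one
# column shard of the weight matrix and computes its slice of the output.
import torch

hidden, world_size = 512, 4                 # hypothetical sizes
x = torch.randn(1, hidden)                  # activations for one token
w = torch.randn(hidden, 4 * hidden)         # full feed-forward weight matrix

shards = torch.chunk(w, world_size, dim=1)  # one column shard per device
partials = [x @ shard for shard in shards]  # each device computes independently
y = torch.cat(partials, dim=1)              # gather the shard outputs

# The sharded computation matches the unsharded matmul.
assert torch.allclose(y, x @ w, atol=1e-4)
```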
## Evaluation

Models have been evaluated on two leaderboards: [Open PL LLM Leaderboard](https://huggingface.co/spaces/speakleash/open_pl_llm_leaderboard) and [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). The Open PL LLM Leaderboard uses a 5-shot evaluation and focuses on NLP tasks in Polish, while the Open LLM Leaderboard evaluates models on a range of English-language tasks.
### Open PL LLM Leaderboard

The benchmark evaluates models on NLP tasks such as sentiment analysis, categorization, and text classification, but does not test chat skills. The Average column is the mean score across all tasks, normalized by baseline scores.
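The leaderboard's exact normalization is not reproduced here; the snippet below only sketches the common scheme in which a task's random baseline maps to 0 and a perfect score to 100.

```python
# Illustrative baseline normalization (assumed formula, not necessarily the
# leaderboard's exact implementation).
def normalize(score: float, baseline: float, maximum: float = 100.0) -> float:
    """Rescale so that `baseline` maps to 0 and `maximum` maps to 100."""
    return (score - baseline) / (maximum - baseline) * 100.0

# Example: 80.0 raw accuracy on a 4-way classification task (25.0 baseline).
print(round(normalize(80.0, 25.0), 2))  # 73.33
```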
| Model            | Parameters (B) | Average |
|------------------|----------------|---------|
| Qwen2-72B        | 72             | 65.76   |
| Meta-Llama-3-70B | 70             | 60.87   |

Additionally, Bielik-11B-v2 was initialized from the weights of Mistral-7B-v0.2, which itself scored 37.20, further demonstrating the effectiveness of the enhancements incorporated into Bielik-11B-v2.
### Open LLM Leaderboard

The Open LLM Leaderboard evaluates models on a variety of English-language tasks, providing insight into the model's performance across different linguistic challenges.
| Model             | AVG       | arc_challenge | hellaswag | truthfulqa_mc2 | mmlu  | winogrande | gsm8k |
|-------------------|-----------|---------------|-----------|----------------|-------|------------|-------|
| **Bielik-11B-v2** | **65.87** | 60.58         | 79.84     | 46.13          | 63.06 | 77.82      | 67.78 |
Please cite this model using the following format:

```
@misc{Bielik11Bv2b,
    title   = {Bielik-11B-v2 model card},
    author  = {Ociepa, Krzysztof and Flis, Łukasz and Wróbel, Krzysztof and Gwoździej, Adrian and {SpeakLeash Team} and {Cyfronet Team}},
    year    = {2024},
    note    = {Accessed: 2024-08-28},
    urldate = {2024-08-28}
}

@unpublished{Bielik11Bv2a,
    author = {Ociepa, Krzysztof and Flis, Łukasz and Kinas, Remigiusz and Gwoździej, Adrian and Wróbel, Krzysztof},
    title  = {Bielik: A Family of Large Language Models for the Polish Language – Development, Insights, and Evaluation},
    year   = {2024}
}
```
## Responsible for training the model

## Contact Us

If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our [Discord SpeakLeash](https://discord.com/invite/TunEeCTw).