chrisociepa committed

Commit b0f51a9 · verified · 1 Parent(s): 835a197

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -221,7 +221,7 @@ Bielik-11B-v2.2-Instruct shows impressive performance on English language tasks:
 These results demonstrate Bielik-11B-v2.2-Instruct's versatility in both Polish and English, highlighting the effectiveness of its instruction tuning process.
 
 ### Polish MT-Bench
-The Bielik-11B-v.2.2-Instruct (16 bit) model was also evaluated using the MT-Bench benchmark. The quality of the model was evaluated using the English version (original version without modifications) and the Polish version created by Speakleash (tasks and evaluation in Polish, the content of the tasks was also changed to take into account the context of the Polish language).
+The Bielik-11B-v2.2-Instruct (16 bit) model was also evaluated using the MT-Bench benchmark. The quality of the model was evaluated using the English version (original version without modifications) and the Polish version created by Speakleash (tasks and evaluation in Polish, the content of the tasks was also changed to take into account the context of the Polish language).
 
 #### MT-Bench English
 | Model | Score |
@@ -255,7 +255,7 @@ The Bielik-11B-v.2.2-Instruct (16 bit) model was also evaluated using the MT-Ben
 
 Key observations on Bielik-11B-v2.2 performance:
 
-1. Strong performance among mid-sized models: Bielik-11B-v2.2-Instruct scored **8.115625**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.2 is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
+1. Strong performance among mid-sized models: Bielik-11B-v2.2-Instruct scored **8.115625**, placing it ahead of several well-known models like GPT-3.5-turbo (7.868750) and Mixtral-8x7b (7.637500). This indicates that Bielik-11B-v2.2-Instruct is competitive among mid-sized models, particularly those in the 11B-70B parameter range.
 
 2. Competitive against larger models: Bielik-11B-v2.2-Instruct performs close to Meta-Llama-3.1-70B-Instruct (8.150000), Meta-Llama-3.1-405B-Instruct (8.168750) and even Mixtral-8x22b (8.231250), which have significantly more parameters. This efficiency relative to size could make it an attractive option for tasks where resource constraints are a consideration. Bielik generated 100% of its answers in Polish, while other models (not typically trained on Polish) may answer Polish questions in English.
 
@@ -309,7 +309,7 @@ This benchmark provides a robust and time-efficient method for assessing LLM per
 | Model | MixEval | MixEval-Hard |
 |-------------------------------|---------|--------------|
 | Bielik-11B-v2.1-Instruct | 74.55 | 45.00 |
-| **Bielik-11B-v2.2-Instruct** | 72.35 | 39.65 |
+| **Bielik-11B-v2.2-Instruct** | **72.35** | **39.65** |
 | Bielik-11B-v2.0-Instruct | 72.10 | 40.20 |
 | Mistral-7B-Instruct-v0.2 | 70.00 | 36.20 |
 
 
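For the single-number MT-Bench scores quoted above (e.g. **8.115625**), the standard protocol averages 1-10 ratings from a GPT-4 judge over 80 questions × 2 turns. Below is a minimal aggregation sketch, assuming FastChat-style single-judgment JSONL records with `model` and `score` fields; the filename and model ID are hypothetical.

```python
# Sketch: average per-turn MT-Bench judge ratings into one score.
# Assumes a FastChat-style JSONL file where each record has "model" and a
# numeric "score" (one record per question turn); -1 marks failed judgments.
import json

def mt_bench_score(path: str, model: str) -> float:
    scores = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record.get("model") == model and record.get("score", -1) >= 0:
                scores.append(record["score"])
    return sum(scores) / len(scores)

# Hypothetical judgment file and model ID for illustration.
print(mt_bench_score("gpt-4_single.jsonl", "bielik-11b-v2.2-instruct"))
```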