amr-mohamed committed on
Commit
4626cbb
1 Parent(s): 1ed7b0f

Update README.md

Files changed (1)
  1. README.md +11 -451
README.md CHANGED
@@ -311,446 +311,6 @@ Our training dataset [Darija-SFT-Mixture](https://huggingface.co/datasets/MBZUAI
311
  Atlas-Chat models are based on Gemma 2 models. The Atlas-Chat models were trained using 8 Nvidia's A100 80 GB GPUs in parallel using FSDP on AWS Sagemaker. The model is trained using HuggingFace transformers and parameter-efficient fine-tuning with LoRA rank of 256.
312
 
313
 
314
- <!--
315
- ## Evaluation
316
- The Atlas-Chat models were evaluated on a comprehensive suite of tasks using various datasets and benchmarks to assess their performance across multiple dimensions. These included tasks such as:
317
-
318
- * **DarijaMMLU:** A Darija version of ArabicMMLU and MMLU benchmarks translated from MSA and English respectively.
319
- * **DarijaHellaSwag:** A Darija version of HellaSwag.
320
- * **Belebele Ary_Arab:** Belebele is a multiple-choice machine reading comprehension dataset published by Facebook spanning 122 language variants. The Evaluation is done on the Ary_Arab part of Belebele that refers to Darija.
321
- * **Sentiment Analysis.**
322
- * **Translation:** Including six directions and four languages: Darija, MSA, English and French.
323
- * **Transliteration:** Transforming a sentence from Darija (written in Arabic characters) to Arabizi (Written in Latin characters) and vice-versa.
324
- * **Summarization.**
325
-
326
- The models were compared against a collection of existing open-source Arabic models to gauge their effectiveness, with a particular focus on performance in Darija. All scores are based on zero-shot performance. The prompts are written mainly in Darija. The metric used for DarijaMMLU, DarijaHellaSwag, Belebele Ary and Sentiment Analysis is the normalized accuracy. We used [Language Model Evaluation Harness](https://github.com/MBZUAI-Paris/lm-evaluation-harness-atlas-chat) to conduct these evaluations.
327
-
328
-
329
- **LLMs Benchmarks:**
330
- <table>
331
- <tr>
332
- <td>Model</td>
333
- <td><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaMMLU" target="_blank">DarijaMMLU</a></td>
334
- <td><a href="MBZUAI-Paris/DarijaHellaSwag" target="_blank">DarijaHellaSwag</a></td>
335
- <td ><a href="https://huggingface.co/datasets/facebook/belebele/viewer/ary_Arab" target="_blank">Belebele Ary</a></td>
336
- </tr>
337
- <tr>
338
- <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
339
- <td>35.39</td>
340
- <td>32.51</td>
341
- <td>38.33</td>
342
- </tr>
343
- <tr>
344
- <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
345
- <td>37.44</td>
346
- <td>34.49</td>
347
- <td>44.11</td>
348
- </tr>
349
- <tr>
350
- <td><a href="https://huggingface.co/google/gemma-2-2b-it" target="_blank">gemma-2-2b-it</a></td>
351
- <td>28.58</td>
352
- <td>32.42</td>
353
- <td>25.22</td>
354
- </tr>
355
- <tr>
356
- <td><a href="meta-llama/Llama-3.2-1B-Instruct" target="_blank">Llama-3.2-1B-Instruct</a></td>
357
- <td>27.66</td>
358
- <td>26.88</td>
359
- <td>28.89</td>
360
- </tr>
361
- <tr>
362
- <td><a href="meta-llama/Llama-3.2-3B-Instruct" target="_blank">Llama-3.2-3B-Instruct</a></td>
363
- <td>32.60</td>
364
- <td>28.33</td>
365
- <td>38.00</td>
366
- </tr>
367
- <tr>
368
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
369
- <td><b>44.97</td>
370
- <td><b>41.48</td>
371
- <td><b>53.89</td>
372
- </tr>
373
- <tr style="border-top: 4px solid;"></tr>
374
- <tr>
375
- <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
376
- <td>39.96</td>
377
- <td>41.57</td>
378
- <td>51.22</td>
379
- </tr>
380
- <tr>
381
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
382
- <td>39.30</td>
383
- <td>35.19</td>
384
- <td>43.67</td>
385
- </tr>
386
- <tr>
387
- <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
388
- <td>45.11</td>
389
- <td>43.90</td>
390
- <td>58.67</td>
391
- </tr>
392
- <tr>
393
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
394
- <td>45.20</td>
395
- <td>40.65</td>
396
- <td>49.67</td>
397
- </tr>
398
- <tr>
399
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
400
- <td>35.98</td>
401
- <td>36.57</td>
402
- <td>30.11</td>
403
- </tr>
404
- <tr>
405
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-13B-chat" target="_blank">AceGPT-13b-chat</a></td>
406
- <td>41.09</td>
407
- <td>38.35</td>
408
- <td>33.11</td>
409
- </tr>
410
- <tr>
411
- <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
412
- <td>35.91</td>
413
- <td>42.43</td>
414
- <td>31.00</td>
415
- </tr>
416
- <tr>
417
- <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
418
- <td>44.13</td>
419
- <td>38.24</td>
420
- <td>47.00</td>
421
- </tr>
422
- <tr>
423
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
424
- <td><b>58.23</td>
425
- <td><b>57.75</td>
426
- <td><b>74.56</td>
427
- </tr>
428
- <tr style="border-top: 4px solid;"></tr>
429
- <tr>
430
- <td><a href="https://huggingface.co/inceptionai/jais-family-30b-8k-chat" target="_blank">jais-family-30b-8k-chat</a></td>
431
- <td>51.88</td>
432
- <td>35.61</td>
433
- <td>65.67</td>
434
- </tr>
435
- <tr>
436
- <td><a href="https://huggingface.co/google/gemma-2-27b-it" target="_blank">gemma-2-27b-it</a></td>
437
- <td>36.47</td>
438
- <td>37.04</td>
439
- <td>35.78</td>
440
- </tr>
441
- <tr>
442
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-27B" target="_blank">Atlas-Chat-27B</a></strong></td>
443
- <td><b>61.95</td>
444
- <td><b>48.37</td>
445
- <td><b>75.67</td>
446
- </tr>
447
-
448
-
449
-
450
- </table>
451
-
452
- **Standard NLP Tasks:**
453
- <table>
454
- <tr>
455
- <td rowspan="2">Model</td>
456
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">DODa-10k (Translation)</a></td>
457
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">MADAR (Translation)</a></td>
458
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">FLORES+ (Translation)</a></td>
459
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">NLLB-Seed (Translation)</a></td>
460
- <td colspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">DODa-10k (Transliteration)</a></td>
461
- <td rowspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">MArSum (Summarization)</a><br/>(LLM as a judge)</td>
462
- <td rowspan="2"><a href="https://huggingface.co/datasets/MBZUAI-Paris/DarijaBench" target="_blank">Sentiment Analysis</a></td>
463
- </tr>
464
- <tr>
465
- <td>BLEU</td>
466
- <td>chrF</td>
467
- <td>BLEU</td>
468
- <td>chrF</td>
469
- <td>BLEU</td>
470
- <td>chrF</td>
471
- <td>BLEU</td>
472
- <td>chrF</td>
473
- <td>BLEU</td>
474
- <td>chrF</td>
475
- </tr>
476
- <tr>
477
- <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
478
- <td>00.13</td>
479
- <td>06.18</td>
480
- <td>00.50</td>
481
- <td>15.43</td>
482
- <td>02.44</td>
483
- <td>19.14</td>
484
- <td>01.99</td>
485
- <td>12.60</td>
486
- <td>00.01</td>
487
- <td>03.01</td>
488
- <td>00.50</td>
489
- <td>45.29</td>
490
- </tr>
491
- <tr>
492
- <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
493
- <td>00.25</td>
494
- <td>07.46</td>
495
- <td>00.62</td>
496
- <td>16.36</td>
497
- <td>04.25</td>
498
- <td>18.22</td>
499
- <td>03.10</td>
500
- <td>08.19</td>
501
- <td>00.01</td>
502
- <td>03.27</td>
503
- <td>00.90</td>
504
- <td>51.56</td>
505
- </tr>
506
- <tr>
507
- <td><a href="https://huggingface.co/google/gemma-2-2b-it" target="_blank">gemma-2-2b-it</a></td>
508
- <td>00.10</td>
509
- <td>04.96</td>
510
- <td>00.12</td>
511
- <td>06.66</td>
512
- <td>01.55</td>
513
- <td>18.59</td>
514
- <td>02.78</td>
515
- <td>23.69</td>
516
- <td>00.01</td>
517
- <td>02.08</td>
518
- <td>06.80</td>
519
- <td>53.36</td>
520
- </tr>
521
- <tr>
522
- <td><a href="meta-llama/Llama-3.2-1B-Instruct" target="_blank">Llama-3.2-1B-Instruct</a></td>
523
- <td>00.07</td>
524
- <td>05.95</td>
525
- <td>00.80</td>
526
- <td>18.71</td>
527
- <td>04.53</td>
528
- <td>18.39</td>
529
- <td>04.52</td>
530
- <td>17.06</td>
531
- <td>00.02</td>
532
- <td>03.74</td>
533
- <td>08.23</td>
534
- <td>46.27</td>
535
- </tr>
536
- <tr>
537
- <td><a href="meta-llama/Llama-3.2-3B-Instruct" target="_blank">Llama-3.2-3B-Instruct</a></td>
538
- <td>00.62</td>
539
- <td>13.67</td>
540
- <td>01.18</td>
541
- <td>22.12</td>
542
- <td>08.59</td>
543
- <td>35.21</td>
544
- <td>13.75</td>
545
- <td>43.63</td>
546
- <td>00.21</td>
547
- <td>09.68</td>
548
- <td>08.23</td>
549
- <td>49.20</td>
550
- </tr>
551
- <tr>
552
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
553
- <td><b>22.76</td>
554
- <td><b>44.86</td>
555
- <td><b>16.67</td>
556
- <td><b>41.64</td>
557
- <td><b>14.92</td>
558
- <td><b>43.03</td>
559
- <td><b>23.88</td>
560
- <td><b>52.19</td>
561
- <td><b>08.18</td>
562
- <td><b>21.54</td>
563
- <td><b>55.22</td>
564
- <td><b>73.99</td>
565
- </tr>
566
- <tr style="border-top: 4px solid;"></tr>
567
- <tr>
568
- <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
569
- <td>00.73</td>
570
- <td>11.85</td>
571
- <td>01.88</td>
572
- <td>23.22</td>
573
- <td>04.25</td>
574
- <td>18.22</td>
575
- <td>04.62</td>
576
- <td>20.22</td>
577
- <td>00.02</td>
578
- <td>03.79</td>
579
- <td>03.02</td>
580
- <td>56.78</td>
581
- </tr>
582
- <tr>
583
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
584
- <td>00.60</td>
585
- <td>09.43</td>
586
- <td>03.45</td>
587
- <td>25.88</td>
588
- <td>07.25</td>
589
- <td>23.21</td>
590
- <td>01.25</td>
591
- <td>02.22</td>
592
- <td>00.04</td>
593
- <td>03.24</td>
594
- <td>02.82</td>
595
- <td>52.72</td>
596
- </tr>
597
- <tr>
598
- <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
599
- <td>00.92</td>
600
- <td>11.71</td>
601
- <td>04.01</td>
602
- <td>28.48</td>
603
- <td>05.70</td>
604
- <td>27.24</td>
605
- <td>04.50</td>
606
- <td>22.56</td>
607
- <td>00.03</td>
608
- <td>03.57</td>
609
- <td>01.77</td>
610
- <td>41.73</td>
611
- </tr>
612
- <tr>
613
- <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
614
- <td>00.87</td>
615
- <td>10.52</td>
616
- <td>04.02</td>
617
- <td>25.29</td>
618
- <td>06.66</td>
619
- <td>23.46</td>
620
- <td>20.14</td>
621
- <td>47.87</td>
622
- <td>0.04</td>
623
- <td>04.77</td>
624
- <td>01.92</td>
625
- <td>66.68</td>
626
- </tr>
627
- <tr>
628
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
629
- <td>00.44</td>
630
- <td>11.33</td>
631
- <td>01.05</td>
632
- <td>19.24</td>
633
- <td>06.92</td>
634
- <td>36.03</td>
635
- <td>11.05</td>
636
- <td>44.55</td>
637
- <td>00.06</td>
638
- <td>04.74</td>
639
- <td>02.28</td>
640
- <td>40.23</td>
641
- </tr>
642
- <tr>
643
- <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-13B-chat" target="_blank">AceGPT-13b-chat</a></td>
644
- <td>00.98</td>
645
- <td>16.70</td>
646
- <td>00.81</td>
647
- <td>20.23</td>
648
- <td>08.73</td>
649
- <td>40.76</td>
650
- <td>14.02</td>
651
- <td>48.28</td>
652
- <td>00.12</td>
653
- <td>06.32</td>
654
- <td>02.80</td>
655
- <td>59.58</td>
656
- </tr>
657
- <tr>
658
- <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
659
- <td>03.10</td>
660
- <td>19.16</td>
661
- <td>01.72</td>
662
- <td>24.35</td>
663
- <td>05.18</td>
664
- <td>36.96</td>
665
- <td>08.23</td>
666
- <td>43.57</td>
667
- <td>00.17</td>
668
- <td>09.14</td>
669
- <td>13.81</td>
670
- <td>59.87</td>
671
- </tr>
672
- <tr>
673
- <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
674
- <td>00.92</td>
675
- <td>14.19</td>
676
- <td>01.46</td>
677
- <td>23.82</td>
678
- <td>08.89</td>
679
- <td>33.08</td>
680
- <td>11.85</td>
681
- <td>35.51</td>
682
- <td>00.11</td>
683
- <td>06.02</td>
684
- <td>01.28</td>
685
- <td>44.08</td>
686
- </tr>
687
- <tr>
688
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
689
- <td><b>28.08</td>
690
- <td><b>50.48</td>
691
- <td><b>18.16</td>
692
- <td><b>43.91</td>
693
- <td><b>18.63</td>
694
- <td><b>47.53</td>
695
- <td><b>29.98</td>
696
- <td><b>58.26</td>
697
- <td><b>22.08</td>
698
- <td><b>34.17</td>
699
- <td><b>59.76</td>
700
- <td><b>81.89</td>
701
- </tr>
702
- <tr style="border-top: 4px solid;"></tr>
703
- <tr>
704
- <td><a href="https://huggingface.co/inceptionai/jais-family-30b-8k-chat" target="_blank">jais-family-30b-8k-chat</a></td>
705
- <td>01.10</td>
706
- <td>14.40</td>
707
- <td>01.67</td>
708
- <td>23.37</td>
709
- <td>08.52</td>
710
- <td>35.41</td>
711
- <td>13.71</td>
712
- <td>41.33</td>
713
- <td>00.05</td>
714
- <td>04.48</td>
715
- <td>00.46</td>
716
- <td>56.73</td>
717
- </tr>
718
- <tr>
719
- <td><a href="https://huggingface.co/google/gemma-2-27b-it" target="_blank">gemma-2-27b-it</a></td>
720
- <td>00.67</td>
721
- <td>13.04</td>
722
- <td>01.74</td>
723
- <td>24.63</td>
724
- <td>05.17</td>
725
- <td>37.08</td>
726
- <td>07.36</td>
727
- <td>42.49</td>
728
- <td>00.03</td>
729
- <td>04.94</td>
730
- <td>11.10</td>
731
- <td>57.59</td>
732
- </tr>
733
- <tr>
734
- <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-27B" target="_blank">Atlas-Chat-27B</a></strong></td>
735
- <td><b>29.55</td>
736
- <td><b>51.74</td>
737
- <td><b>19.66</td>
738
- <td><b>45.65</td>
739
- <td><b>20.34</td>
740
- <td><b>49.19</td>
741
- <td><b>31.61</td>
742
- <td><b>59.37</td>
743
- <td><b>33.03</td>
744
- <td><b>40.95</td>
745
- <td><b>60.70</td>
746
- <td>73.00</td>
747
- </tr>
748
-
749
-
750
-
751
- </table>
752
- -->
753
-
754
  ## Evaluation
755
  The Atlas-Chat models were evaluated on a comprehensive suite of tasks using various datasets and benchmarks to assess their performance across multiple dimensions. These included tasks such as:
756
 
@@ -777,14 +337,14 @@ The models were compared against a collection of existing open-source Arabic mod
777
  <tr>
778
  <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
779
  <td>35.39</td>
780
- <td>32.51</td>
781
  <td>38.33</td>
782
  <td>35.56</td>
783
  </tr>
784
  <tr>
785
  <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
786
  <td>37.44</td>
787
- <td>34.49</td>
788
  <td>44.11</td>
789
  <td>52.97</td>
790
  </tr>
@@ -812,7 +372,7 @@ The models were compared against a collection of existing open-source Arabic mod
812
  <tr>
813
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
814
  <td><b>44.97</b></td>
815
- <td><b>41.48</b></td>
816
  <td><b>53.89</b></td>
817
  <td><b>92.31</b></td>
818
  </tr>
@@ -820,35 +380,35 @@ The models were compared against a collection of existing open-source Arabic mod
820
  <tr>
821
  <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
822
  <td>39.96</td>
823
- <td>41.57</td>
824
  <td>51.22</td>
825
  <td>65.18</td>
826
  </tr>
827
  <tr>
828
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
829
  <td>39.30</td>
830
- <td>35.19</td>
831
  <td>43.67</td>
832
  <td>61.84</td>
833
  </tr>
834
  <tr>
835
  <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
836
  <td>45.11</td>
837
- <td>43.90</td>
838
  <td>58.67</td>
839
  <td>69.93</td>
840
  </tr>
841
  <tr>
842
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
843
  <td>45.20</td>
844
- <td>40.65</td>
845
  <td>49.67</td>
846
  <td>77.52</td>
847
  </tr>
848
  <tr>
849
  <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
850
  <td>35.98</td>
851
- <td>36.57</td>
852
  <td>30.11</td>
853
  <td>47.31</td>
854
  </tr>
@@ -862,21 +422,21 @@ The models were compared against a collection of existing open-source Arabic mod
862
  <tr>
863
  <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
864
  <td>35.91</td>
865
- <td>42.43</td>
866
  <td>31.00</td>
867
  <td>90.86</td>
868
  </tr>
869
  <tr>
870
  <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
871
  <td>44.13</td>
872
- <td>38.24</td>
873
  <td>47.00</td>
874
  <td>78.08</td>
875
  </tr>
876
  <tr>
877
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
878
  <td><b>58.23</b></td>
879
- <td><b>57.75</b></td>
880
  <td><b>74.56</b></td>
881
  <td><b>95.62</b></td>
882
  </tr>
 
311
  Atlas-Chat models are based on Gemma 2 models. The Atlas-Chat models were trained using 8 Nvidia's A100 80 GB GPUs in parallel using FSDP on AWS Sagemaker. The model is trained using HuggingFace transformers and parameter-efficient fine-tuning with LoRA rank of 256.
312
 
313
 
314
  ## Evaluation
315
  The Atlas-Chat models were evaluated on a comprehensive suite of tasks using various datasets and benchmarks to assess their performance across multiple dimensions. These included tasks such as:
316
 
 
337
  <tr>
338
  <td><a href="https://huggingface.co/inceptionai/jais-family-1p3b-chat" target="_blank">jais-family-1p3b-chat</a></td>
339
  <td>35.39</td>
340
+ <td>27.71</td>
341
  <td>38.33</td>
342
  <td>35.56</td>
343
  </tr>
344
  <tr>
345
  <td><a href="https://huggingface.co/inceptionai/jais-family-2p7b-chat" target="_blank">jais-family-2p7b-chat</a></td>
346
  <td>37.44</td>
347
+ <td>29.10</td>
348
  <td>44.11</td>
349
  <td>52.97</td>
350
  </tr>
 
372
  <tr>
373
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-2B" target="_blank">Atlas-Chat-2B</a></strong></td>
374
  <td><b>44.97</b></td>
375
+ <td><b>35.08</b></td>
376
  <td><b>53.89</b></td>
377
  <td><b>92.31</b></td>
378
  </tr>
 
380
  <tr>
381
  <td><a href="https://huggingface.co/inceptionai/jais-family-6p7b-chat" target="_blank">jais-family-6p7b-chat</a></td>
382
  <td>39.96</td>
383
+ <td>32.64</td>
384
  <td>51.22</td>
385
  <td>65.18</td>
386
  </tr>
387
  <tr>
388
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-7b-chat" target="_blank">jais-adapted-7b-chat</a></td>
389
  <td>39.30</td>
390
+ <td>29.55</td>
391
  <td>43.67</td>
392
  <td>61.84</td>
393
  </tr>
394
  <tr>
395
  <td><a href="https://huggingface.co/inceptionai/jais-family-13b-chat" target="_blank">jais-family-13b-chat</a></td>
396
  <td>45.11</td>
397
+ <td>33.98</td>
398
  <td>58.67</td>
399
  <td>69.93</td>
400
  </tr>
401
  <tr>
402
  <td><a href="https://huggingface.co/inceptionai/jais-adapted-13b-chat" target="_blank">jais-adapted-13b-chat</a></td>
403
  <td>45.20</td>
404
+ <td>32.84</td>
405
  <td>49.67</td>
406
  <td>77.52</td>
407
  </tr>
408
  <tr>
409
  <td><a href="https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat" target="_blank">AceGPT-7b-chat</a></td>
410
  <td>35.98</td>
411
+ <td>30.33</td>
412
  <td>30.11</td>
413
  <td>47.31</td>
414
  </tr>
 
422
  <tr>
423
  <td><a href="https://huggingface.co/google/gemma-2-9b-it" target="_blank">gemma-2-9b-it</a></td>
424
  <td>35.91</td>
425
+ <td>32.19</td>
426
  <td>31.00</td>
427
  <td>90.86</td>
428
  </tr>
429
  <tr>
430
  <td><a href="meta-llama/Meta-Llama-3.1-8B-Instruct" target="_blank">Llama-3.1-8B-Instruct</a></td>
431
  <td>44.13</td>
432
+ <td>31.40</td>
433
  <td>47.00</td>
434
  <td>78.08</td>
435
  </tr>
436
  <tr>
437
  <td><strong><a href="https://huggingface.co/MBZUAI-Paris/Atlas-Chat-9B" target="_blank">Atlas-Chat-9B</a></strong></td>
438
  <td><b>58.23</b></td>
439
+ <td><b>43.65</b></td>
440
  <td><b>74.56</b></td>
441
  <td><b>95.62</b></td>
442
  </tr>
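
The eleven changed table cells in this commit all sit in the second score column. A minimal sketch of the before/after comparison follows, assuming that column is DarijaHellaSwag (inferred from the column order of the removed commented-out table: Model, DarijaMMLU, DarijaHellaSwag, Belebele Ary). The names `CHANGES` and `score_deltas` are hypothetical and exist only for this illustration; the numbers are copied from the `-` and `+` lines above.

```python
# (old, new) pairs copied from the "-" and "+" lines of the diff.
# Assumption: the changed column is DarijaHellaSwag.
CHANGES = {
    "jais-family-1p3b-chat": (32.51, 27.71),
    "jais-family-2p7b-chat": (34.49, 29.10),
    "Atlas-Chat-2B": (41.48, 35.08),
    "jais-family-6p7b-chat": (41.57, 32.64),
    "jais-adapted-7b-chat": (35.19, 29.55),
    "jais-family-13b-chat": (43.90, 33.98),
    "jais-adapted-13b-chat": (40.65, 32.84),
    "AceGPT-7b-chat": (36.57, 30.33),
    "gemma-2-9b-it": (42.43, 32.19),
    "Llama-3.1-8B-Instruct": (38.24, 31.40),
    "Atlas-Chat-9B": (57.75, 43.65),
}


def score_deltas(changes):
    """Return {model: new - old}, rounded to two decimals."""
    return {
        model: round(new - old, 2)
        for model, (old, new) in changes.items()
    }


if __name__ == "__main__":
    deltas = score_deltas(CHANGES)
    # List the largest drops first.
    for model, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
        old, new = CHANGES[model]
        print(f"{model:<24} {old:6.2f} -> {new:6.2f} ({delta:+.2f})")
```

Every listed score decreases in this commit, and Atlas-Chat-9B and Atlas-Chat-2B keep the bold markup in their size groups.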