jpacifico committed on
Commit 2e7e425 · verified · 1 Parent(s): ee374ee

Update README.md

Files changed (1):
  1. README.md +1 -20
README.md CHANGED
@@ -25,33 +25,26 @@ For English usage, however, [version 1.1](https://huggingface.co/jpacifico/Choco
 
 ### MT-Bench-French
 
-Chocolatine-14B-Instruct-DPO-v1.2 is outperforming Phi-3-medium-4k-instruct and its previous version in French tasks.
-Please note that the [Chocolatine-3B](jpacifico/Chocolatine-3B-Instruct-DPO-Revised) model is very close to Phi-3-Medium in French, which is a significant achievement.
+Chocolatine-14B-Instruct-DPO-v1.2 is outperforming its base model Phi-3-medium-4k-instruct on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM-judge.
 
 ```
 ########## First turn ##########
 score
 model turn
 gpt-4o-mini 1 9.28750
-Chocolatine-14B-Instruct-4k-DPO 1 8.63750
 Chocolatine-14B-Instruct-DPO-v1.2 1 8.61250
 Phi-3-medium-4k-instruct 1 8.22500
 gpt-3.5-turbo 1 8.13750
 Chocolatine-3B-Instruct-DPO-Revised 1 7.98750
 Daredevil-8B 1 7.88750
-Daredevil-8B-abliterated 1 7.83750
-Chocolatine-3B-Instruct-DPO-v1.0 1 7.68750
 NeuralDaredevil-8B-abliterated 1 7.62500
 Phi-3-mini-4k-instruct 1 7.21250
-Meta-Llama-3-8B-Instruct 1 7.16250
 Meta-Llama-3.1-8B-Instruct 1 7.05000
 vigostral-7b-chat 1 6.78750
 Mistral-7B-Instruct-v0.3 1 6.75000
 gemma-2-2b-it 1 6.45000
-Mistral-7B-Instruct-v0.2 1 6.28750
 French-Alpaca-7B-Instruct_beta 1 5.68750
 vigogne-2-7b-chat 1 5.66250
-vigogne-2-7b-instruct 1 5.13750
 
 ########## Second turn ##########
 score
@@ -60,46 +53,34 @@ gpt-4o-mini 2 8.912500
 Chocolatine-14B-Instruct-DPO-v1.2 2 8.337500
 Chocolatine-3B-Instruct-DPO-Revised 2 7.937500
 Phi-3-medium-4k-instruct 2 7.750000
-Chocolatine-14B-Instruct-4k-DPO 2 7.737500
 gpt-3.5-turbo 2 7.679167
-Chocolatine-3B-Instruct-DPO-v1.0 2 7.612500
 NeuralDaredevil-8B-abliterated 2 7.125000
 Daredevil-8B 2 7.087500
-Daredevil-8B-abliterated 2 6.873418
-Meta-Llama-3-8B-Instruct 2 6.800000
 Meta-Llama-3.1-8B-Instruct 2 6.787500
-Mistral-7B-Instruct-v0.2 2 6.512500
 Mistral-7B-Instruct-v0.3 2 6.500000
 Phi-3-mini-4k-instruct 2 6.487500
 vigostral-7b-chat 2 6.162500
 gemma-2-2b-it 2 6.100000
 French-Alpaca-7B-Instruct_beta 2 5.487395
 vigogne-2-7b-chat 2 2.775000
-vigogne-2-7b-instruct 2 2.240506
 
 ########## Average ##########
 score
 model
 gpt-4o-mini 9.100000
 Chocolatine-14B-Instruct-DPO-v1.2 8.475000
-Chocolatine-14B-Instruct-4k-DPO 8.187500
 Phi-3-medium-4k-instruct 7.987500
 Chocolatine-3B-Instruct-DPO-Revised 7.962500
 gpt-3.5-turbo 7.908333
-Chocolatine-3B-Instruct-DPO-v1.0 7.650000
 Daredevil-8B 7.487500
 NeuralDaredevil-8B-abliterated 7.375000
-Daredevil-8B-abliterated 7.358491
-Meta-Llama-3-8B-Instruct 6.981250
 Meta-Llama-3.1-8B-Instruct 6.918750
 Phi-3-mini-4k-instruct 6.850000
 Mistral-7B-Instruct-v0.3 6.625000
 vigostral-7b-chat 6.475000
-Mistral-7B-Instruct-v0.2 6.400000
 gemma-2-2b-it 6.275000
 French-Alpaca-7B-Instruct_beta 5.587866
 vigogne-2-7b-chat 4.218750
-vigogne-2-7b-instruct 3.698113
 ```
 
 ### Usage
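A quick sanity check on the benchmark tables in this diff: the "Average" section is simply the per-model mean of the first-turn and second-turn scores. A minimal sketch, using two score pairs copied from the tables above (the dictionary and variable names are illustrative, not part of the benchmark tooling):

```python
# Reproduce the "Average" column of the MT-Bench-French tables:
# it is the mean of each model's turn-1 and turn-2 scores.
scores = {
    # model name: (turn 1 score, turn 2 score), copied from the tables above
    "Chocolatine-14B-Instruct-DPO-v1.2": (8.61250, 8.337500),
    "Phi-3-medium-4k-instruct": (8.22500, 7.750000),
}

for model, (turn1, turn2) in scores.items():
    average = (turn1 + turn2) / 2
    print(f"{model} {average:.6f}")
```

This reproduces the reported averages (8.475000 and 7.987500 respectively), confirming the three tables are internally consistent.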