qywu committed on
Commit 2955d05
1 Parent(s): 4a35d92

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -119,10 +119,10 @@ This model has the same license as the [original Gemma model collection](https:/
  |-----------------------------------------|------|-------|-----------|------|------------|------------|-------|
  | google/gemma-2b | 46.37| 48.38 | 71.77 | 41.77| 33.08 | 66.77 | 16.91 |
  | google/gemma-2b-it | 42.75| 43.94 | 62.70 | 37.65| 45.82 | 60.93 | 5.46 |
- | wandb/gemma-2b-zephyr-sft | 47.18| 49.74 | 72.38 | 41.37| 34.42 | 66.93 | 18.27 |
+ | wandb/gemma-2b-zephyr-sft | 47.18| 49.74 | 72.38 | 41.37| 34.42 | **66.93** | 18.27 |
  | wandb/gemma-2b-zephyr-dpo | 46.92| 49.66 | 72.23 | 41.13| 34.47 | 66.54 | 17.51 |
- | Columbia-NLP/gemma-2b-zephyr-sft | 48.75| 51.80 | 72.63 | 42.20| 41.96 | 63.85 | 20.09 |
- | **Columbia-NLP/gemma-2b-zephyr-dpo** | 49.14| 52.22 | 73.11 | 42.55| 42.64 | 64.40 | 19.94 |
+ | Columbia-NLP/gemma-2b-zephyr-sft | 48.75| 51.80 | 72.63 | 42.20| 41.96 | 63.85 | **20.09** |
+ | **Columbia-NLP/gemma-2b-zephyr-dpo** | **49.14**| **52.22** | **73.11** | **42.55**| **42.64** | 64.40 | 19.94 |
 
 
  ## MT-Bench
@@ -131,8 +131,8 @@ We evaluate our model with `GPT-4-0125-preview` as the judge.
 
  | Model | Total | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
  |------------------------------------------|-------|--------|------------|------------|------|-----------|----------|------|---------|
- | google/gemma-2b-it | 4.71 | 2.95 | 4.35 | 6.15 | 2.90 | 3.50 | 5.60 | 5.50 | 6.70 |
+ | google/gemma-2b-it | 4.71 | 2.95 | **4.35** | 6.15 | 2.90 | 3.50 | 5.60 | **5.50** | **6.70** |
  | wandb/gemma-2b-zephyr-sft | 4.03 | 3.10 | 3.15 | 5.00 | 2.70 | 2.65 | 5.10 | 4.80 | 5.75 |
  | wandb/gemma-2b-zephyr-dpo | 4.06 | 2.80 | 2.90 | 5.55 | 2.65 | 2.70 | 5.20 | 4.80 | 5.85 |
  | Columbia-NLP/gemma-2b-zephyr-sft | 4.34 | 3.10 | 3.70 | 6.25 | 2.65 | 2.70 | 5.55 | 5.25 | 5.50 |
- | **Columbia-NLP/gemma-2b-zephyr-dpo** | **4.75** | 3.50 | 4.05 | 6.75 | 3.30 | 3.70 | 5.85 | 5.40 | 5.53 |
+ | **Columbia-NLP/gemma-2b-zephyr-dpo** | **4.75** | **3.50** | 4.05 | **6.75** | **3.30** | **3.70** | **5.85** | 5.40 | 5.53 |