Update README.md
README.md
CHANGED
@@ -10,8 +10,8 @@ pipeline_tag: text-generation
 | Model | IFEval - TH | IFEval - EN | MT-Bench TH | MT-Bench EN | Thai Code-Switching(t=0.7) | Thai Code-Switching(t=1.0) | FunctionCall-TH | FunctionCall-EN | GSM8K-TH | GSM8K-EN | MATH-TH | MATH-EN | HumanEval-TH | HumanEval-EN | MBPP-TH | MBPP-EN |
 |--------------------------------|-------------|-------------|-------------|-------------|--------------------------------|--------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-------------|-------------|-----------|-----------|
 | **Typhoon2 Llama3.1 70B Instruct**| **81.45%** | 88.72% | **7.3626** | 8.8562 | **98.8%** | **94.8%** | **70.8%** | 65.7% | **88.79%** | **93.43%** | **59.60%** | 64.96% | 79.9% | 83.5% | 86.0% | 84.9% |
-| **Llama3.3
-| **Openthaigpt1.5
+| **Llama3.3 70B Instruct** | 81.01% | **91.51%** | 6.7967 | 8.8343 | 72.6% | 39.2% | 50.3% | 56.3% | 61.63% | 87.71% | 44.37% | **73.58%** | 81.7% | 84.1% | 84.9% | 87.3% |
+| **Openthaigpt1.5 72B** | 80.37% | 84.56% | 7.3131 | **9.0893** | 95.6% | 50.4% | 67.1% | **74.6%** | 79.15% | 89.91% | 43.65% | 81.8% | **81.7%** | **84.8%** | **88.9%** | **89.7%** |
 
 
 # TODO add image - general / domain specific / long context