kobkrit committed on
Commit
ed74716
1 Parent(s): 0009f1d

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -29,15 +29,15 @@ tags:
 
  | **Exams** | **OTG 7b (Aug 2023)** | **OTG 13b (Dec 2023)** | <b style="color:blue">OTG 7b (March 2024)</b> | **OTG 13b (March 2024)** | **OTG 70b (March 2024)** | **SeaLLM 7b v1** | **SeaLLM 7b v2** | **SeaLion 7b** | **WanchanGLM 7b** | **Sailor-7b-Chat** | **TyphoonGPT 7b Instruct** | **GPT3.5** | **GPT4** | **Gemini Pro** | **Gemini 1.5** | **Claude 3 Haiku** | **Claude 3 Sonnet** | **Claude 3 Opus** |
  |----------------------------|-----------------------|------------------------|-------------------------|--------------------------|--------------------------|------------------|------------------|----------------|-------------------|--------------------|----------------------------|------------|----------|----------------|----------------|--------------------|---------------------|-------------------|
- | **A-Level** | 17.50% | 34.17% | <b style="color:blue">25.00%</b> | 30.83% | 45.83% | 18.33% | 34.17% | 21.67% | 17.50% | 40.00% | 37.50% | 38.33% | 65.83% | 56.67% | 55.83% | 58.33% | 59.17% | 77.50% |
  | **TGAT** | 24.00% | 22.00% | <b style="color:blue">22.00%</b> | 36.00% | 36.00% | 14.00% | 28.00% | 24.00% | 16.00% | 34.00% | 30.00% | 28.00% | 44.00% | 22.00% | 28.00% | 36.00% | 34.00% | 46.00% |
  | **TPAT1** | 22.50% | 47.50% | <b style="color:blue">42.50%</b> | 27.50% | 62.50% | 22.50% | 27.50% | 22.50% | 17.50% | 40.00% | 47.50% | 45.00% | 52.50% | 52.50% | 50.00% | 52.50% | 50.00% | 62.50% |
- | **Investment Consultant** | 8.00% | 28.00% | <b style="color:blue">76.00%</b> | 84.00% | 68.00% | 16.00% | 28.00% | 24.00% | 16.00% | 24.00% | 32.00% | 40.00% | 64.00% | 52.00% | 32.00% | 44.00% | 64.00% | 72.00% |
- | **Facebook Belebele Thai** | 25.00% | 45.00% | <b style="color:blue">34.50%</b> | 39.50% | 70.00% | 13.50% | 51.00% | 27.00% | 24.50% | 63.00% | 51.50% | 50.00% | 72.50% | 65.00% | 74.00% | 63.50% | 77.00% | 90.00% |
  | **xcopa_th_200** | 45.00% | 56.50% | <b style="color:blue">49.50%</b> | 51.50% | 74.50% | 26.50% | 47.00% | 51.50% | 48.50% | 68.50% | 65.00% | 64.00% | 82.00% | 68.00% | 74.00% | 64.00% | 80.00% | 86.00% |
  | **xnli2.0_th_200** | 33.50% | 34.50% | <b style="color:blue">39.50%</b> | 31.00% | 47.00% | 21.00% | 43.00% | 37.50% | 33.50% | 16.00% | 20.00% | 50.00% | 69.00% | 53.00% | 54.50% | 50.00% | 68.00% | 68.50% |
- | **ONET_M3** | 17.85% | 38.86% | <b style="color:blue">34.11%</b> | 39.36% | 56.15% | 15.58% | 23.92% | 21.79% | 19.56% | 21.37% | 28.03% | 37.91% | 49.97% | 55.99% | 57.41% | 52.73% | 40.60% | 63.87% |
- | **ONET_M6** | 21.14% | 28.87% | <b style="color:blue">22.53%</b> | 23.32% | 42.85% | 15.09% | 19.48% | 16.96% | 20.67% | 28.64% | 27.46% | 34.44% | 46.29% | 45.53% | 50.23% | 34.79% | 38.49% | 48.56% |
  | **AVERAGE SCORE** | 23.83% | 37.27% | <b style="color:blue;font-size:1.3em">38.40%</b> | 40.33% | 55.87% | 18.06% | 33.56% | 27.44% | 23.75% | 37.28% | 37.67% | 43.07% | 60.68% | 52.30% | 52.89% | 50.65% | 56.81% | 68.32% |
  Benchmark source code and exams information: https://github.com/OpenThaiGPT/openthaigpt_eval
 
 
 
  | **Exams** | **OTG 7b (Aug 2023)** | **OTG 13b (Dec 2023)** | <b style="color:blue">OTG 7b (March 2024)</b> | **OTG 13b (March 2024)** | **OTG 70b (March 2024)** | **SeaLLM 7b v1** | **SeaLLM 7b v2** | **SeaLion 7b** | **WanchanGLM 7b** | **Sailor-7b-Chat** | **TyphoonGPT 7b Instruct** | **GPT3.5** | **GPT4** | **Gemini Pro** | **Gemini 1.5** | **Claude 3 Haiku** | **Claude 3 Sonnet** | **Claude 3 Opus** |
  |----------------------------|-----------------------|------------------------|-------------------------|--------------------------|--------------------------|------------------|------------------|----------------|-------------------|--------------------|----------------------------|------------|----------|----------------|----------------|--------------------|---------------------|-------------------|
+ | **A-Level** | 17.50% | 34.17% | <b style="color:blue">25.00%</b> | 30.83% | 45.83% | 18.33% | 34.17% | 21.67% | 17.50% | 40.00% | 37.50% | 38.33% | 65.83% | 56.67% | 55.83% | 58.33% | 59.17% | 77.50% |
  | **TGAT** | 24.00% | 22.00% | <b style="color:blue">22.00%</b> | 36.00% | 36.00% | 14.00% | 28.00% | 24.00% | 16.00% | 34.00% | 30.00% | 28.00% | 44.00% | 22.00% | 28.00% | 36.00% | 34.00% | 46.00% |
  | **TPAT1** | 22.50% | 47.50% | <b style="color:blue">42.50%</b> | 27.50% | 62.50% | 22.50% | 27.50% | 22.50% | 17.50% | 40.00% | 47.50% | 45.00% | 52.50% | 52.50% | 50.00% | 52.50% | 50.00% | 62.50% |
+ | **thai_investment_consultant_exams** | 8.00% | 28.00% | <b style="color:blue">76.00%</b> | 84.00% | 68.00% | 16.00% | 28.00% | 24.00% | 16.00% | 24.00% | 32.00% | 40.00% | 64.00% | 52.00% | 32.00% | 44.00% | 64.00% | 72.00% |
+ | **facebook_beleble_tha_200** | 25.00% | 45.00% | <b style="color:blue">34.50%</b> | 39.50% | 70.00% | 13.50% | 51.00% | 27.00% | 24.50% | 63.00% | 51.50% | 50.00% | 72.50% | 65.00% | 74.00% | 63.50% | 77.00% | 90.00% |
  | **xcopa_th_200** | 45.00% | 56.50% | <b style="color:blue">49.50%</b> | 51.50% | 74.50% | 26.50% | 47.00% | 51.50% | 48.50% | 68.50% | 65.00% | 64.00% | 82.00% | 68.00% | 74.00% | 64.00% | 80.00% | 86.00% |
  | **xnli2.0_th_200** | 33.50% | 34.50% | <b style="color:blue">39.50%</b> | 31.00% | 47.00% | 21.00% | 43.00% | 37.50% | 33.50% | 16.00% | 20.00% | 50.00% | 69.00% | 53.00% | 54.50% | 50.00% | 68.00% | 68.50% |
+ | **ONET M3** | 17.85% | 38.86% | <b style="color:blue">34.11%</b> | 39.36% | 56.15% | 15.58% | 23.92% | 21.79% | 19.56% | 21.37% | 28.03% | 37.91% | 49.97% | 55.99% | 57.41% | 52.73% | 40.60% | 63.87% |
+ | **ONET M6** | 21.14% | 28.87% | <b style="color:blue">22.53%</b> | 23.32% | 42.85% | 15.09% | 19.48% | 16.96% | 20.67% | 28.64% | 27.46% | 34.44% | 46.29% | 45.53% | 50.23% | 34.79% | 38.49% | 48.56% |
  | **AVERAGE SCORE** | 23.83% | 37.27% | <b style="color:blue;font-size:1.3em">38.40%</b> | 40.33% | 55.87% | 18.06% | 33.56% | 27.44% | 23.75% | 37.28% | 37.67% | 43.07% | 60.68% | 52.30% | 52.89% | 50.65% | 56.81% | 68.32% |
  Benchmark source code and exams information: https://github.com/OpenThaiGPT/openthaigpt_eval
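
The **AVERAGE SCORE** row appears to be the unweighted mean of the nine exam rows above it. That is an assumption, since the table does not state how the row is computed, but it reproduces the 38.40% shown for the OTG 7b (March 2024) column. The sketch below is a minimal check using the values copied from that column; the variable and function names are illustrative and are not taken from the `openthaigpt_eval` repository.

```python
# Per-exam scores (percent) for the "OTG 7b (March 2024)" column,
# copied from the table rows in this diff.
otg_7b_march_2024 = {
    "A-Level": 25.00,
    "TGAT": 22.00,
    "TPAT1": 42.50,
    "thai_investment_consultant_exams": 76.00,
    "facebook_beleble_tha_200": 34.50,
    "xcopa_th_200": 49.50,
    "xnli2.0_th_200": 39.50,
    "ONET M3": 34.11,
    "ONET M6": 22.53,
}


def average_score(scores):
    """Unweighted mean of the per-exam percentages.

    Assumption: every exam counts equally, regardless of how many
    questions it contains.
    """
    return sum(scores.values()) / len(scores)


avg = average_score(otg_7b_march_2024)
print(f"{avg:.2f}%")
```

Other columns may differ from this calculation in the last digit, because the per-exam values shown in the table are already rounded to two decimals.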