weiqipedia committed
Commit 1bfc2a0 • 1 Parent(s): 701ce03
Update metrics in README.md
README.md CHANGED
@@ -68,13 +68,14 @@ For Natural Language Reasoning (NLR) tasks, we tested the model on Natural Langu
 | Model | QA (F1) | Sentiment (F1) | Toxicity (F1) | Eng>Indo (ChrF++) | Indo>Eng (ChrF++) | Summary (ROUGE-L) | NLI (Acc) | Causal (Acc) |
 |--------------------------------|---------|----------------|---------------|-------------------|-------------------|-------------------|-----------|--------------|
 | SEA-LION-7B-Instruct-Research | 24.86 | 76.13 | 24.45 | 52.50 | 46.82 | 15.44 | 33.20 | 23.80 |
-| SEA-LION-7B-Instruct | **68.41
+| SEA-LION-7B-Instruct | **68.41** | **91.45** | 17.98 | 57.48 | 58.04 | **17.54** | 53.10 | 60.80 |
 | SeaLLM 7B v1 | 30.96 | 56.29 | 22.60 | 62.23 | 41.55 | 14.03 | 26.50 | 56.60 |
-| SeaLLM 7B v2 | 44.40 | 80.13 | **55.24**
+| SeaLLM 7B v2 | 44.40 | 80.13 | **55.24** | 64.01 | **63.28** | 17.31 | 43.60 | 82.00 |
-| Sailor-7B (Base) | 65.43 | 59.48 | 20.48 | **64.27**
+| Sailor-7B (Base) | 65.43 | 59.48 | 20.48 | **64.27** | 60.68 | 8.69 | 15.10 | 38.40 |
+| Sailor-7B-Chat | 38.02 | 87.64 | 52.07 | 64.25 | 61.87 | 15.28 | **68.30** | **85.60** |
 | Llama 2 7B Chat | 11.12 | 52.32 | 0.00 | 44.09 | 57.58 | 9.24 | 0.00 | 0.00 |
 | Mistral 7B Instruct v0.1 | 38.85 | 74.38 | 20.83 | 30.60 | 51.43 | 15.63 | 28.60 | 50.80 |
-| GPT-4
+| GPT-4 (gpt-4-0314) | 73.60 | 74.14 | 63.96 | 69.38 | 67.53 | 18.71 | 83.20 | 96.00 |

 ## Technical Specifications