# Llama3.1 70B CPT SEA-LIONv3
SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.

Llama3.1 70B CPT SEA-LIONv3 Base is a multilingual model which has undergone continued pre-training on approximately **200B** tokens across 11 SEA languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai and Vietnamese.

SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.

- **Developed by:** Products Pillar, AI Singapore
- **Funded by:** Singapore NRF
- **Model type:** Decoder
- **Languages supported:** Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, Vietnamese
- **License:** [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)
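
As a quick reference, below is a minimal sketch of loading the base model for text completion with Hugging Face `transformers`. The repository id is an assumption (check the model page for the exact name); as a base model, it takes plain-text prompts rather than a chat template.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; substitute the id shown on the model page.
model_id = "aisingapore/llama3.1-70b-cpt-sea-lionv3-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 70B parameters: bf16 halves memory vs fp32
    device_map="auto",           # shard the weights across available GPUs
)

# Base (non-instruct) model: plain completion, no chat template.
inputs = tokenizer("Ibu kota Indonesia adalah", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
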
## Model Details
The evaluation was done **five-shot** with native prompts on a sample of 100-1000 instances for each dataset.
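
To illustrate the protocol, here is a toy sketch of how a five-shot prompt can be assembled; the example data and `build_five_shot_prompt` helper are hypothetical, not the actual SEA-HELM harness, whose prompts are native to each language and dataset.

```python
def build_five_shot_prompt(examples, query):
    """Prepend five solved examples to the test instance."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in examples[:5])
    return f"{shots}\n\nQuestion: {query}\nAnswer:"

# Hypothetical example data for illustration only.
examples = [
    ("2 + 2 = ?", "4"),
    ("3 + 5 = ?", "8"),
    ("10 - 4 = ?", "6"),
    ("7 + 6 = ?", "13"),
    ("9 - 3 = ?", "6"),
]
print(build_five_shot_prompt(examples, "8 + 7 = ?"))
```
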

Following the implementation of IFEval in the OpenLLM leaderboard, we also implement SEA-IFEval to provide a comparison of the model's ability to follow specific constraints in English and in SEA languages.

For more details on Llama3.1 70B CPT SEA-LIONv3 Base benchmark performance, please refer to the SEA-HELM leaderboard at https://leaderboard.sea-lion.ai/.

**SEA-IFEval**

SEA-IFEval evaluates a model's ability to adhere to constraints provided in the prompt, for example beginning a response with a specific word/phrase or answering with a certain number of sections. Additionally, accuracy is normalised by the proportion of responses in the correct language (if the model performs the task correctly but responds in the wrong language, it is judged to have failed the task).
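
As a toy sketch of that normalisation (the `constraint_met` and `language` fields are hypothetical, not SEA-HELM's actual API), a response only counts as correct if it both satisfies the constraint and is in the expected language:

```python
def sea_ifeval_score(responses, expected_lang):
    """Language-normalised accuracy: wrong-language passes count as failures."""
    correct = sum(
        1 for r in responses
        if r["constraint_met"] and r["language"] == expected_lang
    )
    return correct / len(responses)

# Hypothetical graded responses for a Thai-language task.
responses = [
    {"constraint_met": True,  "language": "th"},  # pass
    {"constraint_met": True,  "language": "en"},  # wrong language -> fail
    {"constraint_met": False, "language": "th"},  # constraint violated -> fail
    {"constraint_met": True,  "language": "th"},  # pass
]
print(sea_ifeval_score(responses, "th"))  # 0.5
```
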
## Technical Specifications
### Infrastructure
Llama3.1 70B CPT SEA-LIONv3 was trained in two stages using [MosaicML Composer](https://github.com/mosaicml/composer) on the following hardware: