|
--- |
|
language: |
|
- ms |
|
- en |
|
- zh |
|
- ta |
|
--- |
|
|
|
# Llama 3.2 1B Malaysian Reasoning
|
|
|
Continued finetuning of https://huggingface.co/meta-llama/Llama-3.2-1B on a highly curated 1.2B-token Malaysian instruction dataset that includes reasoning data.
|
|
|
## Improvement |
|
|
|
1. 128k context length. |
|
2. Supports responding in Mandarin, Tamil, Jawi, Manglish, and local dialects of Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
|
3. Able to code in Mandarin, Tamil, Jawi, Manglish, and local dialects of Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
|
4. Multi-turn conversations in Malaysian context, such as Malaysian legislation, politics, religions and languages.
|
5. Standard RAG. |
|
6. Reasoning! Supports minimal reasoning in Mandarin, Tamil, Jawi, Manglish, and local dialects of Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan and Terengganu.
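For example, a multi-turn request in one of the supported languages can be expressed with the usual chat-message format. This is only a sketch of the payload structure; in practice the exact prompt string should come from the model tokenizer's `apply_chat_template`:

```python
# Sketch of a multi-turn chat payload. Pass this list to
# tokenizer.apply_chat_template(...) so the model's own template is used.
messages = [
    # "How are you? Can you explain a bit about the Federal Constitution?"
    {"role": "user", "content": "Apa khabar? Boleh terang sikit pasal Perlembagaan Persekutuan?"},
    # "I'm well! The Federal Constitution is the supreme law of Malaysia."
    {"role": "assistant", "content": "Khabar baik! Perlembagaan Persekutuan ialah undang-undang tertinggi Malaysia."},
    # "Write a Python function that sums a list, answer in Manglish."
    {"role": "user", "content": "Tulis fungsi Python yang kira jumlah senarai, jawab dalam Manglish."},
]

# roles alternate user/assistant and end with the user turn to be answered
roles = [m["role"] for m in messages]
assert roles == ["user", "assistant", "user"]
```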
|
|
|
## MalayMMLU |
|
|
|
``` |
|
``` |
|
|
|
## Training session |
|
|
|
We did two stages of training:
|
|
|
1. Finetune on [Malaysian SFT](https://huggingface.co/datasets/mesolitica/Malaysian-SFT) to make the model understand Malaysian context. |
|
- Wandb at https://wandb.ai/huseinzol05/lora-embedding-256-llama3.2-1b-small-malaysian-reasoning |
|
2. Continued finetuning on [Malaysian Reasoning](https://huggingface.co/datasets/mesolitica/Malaysian-Reasoning), mixed with a small sample of [Malaysian SFT](https://huggingface.co/datasets/mesolitica/Malaysian-SFT), to turn it into a reasoning model.
|
- Wandb at https://wandb.ai/huseinzol05/lora-embedding-256-llama3.2-1b-small-malaysian-reasoning-cont |
|
|
|
## How we train |
|
|
|
1. LoRA on `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]`. |
|
2. Rank 256 with alpha 512, i.e. an effective LoRA scaling (alpha / rank) of 2.0.
|
3. Multipacking with proper SDPA causal masking to prevent cross-document contamination, with position ids reset per packed document.
|
4. Forked Cut Cross-Entropy (CCE) loss for the LoRA `lm_head` to reduce memory consumption.
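The multipacking in point 3 can be sketched as follows: for several documents packed into one sequence, the attention mask is block-diagonal causal (each token attends only to earlier tokens of its own document) and position ids restart at 0 for every document. This is a minimal NumPy illustration, not the actual training code:

```python
import numpy as np

def packed_mask_and_positions(doc_lens):
    """Build the attention mask and position ids for documents packed
    into one sequence, so tokens never attend across document boundaries."""
    total = sum(doc_lens)
    mask = np.zeros((total, total), dtype=bool)
    position_ids = np.empty(total, dtype=np.int64)
    start = 0
    for n in doc_lens:
        # causal (lower-triangular) attention restricted to this document
        mask[start:start + n, start:start + n] = np.tril(np.ones((n, n), dtype=bool))
        # position ids reset to 0 at the start of every packed document
        position_ids[start:start + n] = np.arange(n)
        start += n
    return mask, position_ids

mask, pos = packed_mask_and_positions([3, 2])
assert not mask[3, :3].any()        # first token of doc 2 cannot see doc 1
assert list(pos) == [0, 1, 2, 0, 1]  # positions restart per document
```

The boolean mask would be passed to SDPA (e.g. `torch.nn.functional.scaled_dot_product_attention`) in place of the default causal mask.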
|
|
|
Low Rank adapters pushed at [malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA](https://huggingface.co/malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA). |
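A minimal loading sketch for the adapter above, assuming `transformers` and `peft` are installed (repository ids taken from the links in this card; generation settings are illustrative, not the recommended values):

```python
base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "malayloraenjoyer/Llama-3.2-1B-Malaysian-Reasoning-LoRA"
gen_kwargs = {"max_new_tokens": 256, "do_sample": True, "temperature": 0.7}

if __name__ == "__main__":
    # heavy imports and model downloads kept under the main guard
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

    # "Explain what MalayMMLU is."
    messages = [{"role": "user", "content": "Terangkan apa itu MalayMMLU."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(inputs, **gen_kwargs)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```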
|
|
|
Source code at https://github.com/mesolitica/malaya/tree/master/session/small-malaysian-reasoning.