Malaysian CausalLM
Collection
Trained on 21B tokens (91 GB of cleaned text). The models understand standard Malay, local Malay, local Mandarin, Manglish, and local Tamil.
4 items
README: https://github.com/mesolitica/malaya/tree/5.1/pretrained-model/mistral
WandB: https://wandb.ai/malaysia-ai2020/mistral-474M?workspace=user-malaysia-ai2020