Pretrained XLNet base language model for Malay.
The xlnet-base-bahasa-cased model was pretrained on ~1.4 billion words. Below is the list of data we trained on:
- All steps can be reproduced from Malaya/pretrained-model/xlnet.
Load Pretrained Model
You can use this model by installing tensorflow and the Hugging Face transformers library. You can then use it directly by initializing it like this:
from transformers import XLNetModel, XLNetTokenizer

model = XLNetModel.from_pretrained('malay-huggingface/xlnet-base-bahasa-cased')
tokenizer = XLNetTokenizer.from_pretrained(
    'malay-huggingface/xlnet-base-bahasa-cased',
    do_lower_case=False,
)
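Once the model and tokenizer are loaded, a forward pass might look like the sketch below. It is not part of the original card. It assumes PyTorch and sentencepiece are installed alongside transformers and that the repository provides PyTorch weights. The helper name `sentence_embedding`, the mean-pooling strategy, and the example sentence are illustrative choices, not something the model authors prescribe.

```python
import torch
from transformers import XLNetModel, XLNetTokenizer

MODEL_NAME = "malay-huggingface/xlnet-base-bahasa-cased"


def sentence_embedding(model, input_ids, attention_mask):
    """Mean-pool the last hidden states over non-padding tokens.

    Hypothetical helper for illustration; pooling is one of several
    common ways to turn token vectors into a sentence vector.
    """
    model.eval()
    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    hidden = outputs.last_hidden_state  # (batch, seq_len, d_model)
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
    # Sum only real tokens, then divide by the number of real tokens.
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)


if __name__ == "__main__":
    # Downloads the checkpoint on first use.
    tokenizer = XLNetTokenizer.from_pretrained(MODEL_NAME, do_lower_case=False)
    model = XLNetModel.from_pretrained(MODEL_NAME)

    batch = tokenizer(["Saya suka makan nasi lemak."], return_tensors="pt")
    vector = sentence_embedding(model, batch["input_ids"], batch["attention_mask"])
    print(vector.shape)  # one vector per input sentence
```

The pooled vector can serve as a simple sentence representation for similarity or as input features for a downstream classifier; for fine-tuning, a task-specific head on top of the base model is usually a better fit.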