BERT base Japanese (character-level tokenization with whole word masking, jawiki-20200831)

This pretrained model is almost identical to cl-tohoku/bert-base-japanese-char-v2, but it does not require fugashi or unidic_lite. The only difference is the word_tokenizer_type property in tokenizer_config.json, which specifies basic instead of mecab.
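The difference described above amounts to a single property in tokenizer_config.json. A minimal sketch of the relevant fragment (all other keys are omitted here; only the property named above is shown):

```json
{
  "word_tokenizer_type": "basic"
}
```

With this setting, word-level pre-tokenization is done by the tokenizer's built-in basic tokenizer instead of MeCab, so the MeCab bindings (fugashi) and dictionary (unidic_lite) are not needed at load time.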

