Model Card for Model ID
A medium-sized ModernBERT model trained on a custom corpus written mainly in Simplified Chinese, using WordLevel tokenization (equivalently, the tokenization is determined by the corpus files). The custom corpus consists of the entire Chinese Treebank 9.0 and the first half of the "XIN_CMN" portion of the Tagged Chinese Gigaword Version 2.0.
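To illustrate what word-level tokenization means in general, here is a minimal standard-library sketch. It is not the pipeline used to train this model. It assumes the corpus text is already segmented into words separated by whitespace, and the special-token list, the function names `build_vocab` and `encode`, and the toy sentences are all hypothetical.

```python
from collections import Counter

# Assumed special tokens; the model's real special tokens may differ.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_vocab(lines, min_frequency=1):
    """Map every whitespace-separated word in the corpus to an integer ID."""
    counts = Counter(tok for line in lines for tok in line.split())
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    # Most frequent words get the lowest IDs; ties are broken by code point.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_frequency and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Look up each word as a whole; unseen words map to [UNK]."""
    unk_id = vocab["[UNK]"]
    return [vocab.get(tok, unk_id) for tok in text.split()]


# Toy pre-segmented corpus (hypothetical, not taken from the training data).
corpus = ["我 喜欢 读书", "我 喜欢 音乐"]
vocab = build_vocab(corpus)
ids = encode("我 喜欢 电影", vocab)  # "电影" is not in the toy corpus
```

Because each word is looked up whole, with no subword fallback, any word absent from the vocabulary maps to the unknown token. In this sense the vocabulary is fixed entirely by how the corpus files are segmented.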