uer committed on
Commit
c9ebda6
1 Parent(s): 80bdcf2

Update README.md

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -12,9 +12,9 @@ widget:
 
  ## Model description
 
- This is the set of 5 Chinese word-based RoBERTa models pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658).
+ This is the set of 5 Chinese word-based RoBERTa models pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658). Besides, the models could also be pre-trained by [TencentPretrain](https://github.com/Tencent/TencentPretrain) introduced in [this paper](https://arxiv.org/abs/2212.06385), which inherits UER-py to support models with parameters above one billion, and extends it to a multimodal pre-training framework.
 
- Most Chinese pre-trained weights are based on Chinese characters. Compared with character-based models, word-based models are faster (because of the shorter sequence length) and have better performance according to our experimental results. To this end, we released the 5 Chinese word-based RoBERTa models of different sizes. In order to facilitate users to reproduce the results, we used the publicly available corpus and word segmentation tool, and provided all training details.
+ Most Chinese pre-trained weights are based on Chinese characters. Compared with character-based models, word-based models are faster (because of the shorter sequence length) and have better performance according to our experimental results. To this end, we released the 5 Chinese word-based RoBERTa models of different sizes. In order to facilitate users in reproducing the results, we used a publicly available corpus and word segmentation tool, and provided all training details.
 
  Notice that the output results of the Hosted inference API (right) are not properly displayed. When the predicted word has multiple characters, only the single word instead of the entire sentence is displayed. One can click **JSON Output** for normal output results.
 
@@ -212,6 +212,13 @@ python3 scripts/convert_bert_from_uer_to_huggingface.py --input_model_path model
   pages={241},
   year={2019}
  }
+
+ @article{zhao2023tencentpretrain,
+ title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
+ author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
+ journal={ACL 2023},
+ pages={217},
+ year={2023}
  ```
 
  [2_128]:https://huggingface.co/uer/roberta-tiny-word-chinese-cluecorpussmall
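
For context, the word-based checkpoints this README links to (e.g. the [2_128] reference above) can be queried locally instead of through the Hosted inference widget whose display issue the description mentions. The snippet below is a minimal sketch, not part of the model card: it assumes the `uer/roberta-tiny-word-chinese-cluecorpussmall` checkpoint loads cleanly through the standard `transformers` fill-mask pipeline, and the example sentence is invented for illustration.

```python
from transformers import pipeline

# Minimal sketch (assumption: the checkpoint's config/tokenizer work with the
# generic fill-mask pipeline; the input sentence is only an illustration).
unmasker = pipeline("fill-mask", model="uer/roberta-tiny-word-chinese-cluecorpussmall")

# Use the tokenizer's own mask token rather than hard-coding one.
mask = unmasker.tokenizer.mask_token
predictions = unmasker(f"北京是{mask}的首都。")

# Each prediction mirrors the widget's JSON Output: a dict with
# "sequence", "score", "token" and "token_str".
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

Because the vocabulary is word-based, `token_str` may be a multi-character word, which is exactly the case the Hosted inference widget renders incompletely.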