uer/roberta-base-finetuned-dianping-chinese

@@ -9,7 +9,9 @@ widget:
 ## Model description
-This is the set of 5 Chinese RoBERTa-Base classification models fine-tuned by [UER-py](https://arxiv.org/abs/1909.05658). You can download the 5 Chinese RoBERTa-Base classification models either from the [UER-py Modelzoo page](https://github.com/dbiir/UER-py/wiki/Modelzoo) (in UER-py format), or via HuggingFace from the links below:
 |    Dataset     |                           Link                            |
 | :-----------: | :-------------------------------------------------------: |
@@ -34,7 +36,7 @@ You can use this model directly with a pipeline for text classification (take th
 ## Training data
-5 Chinese text classification datasets are used. JD full, JD binary, and Dianping datasets consist of user reviews of different sentiment polarities. Ifeng and Chinanews consist of first paragraphs of news articles of different topic classes. They are collected by [Glyph](https://github.com/zhangxiangxiao/glyph) project and more details are discussed in corresponding [paper](https://arxiv.org/abs/1708.02657).
 ## Training procedure
@@ -62,13 +64,6 @@ python3 scripts/convert_bert_text_classification_from_uer_to_huggingface.py --in
 ### BibTeX entry and citation info
 ```
-@article{devlin2018bert,
-  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
-  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
-  journal={arXiv preprint arXiv:1810.04805},
-  year={2018}
-}
 @article{liu2019roberta,
   title={Roberta: A robustly optimized bert pretraining approach},
   author={Liu, Yinhan and Ott, Myle and Goyal, Naman and Du, Jingfei and Joshi, Mandar and Chen, Danqi and Levy, Omer and Lewis, Mike and Zettlemoyer, Luke and Stoyanov, Veselin},
@@ -90,6 +85,13 @@ python3 scripts/convert_bert_text_classification_from_uer_to_huggingface.py --in
   pages={241},
   year={2019}
 }
 ```
 [jd_full]:https://huggingface.co/uer/roberta-base-finetuned-jd-full-chinese

 ## Model description
+This is the set of 5 Chinese RoBERTa-Base classification models fine-tuned by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658). The models can also be fine-tuned by [TencentPretrain](https://github.com/Tencent/TencentPretrain), introduced in [this paper](https://arxiv.org/abs/2212.06385). TencentPretrain inherits UER-py, supports models with more than one billion parameters, and extends it into a multimodal pre-training framework.
+You can download the 5 Chinese RoBERTa-Base classification models either from the [UER-py Modelzoo page](https://github.com/dbiir/UER-py/wiki/Modelzoo), or via HuggingFace from the links below:
 |    Dataset     |                           Link                            |
 | :-----------: | :-------------------------------------------------------: |
 ## Training data
+Five Chinese text classification datasets are used. The JD full, JD binary, and Dianping datasets consist of user reviews of different sentiment polarities. The Ifeng and Chinanews datasets consist of the first paragraphs of news articles from different topic classes. They were collected by the [Glyph](https://github.com/zhangxiangxiao/glyph) project, and more details are discussed in the corresponding [paper](https://arxiv.org/abs/1708.02657).
 ## Training procedure
 ### BibTeX entry and citation info
 ```
 @article{liu2019roberta,
   title={Roberta: A robustly optimized bert pretraining approach},
   author={Liu, Yinhan and Ott, Myle and Goyal, Naman and Du, Jingfei and Joshi, Mandar and Chen, Danqi and Levy, Omer and Lewis, Mike and Zettlemoyer, Luke and Stoyanov, Veselin},
   pages={241},
   year={2019}
 }
+@article{zhao2023tencentpretrain,
+  title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
+  author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
+  journal={ACL 2023},
+  pages={217},
+  year={2023}
+}
 ```
 [jd_full]:https://huggingface.co/uer/roberta-base-finetuned-jd-full-chinese
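
The model description lists the five fine-tuned models by dataset, and the usage section refers to running them through a pipeline for text classification. A minimal sketch of that workflow is given below. It assumes the `transformers` library is installed and that all five repository names follow the `uer/roberta-base-finetuned-<dataset>-chinese` pattern visible in the jd-full and dianping names above. `DATASETS`, `repo_id`, and `load_classifier` are hypothetical helper names used for illustration; they are not part of UER-py or TencentPretrain.

```python
# Dataset slugs. The naming pattern is taken from the jd-full and dianping
# repository names; it is assumed (not confirmed by this excerpt) to hold
# for the other three datasets as well.
DATASETS = ("jd-full", "jd-binary", "dianping", "ifeng", "chinanews")


def repo_id(dataset: str) -> str:
    """Build the assumed HuggingFace repository id for one dataset slug."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    return f"uer/roberta-base-finetuned-{dataset}-chinese"


def load_classifier(dataset: str):
    """Download the model and wrap it in a text-classification pipeline."""
    # Imported lazily so repo_id() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=repo_id(dataset))
```

Calling `load_classifier("dianping")` downloads the model on first use, and the returned pipeline can then be applied to a Chinese review string. The result is a list of dictionaries with `label` and `score` keys; the label names depend on the model configuration, so no expected output is shown here.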