uer committed on
Commit
c9ebda6
1 Parent(s): 80bdcf2

Update README.md

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -12,9 +12,9 @@ widget:
 
  ## Model description
 
- This is the set of 5 Chinese word-based RoBERTa models pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658).
+ This is the set of 5 Chinese word-based RoBERTa models pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658). Besides, the models could also be pre-trained by [TencentPretrain](https://github.com/Tencent/TencentPretrain) introduced in [this paper](https://arxiv.org/abs/2212.06385), which inherits UER-py to support models with parameters above one billion, and extends it to a multimodal pre-training framework.
 
- Most Chinese pre-trained weights are based on Chinese characters. Compared with character-based models, word-based models are faster (because of the shorter sequence length) and have better performance according to our experimental results. To this end, we released the 5 Chinese word-based RoBERTa models of different sizes. In order to facilitate users to reproduce the results, we used the publicly available corpus and word segmentation tool, and provided all training details.
+ Most Chinese pre-trained weights are based on Chinese characters. Compared with character-based models, word-based models are faster (because of the shorter sequence length) and have better performance according to our experimental results. To this end, we released the 5 Chinese word-based RoBERTa models of different sizes. In order to facilitate users in reproducing the results, we used a publicly available corpus and word segmentation tool, and provided all training details.
 
  Notice that the output results of the Hosted inference API (right) are not properly displayed. When the predicted word has multiple characters, only the single word instead of the entire sentence is displayed. One can click **JSON Output** for normal output results.
 
@@ -212,6 +212,13 @@ python3 scripts/convert_bert_from_uer_to_huggingface.py --input_model_path model
   pages={241},
   year={2019}
  }
+
+ @article{zhao2023tencentpretrain,
+ title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
+ author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
+ journal={ACL 2023},
+ pages={217},
+ year={2023}
  ```
 
  [2_128]:https://huggingface.co/uer/roberta-tiny-word-chinese-cluecorpussmall
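
For context, the word-based checkpoints this README links to (e.g. the [2_128] reference above) can be queried locally instead of through the Hosted inference widget whose display issue the description mentions. The snippet below is a minimal sketch, not part of the model card: it assumes the `uer/roberta-tiny-word-chinese-cluecorpussmall` checkpoint loads cleanly through the standard `transformers` fill-mask pipeline, and the example sentence is invented for illustration.

```python
from transformers import pipeline

# Minimal sketch (assumption: the checkpoint's config/tokenizer work with the
# generic fill-mask pipeline; the input sentence is only an illustration).
unmasker = pipeline("fill-mask", model="uer/roberta-tiny-word-chinese-cluecorpussmall")

# Use the tokenizer's own mask token rather than hard-coding one.
mask = unmasker.tokenizer.mask_token
predictions = unmasker(f"北京是{mask}的首都。")

# Each prediction mirrors the widget's JSON Output: a dict with
# "sequence", "score", "token" and "token_str".
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

Because the vocabulary is word-based, `token_str` may be a multi-character word, which is exactly the case the Hosted inference widget renders incompletely.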