Wenzhong-GPT2-3.5B / README.md
---
language:
  - zh
license: apache-2.0
widget:
  - text: 生活的真谛是[MASK]。
---

Wenzhong-3.5B model (Chinese), one of the models of Fengshenbang-LM.

Unidirectional (decoder-only) language models such as GPT are known for their strong generation ability. The 3.5-billion-parameter Wenzhong-3.5B model was trained on 100 GB of general-domain Chinese data using 32 A100 GPUs for 28 hours, and is the largest open-source Chinese GPT2 model. The model performs well on Chinese text continuation.
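To make the decoder-only generation scheme concrete, here is a minimal sketch of autoregressive greedy decoding: each new token is predicted only from the tokens to its left, appended to the sequence, and the loop repeats. The tiny vocabulary and bigram score table below are made up for illustration; a real GPT-style model replaces them with a neural network over its full vocabulary.

```python
# Hypothetical bigram scores standing in for a language model's
# next-token distribution: score of each candidate given the last token.
BIGRAM = {
    "<bos>": {"北京": 1.0},
    "北京": {"位于": 1.0},
    "位于": {"中国": 0.9, "华北": 0.1},
    "中国": {"华北": 0.8, "。": 0.2},
    "华北": {"。": 1.0},
    "。": {"<eos>": 1.0},
}

def greedy_generate(prompt, max_new_tokens=10):
    """Autoregressive greedy decoding: condition on the prefix,
    pick the highest-scoring next token, append, and repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = BIGRAM.get(tokens[-1], {})
        if not scores:
            break
        next_token = max(scores, key=scores.get)
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens

print("".join(greedy_generate(["<bos>", "北京", "位于"])[1:]))
# → 北京位于中国华北。
```

Sampling-based decoding (as used by the pipeline below with a fixed seed) differs only in drawing the next token from the distribution instead of taking the argmax.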

Usage

```python
from transformers import pipeline, set_seed

# Fix the random seed so sampled generations are reproducible
set_seed(55)

generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong-3.5B')
generator("北京位于", max_length=30, num_return_sequences=1)
```

Citation

If you find this resource useful, please cite the following website in your paper:

https://github.com/IDEA-CCNL/Fengshenbang-LM