---
language:
  - zh
license: apache-2.0
widget:
  - text: 生活的真谛是[MASK]。
---

# Wenzhong-GPT2-3.5B

Wenzhong-3.5B is a Chinese model, one of the Fengshenbang-LM series.

Unidirectional, decoder-only language models such as GPT have strong text-generation ability. Wenzhong-3.5B is a 3.5-billion-parameter GPT2 model trained on 100 GB of common Chinese data for 28 hours on 32 A100 GPUs, making it the largest open-source Chinese GPT2 model at the time of release. It performs well on Chinese continuation generation.
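The unidirectional property comes from causal attention masking: each position may attend only to itself and earlier positions, so the model can be trained and used left-to-right. A minimal sketch of such a mask in plain Python:

```python
def causal_mask(n):
    """Build an n x n causal attention mask: position i may attend
    only to positions j <= i (1 = attend, 0 = blocked)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

for row in causal_mask(4):
    print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```

The lower-triangular shape is what makes autoregressive generation possible: the prediction for each token never depends on tokens to its right.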

## Usage

### load model

```python
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong-3.5B')
model = GPT2Model.from_pretrained('IDEA-CCNL/Wenzhong-3.5B')

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)  # hidden states, not generated text
```
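Note that `GPT2Model` returns hidden states rather than text; generation itself is an autoregressive loop that repeatedly feeds the model's own prediction back in. A toy sketch of that loop, where the lookup table is a hypothetical stand-in for the model's argmax next-token prediction:

```python
def generate_greedy(prompt, next_token, max_new_tokens=5):
    """Autoregressive greedy decoding: repeatedly append the
    most likely next token to the growing sequence."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tuple(tokens))
        if tok is None:  # no known continuation -> stop
            break
        tokens.append(tok)
    return tokens

# Hypothetical stand-in for the model's argmax prediction.
table = {("北京",): "位于", ("北京", "位于"): "中国"}
print(generate_greedy(["北京"], lambda ctx: table.get(ctx)))
# ['北京', '位于', '中国']
```

In practice this loop is what `pipeline('text-generation', ...)` below runs for you, with the real model supplying the next-token distribution.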

### generation

```python
from transformers import pipeline, set_seed

set_seed(55)  # fix the random seed for reproducible sampling

generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong-3.5B')
generator("北京位于", max_length=30, num_return_sequences=1)
```
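When sampling is enabled, the pipeline draws the next token from the model's output distribution rather than always taking the argmax. A minimal pure-Python sketch of temperature + top-k sampling over toy next-token logits (the token scores here are made-up values for illustration only):

```python
import math
import random

def top_k_sample(logits, k=2, temperature=1.0, rng=random.random):
    """Keep the k highest-scoring tokens, apply a softmax at the
    given temperature, then sample one token from the result."""
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    scaled = [(tok, s / temperature) for tok, s in top]
    m = max(s for _, s in scaled)  # subtract max for numerical stability
    weights = [(tok, math.exp(s - m)) for tok, s in scaled]
    total = sum(w for _, w in weights)
    r = rng() * total
    for tok, w in weights:
        r -= w
        if r <= 0:
            return tok
    return weights[-1][0]

# Toy next-token logits after the prompt "北京位于" (hypothetical values).
logits = {"中国": 5.0, "华北": 3.5, "南方": 1.0}
random.seed(0)
print(top_k_sample(logits, k=2, temperature=0.8))
```

Lower temperatures and smaller k make the output more deterministic; higher values make continuations more diverse, which is the usual trade-off when tuning generation quality.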

## Citation

If you find this resource useful, please cite the following website in your paper:

https://github.com/IDEA-CCNL/Fengshenbang-LM