wanng committed on
Commit
cf1c234
1 Parent(s): 8166fbd

Update README.md

Files changed (1)
  1. README.md +50 -8
README.md CHANGED
@@ -13,12 +13,34 @@ inference:
13
 
14
  license: apache-2.0
15
  ---
16
- # Wenzhong2.0-GPT2-3.5B model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
17
- Unidirectional language models built on a decoder architecture, such as GPT, are strong at text generation. Wenzhong-GPT2-3.5B, a 3.5-billion-parameter model trained on 100G of general Chinese data using 32 A100 GPUs for 28 hours, is the largest open-source **Chinese GPT2 model**. **The model performs well at Chinese continuation generation.** **Wenzhong2.0-GPT2-3.5B-Chinese is a Chinese GPT2 model trained on cleaner data, building on Wenzhong-GPT2-3.5B.**
18
 
19
- ## Usage
20
 
21
- ### load model
22
  ```python
23
  from transformers import GPT2Tokenizer, GPT2Model
24
  tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese')
@@ -27,7 +49,9 @@ text = "Replace me by any text you'd like."
27
  encoded_input = tokenizer(text, return_tensors='pt')
28
  output = model(**encoded_input)
29
  ```
30
- ### generation
31
  ```python
32
  from transformers import pipeline, set_seed
33
  set_seed(55)
@@ -36,13 +60,31 @@ generator("北京位于", max_length=30, num_return_sequences=1)
36
 
37
  ```
38
 
39
- ## Citation
40
- If you find this resource useful, please cite the following website in your paper.
41
  ```
42
  @misc{Fengshenbang-LM,
43
  title={Fengshenbang-LM},
44
  author={IDEA-CCNL},
45
  year={2021},
46
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
47
  }
48
- ```
 
13
 
14
  license: apache-2.0
15
  ---
 
 
16
 
17
+ # Wenzhong2.0-GPT2-3.5B-chinese
18
+
19
+ - GitHub: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
20
+ - Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)
21
+
22
+ ## Brief Introduction
23
+
24
+ Pre-trained on the Wudao Corpus and focused on NLG tasks; currently the largest Chinese GPT2 model.
27
+
28
+ ## Model Taxonomy
29
+
30
+ | Demand | Task | Series | Model | Parameter | Extra |
31
+ | :----: | :----: | :----: | :----: | :----: | :----: |
32
+ | General | NLG | Wenzhong | GPT2 | 3.5B | Chinese |
33
+
34
+ ## Model Information
35
+
36
+ To obtain a powerful unidirectional language model, we adopt the GPT architecture and apply it to a Chinese corpus. Like Wenzhong-GPT2-3.5B, this model has 30 decoder layers and 3.5 billion parameters, making it larger than the original GPT2-XL. The difference is that this model was pre-trained on the Wudao corpus (300G version). To the best of our knowledge, it is the largest Chinese GPT model currently available.
39
+
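As a rough sanity check on the "30 decoder layers and 3.5 billion parameters" figure, the sketch below counts the parameters of a GPT-2 style decoder from its shape. Only the layer count (30) comes from the README; the hidden size, vocabulary size, and context length used for Wenzhong here are assumptions chosen for illustration, and the helper `gpt2_param_count` is hypothetical rather than part of any library. The GPT2-XL figures are the published configuration.

```python
def gpt2_param_count(n_layer: int, n_embd: int, vocab_size: int, n_positions: int) -> int:
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    d = n_embd
    attention = (d * 3 * d + 3 * d) + (d * d + d)  # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)    # two linear layers with a 4x hidden width
    layer_norms = 2 * (2 * d)                      # two LayerNorms per block (weight + bias)
    per_block = attention + mlp + layer_norms
    embeddings = vocab_size * d + n_positions * d  # token + learned position embeddings
    final_layer_norm = 2 * d
    return n_layer * per_block + embeddings + final_layer_norm


# Published GPT2-XL configuration.
gpt2_xl = gpt2_param_count(n_layer=48, n_embd=1600, vocab_size=50257, n_positions=1024)

# n_layer=30 is stated in the README; n_embd, vocab_size and n_positions are ASSUMED.
wenzhong = gpt2_param_count(n_layer=30, n_embd=3072, vocab_size=50304, n_positions=1024)

print(f"GPT2-XL: {gpt2_xl:,}")
print(f"Wenzhong (assumed config): {wenzhong:,}")
```

Under these assumptions the count lands close to 3.5 billion and well above GPT2-XL, which is consistent with the description above. The model's actual `config.json` remains the authoritative source for the real dimensions.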
40
+ ## Usage
41
+
42
+ ### Loading Models
43
 
 
44
  ```python
45
  from transformers import GPT2Tokenizer, GPT2Model
46
  tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese')
 
49
  encoded_input = tokenizer(text, return_tensors='pt')
50
  output = model(**encoded_input)
51
  ```
52
+
53
+ ### Usage Examples
54
+
55
  ```python
56
  from transformers import pipeline, set_seed
57
  set_seed(55)
 
60
 
61
  ```
62
 
63
+ ## Citation
64
+
65
+ If you use this resource in your work, please cite our [paper](https://arxiv.org/abs/2209.02970):
68
+
69
+ ```text
70
+ @article{fengshenbang,
71
+ author = {Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen and Ruyi Gan and Jiaxing Zhang},
72
+ title = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
73
+ journal = {CoRR},
74
+ volume = {abs/2209.02970},
75
+ year = {2022}
76
+ }
77
  ```
78
+
79
+ You can also cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):
82
+
83
+ ```text
84
  @misc{Fengshenbang-LM,
85
  title={Fengshenbang-LM},
86
  author={IDEA-CCNL},
87
  year={2021},
88
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
89
  }
90
+ ```