wanng committed
Commit
5f3b6b4
1 Parent(s): 77b2a4b

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -30,16 +30,16 @@ Focused on handling NLG tasks, the current largest, Chinese GPT2.
  | :----: | :----: | :----: | :----: | :----: | :----: |
  | 通用 General | 自然语言生成 NLG | 闻仲 Wenzhong | GPT2 | 3.5B | - |
 
- one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
- As we all know, the single direction language model based on decoder structure has strong generation ability, such as GPT model. **The 3.5 billion parameter Wenzhong-GPT2-3.5B large model, using 100G chinese common data, 32 A100 training for 28 hours,** is the largest open source **GPT2 large model of chinese**. **Our model performs well in Chinese continuation generation.**
- 
- ## Usage
- 
- ### load model
+ ## 模型信息 Model Information
+ 
+ 为了可以获得一个强大的单向语言模型,我们采用GPT模型结构,并且应用于中文语料上。具体地,这个模型拥有30层解码器和35亿参数,这比原本的GPT2-XL还要大。我们在100G的中文语料上预训练,这消耗了32个NVIDIA A100显卡大约28小时。据我们所知,它是目前最大的中文的GPT模型。
+ 
+ To obtain a robust unidirectional language model, we adopt the GPT model structure and apply it to a Chinese corpus. Specifically, the model has 30 decoder layers and 3.5 billion parameters, which is larger than the original GPT2-XL. We pre-trained it on 100G of Chinese corpus, which took 32 NVIDIA A100 GPUs about 28 hours. To the best of our knowledge, it is the largest Chinese GPT model currently available.
+ 
+ ## 使用 Usage
+ 
+ ### 加载模型 Loading Models
  ```python
  from transformers import GPT2Tokenizer, GPT2Model
  tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong-GPT2-3.5B')
@@ -48,7 +48,9 @@ text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
  ```
- ### generation
+ 
+ ### 使用示例 Usage Examples
+ 
  ```python
  from transformers import pipeline, set_seed
  set_seed(55)
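As a sanity check on the new 模型信息 paragraph (30 decoder layers, 3.5 billion parameters), the usual 12·L·h² estimate for transformer block weights recovers the stated size if one assumes a hidden size of 3072; both the hidden size and the vocabulary size below are assumptions, since the diff states neither:

```python
# Rough parameter count for a GPT-2-style decoder stack.
# ASSUMPTION: hidden size 3072 and GPT-2's 50257-token vocabulary are
# inferred; only 30 layers and ~3.5B total are stated in the README text.
n_layers = 30
d_model = 3072
vocab_size = 50257

block_params = 12 * n_layers * d_model ** 2  # attention + MLP weights, 12·L·h² rule
embed_params = vocab_size * d_model          # token embedding matrix
total = block_params + embed_params
print(f"{total / 1e9:.2f}B parameters")      # ~3.55B, consistent with the stated 3.5B
```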
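Note that the first code block is split across the two hunks: README lines 46–47 fall in the gap between them. From the second hunk's context header (`text = "Replace me by any text you'd like."`) and the standard transformers loading pattern, the full snippet plausibly reads as follows; the model-loading line is an inference, not shown in the diff:

```python
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong-GPT2-3.5B')
# Inferred: this model-loading line sits in the gap between the two hunks;
# it is the standard transformers counterpart of the tokenizer line above.
model = GPT2Model.from_pretrained('IDEA-CCNL/Wenzhong-GPT2-3.5B')

text = "Replace me by any text you'd like."  # shown in the second hunk's context header
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)  # hidden states; GPT2Model carries no LM head
```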
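Both versions of the generation block are cut off after `set_seed(55)`. A minimal sketch of how such a block typically continues, using the standard transformers text-generation pipeline; the prompt and decoding arguments below are illustrative assumptions, not taken from the diff:

```python
from transformers import pipeline, set_seed

set_seed(55)
# ASSUMPTION: everything below is illustrative; the diff ends at set_seed(55).
generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong-GPT2-3.5B')
# Chinese continuation generation, as the README describes.
print(generator("北京是中国的", max_length=30, num_return_sequences=1))
```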