dongxq committed
Commit 1a08814
1 Parent(s): e5cf564

Update README.md

Files changed (1):
  1. README.md +11 -11
README.md CHANGED
@@ -21,11 +21,11 @@ Good at solving text summarization tasks, after fine-tuning on multiple Chinese

 | 需求 Demand | 任务 Task | 系列 Series | 模型 Model | 参数 Parameter | 额外 Extra |
 | :----: | :----: | :----: | :----: | :----: | :----: |
- | 通用 General | 自然语言转换 NLT | 燃灯 Randeng | PEGASUS | 238M | 文本摘要任务-中文 Summary-Chinese |
+ | 通用 General | 自然语言转换 NLT | 燃灯 Randeng | Deltalm | 362M | 翻译任务 Zh-En |

 ## 模型信息 Model Information

- 参考论文:[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf)
+ 参考论文:[DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders](https://arxiv.org/pdf/2106.13736v2.pdf)

 ### 下游效果 Performance

@@ -36,20 +36,20 @@ Good at solving text summarization tasks, after fine-tuning on multiple Chinese
 ## 使用 Usage

 ```python
- from transformers import PegasusForConditionalGeneration, BertTokenizer
- # Need to download tokenizers_pegasus.py and other Python scripts from the Fengshenbang-LM GitHub repo in advance,
- # or you can download tokenizers_pegasus.py and data_utils.py from https://huggingface.co/IDEA-CCNL/Randeng_Pegasus_523M/tree/main
+ from transformers import AutoTokenizer
+ # Need to download modeling_deltalm.py from the Fengshenbang-LM GitHub repo in advance,
+ # or you can download modeling_deltalm.py in
 # Strongly recommend you git clone the Fengshenbang-LM repo:
 # 1. git clone https://github.com/IDEA-CCNL/Fengshenbang-LM
- # 2. cd Fengshenbang-LM/fengshen/examples/translation/
- # and then you will see the tokenizers_pegasus.py and data_utils.py which are needed by the pegasus model
+ # 2. cd Fengshenbang-LM/fengshen/examples/deltalm/
+ # and then you will see the modeling_deltalm.py which is needed by the deltalm model

- from tokenizers_pegasus import PegasusTokenizer
+ from modeling_deltalm import DeltalmForConditionalGeneration

- model = PegasusForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese")
- tokenizer = PegasusTokenizer.from_pretrained("IDEA-CCNL/Randeng-Pegasus-238M-Summary-Chinese")
+ model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En")
+ tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

- text = "在北京冬奥会自由式滑雪女子坡面障碍技巧决赛中,中国选手谷爱凌夺得银牌。祝贺谷爱凌!今天上午,自由式滑雪女子坡面障碍技巧决赛举行。决赛分三轮进行,取选手最佳成绩排名决出奖牌。第一跳,中国选手谷爱凌获得69.90分。在12位选手中排名第三。完成动作后,谷爱凌又扮了个鬼脸,甚是可爱。第二轮中,谷爱凌在道具区第三个障碍处失误,落地时摔倒。获得16.98分。网友:摔倒了也没关系,继续加油!在第二跳失误摔倒的情况下,谷爱凌顶住压力,第三跳稳稳发挥,流畅落地!获得86.23分!此轮比赛,共12位选手参赛,谷爱凌第10位出场。网友:看比赛时我比谷爱凌紧张,加油!"
+ text = ""
 inputs = tokenizer(text, max_length=1024, return_tensors="pt")

 # Generate Summary
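
The updated comment's download URL for `modeling_deltalm.py` is truncated in the diff. As a convenience, here is a minimal sketch of fetching a single file with `huggingface_hub`'s `hf_hub_download`; the `repo_id` below is an assumption (the commit does not say where the file is hosted), so fall back to the documented `git clone` route if the file is absent there.

```python
# Hedged sketch: fetch modeling_deltalm.py without cloning the whole repo.
# ASSUMPTION: the file is published in the model repo itself; the diff's
# download URL is truncated, so this repo_id is a guess, not a documented fact.
import os
import sys

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="IDEA-CCNL/Randeng-Deltalm-362M-Zh-En",  # assumed host repo
    filename="modeling_deltalm.py",
)
sys.path.insert(0, os.path.dirname(path))  # lets `from modeling_deltalm import ...` resolve
print(path)
```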
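The updated usage snippet is cut off at the `# Generate Summary` comment (itself a leftover from the Pegasus card) and leaves `text` empty. Below is a minimal end-to-end sketch of how the translation step typically continues, using the standard `transformers` `generate`/`batch_decode` API; the example sentence and the generation settings are illustrative assumptions, not values from the commit.

```python
# Minimal sketch, assuming modeling_deltalm.py is importable (see above).
from transformers import AutoTokenizer

from modeling_deltalm import DeltalmForConditionalGeneration

model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En")
tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

# The commit leaves `text` empty; this Zh source sentence is an illustrative assumption.
text = "今天天气很好,我们去公园散步吧。"
inputs = tokenizer(text, max_length=1024, return_tensors="pt")

# Generate the En translation; max_length=128 is an assumed setting.
generate_ids = model.generate(inputs["input_ids"], max_length=128)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])
```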