dongxq committed on
Commit
2eb44eb
1 Parent(s): 7c649b2

Update README.md

Files changed (1)
  1. README.md +11 -10
README.md CHANGED
@@ -10,12 +10,14 @@ inference: False

  # Randeng-Deltalm-362M-Zh-En


  ## 简介 Brief Introduction

- Using the Fengshen framework, fine-tuned from DeltaLM on a collected Chinese-English dataset to obtain a Chinese->English translation model

- Using the Fengshen-LM framework, fine-tune DeltaLM on the collected Chinese-English dataset to obtain a translation model in the Chinese->English direction

  ## 模型分类 Model Taxonomy

@@ -36,27 +38,26 @@ Using the Fengshen-LM framework, on the collected Chinese-English dataset, finet
  ## 使用 Usage

  ```python
- from transformers import AutoTokenizer
  # Need to download modeling_deltalm.py from Fengshenbang-LM github repo in advance,
  # or you can download modeling_deltalm.py in
  # Strongly recommend you git clone the Fengshenbang-LM repo:
  # 1. git clone https://github.com/IDEA-CCNL/Fengshenbang-LM
  # 2. cd Fengshenbang-LM/fengshen/examples/deltalm/
- # and then you will see the modeling_deltalm.py which are needed by deltalm model

  from modeling_deltalm import DeltalmForConditionalGeneration

  model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En")
  tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

- text = ""
- inputs = tokenizer(text, max_length=1024, return_tensors="pt")

- # Generate Summary
- summary_ids = model.generate(inputs["input_ids"])
- tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]

- # model Output: 滑雪女子坡面障碍技巧决赛谷爱凌获银牌
  ```

  ## 引用 Citation
 

  # Randeng-Deltalm-362M-Zh-En

+ - GitHub: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/examples/)
+ - Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/zh/latest/docs/%E7%87%83%E7%81%AF%E7%B3%BB%E5%88%97/)

  ## 简介 Brief Introduction

+ Fine-tuned from DeltaLM base with the Fengshen framework on a collected Chinese-English dataset (30 million pairs in total) plus the IWSLT Chinese-English parallel data (200,000 pairs), yielding a Chinese->English translation model

+ Using the Fengshen-LM framework, we fine-tuned DeltaLM to obtain a translation model in the Chinese->English direction

  ## 模型分类 Model Taxonomy

 
  ## 使用 Usage

  ```python
+
  # Need to download modeling_deltalm.py from Fengshenbang-LM github repo in advance,
  # or you can download modeling_deltalm.py in
  # Strongly recommend you git clone the Fengshenbang-LM repo:
  # 1. git clone https://github.com/IDEA-CCNL/Fengshenbang-LM
  # 2. cd Fengshenbang-LM/fengshen/examples/deltalm/

  from modeling_deltalm import DeltalmForConditionalGeneration
+ from transformers import AutoTokenizer

  model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En")
  tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

+ text = "尤其在夏天,如果你决定徒步穿越雨林,就需要小心蚊子。"
+ inputs = tokenizer(text, max_length=512, return_tensors="pt")

+ generate_ids = model.generate(inputs["input_ids"], max_length=512)
+ tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]

+ # model Output:
  ```

  ## 引用 Citation
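
Note on the usage snippet in this diff: the calls it makes could be wrapped in a small reusable helper, sketched below. This is only an illustration, not part of the commit. The function names `load_model_and_tokenizer` and `translate` are hypothetical, and `truncation=True` is an assumption added here so that `max_length` actually caps the tokenized input; the README code does not pass it. The sketch assumes `modeling_deltalm.py` is on the import path, as the README comments describe.

```python
def load_model_and_tokenizer(
    model_name="IDEA-CCNL/Randeng-Deltalm-362M-Zh-En",
    tokenizer_name="microsoft/infoxlm-base",
):
    """Load the checkpoint and tokenizer named in the README (hypothetical helper)."""
    # Local imports: modeling_deltalm.py must be importable, e.g. by running
    # from Fengshenbang-LM/fengshen/examples/deltalm/ after cloning the repo.
    from modeling_deltalm import DeltalmForConditionalGeneration
    from transformers import AutoTokenizer

    model = DeltalmForConditionalGeneration.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    return model, tokenizer


def translate(text, model, tokenizer, max_length=512):
    """Translate one Chinese sentence to English (hypothetical helper)."""
    # Assumption: truncation=True is added so inputs longer than max_length are cut.
    inputs = tokenizer(
        text, max_length=max_length, truncation=True, return_tensors="pt"
    )
    # Same generate/decode calls as the README snippet.
    generated_ids = model.generate(inputs["input_ids"], max_length=max_length)
    return tokenizer.batch_decode(
        generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]
```

A caller would run `model, tokenizer = load_model_and_tokenizer()` once and then call `translate(text, model, tokenizer)` per sentence, which avoids reloading the checkpoint for every input.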