Update README.md
README.md CHANGED
@@ -10,12 +10,14 @@ inference: False
# Randeng-Deltalm-362M-Zh-En

- Github: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/examples/)
- Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/zh/latest/docs/%E7%87%83%E7%81%AF%E7%B3%BB%E5%88%97/)

## 简介 Brief Introduction

Using the Fengshen-LM framework, we finetuned Deltalm-base on a collected Chinese-English dataset (about 30 million sentence pairs) together with the IWSLT Chinese-English parallel corpus (about 200 thousand pairs), obtaining a Chinese -> English translation model.

## 模型分类 Model Taxonomy

@@ -36,27 +38,26 @@ Using the Fengshen-LM framework, on the collected Chinese-English dataset, finet

## 使用 Usage

```python
# You need to download modeling_deltalm.py from the Fengshenbang-LM GitHub repo in advance,
# or you can download modeling_deltalm.py separately.
# Strongly recommended: git clone the Fengshenbang-LM repo:
# 1. git clone https://github.com/IDEA-CCNL/Fengshenbang-LM
# 2. cd Fengshenbang-LM/fengshen/examples/deltalm/
# There you will find the modeling_deltalm.py that the Deltalm model needs; run this
# script from that directory (or add it to your PYTHONPATH) so the import below works.

from modeling_deltalm import DeltalmForConditionalGeneration
from transformers import AutoTokenizer

model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En")
tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

text = "尤其在夏天,如果你决定徒步穿越雨林,就需要小心蚊子。"
inputs = tokenizer(text, max_length=512, return_tensors="pt")

generate_ids = model.generate(inputs["input_ids"], max_length=512)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])

# model Output:
```
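
If you want to reuse the model for more than one sentence, a small wrapper can keep the boilerplate in one place. The sketch below is illustrative and not part of the original card: the `translate` helper, the `num_beams` value, and the GPU handling are assumptions layered on top of the calls shown above.

```python
import torch
from modeling_deltalm import DeltalmForConditionalGeneration
from transformers import AutoTokenizer

# Hypothetical convenience wrapper around the usage shown above; not from the original card.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = DeltalmForConditionalGeneration.from_pretrained("IDEA-CCNL/Randeng-Deltalm-362M-Zh-En").to(device)
tokenizer = AutoTokenizer.from_pretrained("microsoft/infoxlm-base")

def translate(text: str, max_length: int = 512, num_beams: int = 4) -> str:
    """Translate a single Chinese sentence into English."""
    inputs = tokenizer(text, max_length=max_length, truncation=True, return_tensors="pt").to(device)
    generate_ids = model.generate(inputs["input_ids"], max_length=max_length, num_beams=num_beams)
    return tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]

print(translate("尤其在夏天,如果你决定徒步穿越雨林,就需要小心蚊子。"))
```

Beam search usually gives slightly better translations than greedy decoding at the cost of speed; adjust `num_beams` to taste.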

## 引用 Citation