	---
	language:
	- zh
	license: apache-2.0

	tags:
	- T5
	- chinese
	- sentencepiece

	inference: true

	widget:
	- text: "北京有悠久的 <extra_id_0>和 <extra_id_1>。"
	- type: "text-generation"

	---
	# Randeng-T5-784M, a model from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
	Randeng-T5-784M is based on mt5-large. It retains only the vocabulary and embeddings corresponding to Chinese and English, and continues pretraining on a 180G Chinese general pre-training corpus. Because pretraining continues from mt5-large, the tokenizer is the T5 tokenizer (SentencePiece). The pretraining objective is span corruption. We pretrained the model with our [fengshen framework](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen), using 16 A100 GPUs for 98 hours.
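As an illustration of the span corruption objective, here is a minimal sketch of how an input/target pair can be built with T5-style sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...). The function name `span_corrupt`, the English example sentence, and the hand-picked spans are assumptions for illustration only; this is not the fengshen pretraining code, which would sample spans randomly over SentencePiece tokens.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, length) span with a sentinel token.

    Returns (input_tokens, target_tokens) in the T5 span-corruption layout.
    `spans` must be non-overlapping; they are processed in order of start index.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, length) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])           # keep the text before the span
        inputs.append(sentinel)                       # mask the span in the input
        targets.append(sentinel)                      # the target names the sentinel...
        targets.extend(tokens[start:start + length])  # ...followed by the masked text
        cursor = start + length
    inputs.extend(tokens[cursor:])                    # keep the text after the last span
    targets.append(f"<extra_id_{len(spans)}>")        # closing sentinel ends the target
    return inputs, targets


# Hypothetical example sentence, split on whitespace for readability.
tokens = "Beijing has a long history and rich culture .".split()
inputs, targets = span_corrupt(tokens, [(4, 1), (6, 2)])
print(" ".join(inputs))
print(" ".join(targets))
```

The model is trained to produce the target sequence given the corrupted input, which is why prompts for this checkpoint contain `<extra_id_N>` placeholders.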
	## Usage
	```python
	from transformers import T5ForConditionalGeneration, AutoTokenizer
	import torch
	# Load the slow SentencePiece-based tokenizer (use_fast=False) and the model weights.
	tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-784M', use_fast=False)
	model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-784M')
	```
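Once the model has generated text for a prompt containing sentinels, the predicted spans need to be merged back into the prompt. The sketch below assumes the output was decoded with special tokens kept, so it has the usual T5 layout `<extra_id_0> span <extra_id_1> span ...`. The helper names `parse_spans` and `fill_template` are hypothetical, and the `generated` string is a made-up example, not real output from this model.

```python
import re

SENTINEL = re.compile(r"<extra_id_(\d+)>")


def parse_spans(generated):
    """Map each sentinel index to the text that follows it in a decoded output."""
    # Drop padding and end-of-sequence markers that a raw decode may include.
    generated = generated.replace("<pad>", "").replace("</s>", "")
    # With one capture group, split() yields [prefix, idx0, text0, idx1, text1, ...].
    parts = SENTINEL.split(generated)
    return {int(parts[i]): parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}


def fill_template(template, generated):
    """Substitute predicted spans for the matching sentinels in the prompt."""
    spans = parse_spans(generated)
    # Sentinels with no prediction are left in place unchanged.
    return SENTINEL.sub(lambda m: spans.get(int(m.group(1)), m.group(0)), template)


template = "Beijing has a long <extra_id_0> and <extra_id_1>."
generated = "<pad> <extra_id_0> history <extra_id_1> culture <extra_id_2></s>"  # made-up output
filled = fill_template(template, generated)
print(filled)
```

In practice `generated` would come from something like `tokenizer.decode(model.generate(**tokenizer(template, return_tensors='pt'))[0])`. Passing `skip_special_tokens=True` to `decode` would likely strip the sentinels as well, so this sketch assumes they are kept.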
	## Citation
	If you find this resource useful, please cite the following website in your paper.
	```
	@misc{Fengshenbang-LM,
	title={Fengshenbang-LM},
	author={IDEA-CCNL},
	year={2022},
	howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
	}
	```