---
language:
  - zh
license: apache-2.0
tags:
  - T5
  - chinese
  - sentencepiece
inference: true
widget:
  - text: 北京有悠久的 <extra_id_0>和 <extra_id_1>。
    type: text-generation
---

# Randeng-T5-784M

Randeng-T5-784M is one of the models of the Fengshenbang-LM project.

Based on mt5-large, Randeng-T5-784M retains only the vocabulary and embeddings corresponding to Chinese and English, and was further pre-trained on 180 GB of general-domain Chinese corpus. The pre-training objective is span corruption. We pre-trained the model with our Fengshen framework on 16 A100 GPUs for 98 hours.
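For concreteness, here is a small illustrative sketch of what a span-corruption training pair looks like; the sentence and the reconstructed spans below are made-up examples, not drawn from the actual training corpus. Contiguous spans of the input are replaced with sentinel tokens `<extra_id_0>`, `<extra_id_1>`, ..., and the target sequence lists the removed spans after their sentinels.

```python
# Illustrative span-corruption pair (hypothetical example, not from the training data).
# Original sentence: 北京有悠久的历史和文化。
corrupted_input = "北京有悠久的 <extra_id_0>和 <extra_id_1>。"       # spans replaced by sentinel tokens
target = "<extra_id_0> 历史 <extra_id_1> 文化 <extra_id_2>"          # model learns to reconstruct the spans
```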

## Usage

```python
from transformers import T5ForConditionalGeneration, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-784M', use_fast=False)
model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-784M')
```
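
Continuing from the snippet above, a minimal generation sketch (not part of the original card) that runs the widget example through the loaded model; `max_new_tokens=20` is an arbitrary illustrative setting, and the actual completion depends on the decoding parameters you choose.

```python
# Ask the model to fill in the masked spans; output varies with decoding settings.
text = "北京有悠久的 <extra_id_0>和 <extra_id_1>。"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```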

## Citation

If you find this resource useful, please cite the following website in your paper.

```bibtex
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2022},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```