---
language:
- zh
license: apache-2.0
tags:
- T5
- chinese
- sentencepiece
inference: true
widget:
- text: "北京有悠久的 <extra_id_0>和 <extra_id_1>。"
---
|
# Randeng-T5-784M, one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
|
Based on mt5-large, Randeng-T5-784M retains only the vocabulary and embeddings corresponding to Chinese and English, and was further pretrained on 180 GB of general-domain Chinese corpus. The pretraining objective is span corruption. We pretrained the model with our [fengshen framework](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen) on 16 A100 GPUs for 98 hours.
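To make the span-corruption objective concrete, here is a minimal sketch (the helper name and the word-level tokenization are ours, for illustration only): contiguous spans are replaced in the input by sentinel tokens `<extra_id_0>`, `<extra_id_1>`, ..., and the target lists each sentinel followed by the span it hid.

```python
def span_corrupt(tokens, spans):
    """Build a T5-style span-corruption (input, target) pair.

    tokens: list of tokens; spans: list of (start, end) index pairs to mask.
    """
    input_toks, target_toks = [], []
    prev_end = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        # Unmasked text plus a sentinel goes to the input...
        input_toks.extend(tokens[prev_end:start])
        input_toks.append(sentinel)
        # ...while the target pairs each sentinel with the hidden span.
        target_toks.append(sentinel)
        target_toks.extend(tokens[start:end])
        prev_end = end
    input_toks.extend(tokens[prev_end:])
    return " ".join(input_toks), " ".join(target_toks)

toks = "北京 有 悠久 的 历史 和 文化".split()
inp, tgt = span_corrupt(toks, [(4, 5), (6, 7)])
print(inp)  # 北京 有 悠久 的 <extra_id_0> 和 <extra_id_1>
print(tgt)  # <extra_id_0> 历史 <extra_id_1> 文化
```

This is also why the widget example above contains `<extra_id_0>` and `<extra_id_1>`: at inference time the model fills in the masked spans.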
|
## Usage
|
```python
import torch
from transformers import T5ForConditionalGeneration, AutoTokenizer

# use_fast=False keeps the sentencepiece-based tokenizer
tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-784M', use_fast=False)
model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-784M')
```
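A minimal fill-in-the-blank sketch follows (assuming the checkpoint can be downloaded; generation settings are ours, not a recommended configuration): give the model an input containing sentinel tokens and decode its prediction for the masked spans.

```python
import torch
from transformers import T5ForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Randeng-T5-784M', use_fast=False)
model = T5ForConditionalGeneration.from_pretrained('IDEA-CCNL/Randeng-T5-784M')

# Mask two spans with sentinel tokens, matching the span-corruption pretraining format.
text = "北京有悠久的 <extra_id_0>和 <extra_id_1>。"
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs['input_ids'], max_new_tokens=20)
# Keep special tokens so the <extra_id_n> markers in the prediction stay visible.
decoded = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(decoded)
```

The decoded output interleaves the sentinel tokens with the model's guesses for each masked span.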
|
## Citation

If you find this resource useful, please cite the following website in your paper.
|
```bibtex
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2022},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```