metadata

language:
  - zh
license: apache-2.0
tags:
  - classification
inference: false

Erlangshen-TCBert-110M-Classification-Chinese

GitHub: Fengshenbang-LM
Docs: Fengshenbang-Docs

Brief Introduction

TCBert (Topic Classification BERT) with 110M parameters is pre-trained for Chinese topic classification tasks, though its use is not limited to them.

Model Taxonomy

Demand	Task	Series	Model	Parameter	Extra
General	NLU	Erlangshen	TCBert	110M	Chinese

Model Information

To improve performance on topic classification, we collected a large number of topic classification datasets and pre-trained the model on them using general prompts.

Performance

Stay tuned!!!

Usage

from transformers import BertForMaskedLM, BertTokenizer
import torch

# Load the tokenizer and the masked-LM checkpoint from the Hugging Face Hub
model_name = "IDEA-CCNL/Erlangshen-TCBert-110M-Classification-Chinese"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForMaskedLM.from_pretrained(model_name)
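
Because the checkpoint is loaded as a masked language model, one plausible way to use it for topic classification is to wrap the input in a prompt containing `[MASK]` slots and compare how well each candidate label fills them. The sketch below is an illustration under assumptions, not the authors' documented procedure: the prompt template (roughly "The following is a news item about [MASK][MASK]:"), the helper names `build_prompt`, `pick_label`, and `classify`, and the requirement that all labels have the same number of characters are all hypothetical choices made here.

```python
def build_prompt(text, label_len, mask_token="[MASK]"):
    # Hypothetical template: one mask slot per character of the topic label.
    return f"下面是一则关于{mask_token * label_len}的新闻：{text}"


def pick_label(mask_log_probs, label_token_ids):
    # mask_log_probs: one sequence of log-probabilities (indexed by vocab id)
    #                 per [MASK] slot, in left-to-right order.
    # label_token_ids: dict mapping each label to its token ids, one per slot.
    scores = {
        label: sum(mask_log_probs[i][tid] for i, tid in enumerate(ids))
        for label, ids in label_token_ids.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def classify(text, labels, tokenizer, model):
    # torch is imported lazily so the helpers above work without it.
    import torch

    # Assumption: every candidate label has the same number of characters.
    if len({len(label) for label in labels}) != 1:
        raise ValueError("all labels must have the same length")

    prompt = build_prompt(text, len(labels[0]), tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # shape: (sequence_length, vocab_size)

    # Locate the [MASK] slots and turn their logits into log-probabilities.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero(as_tuple=True)[0]
    log_probs = torch.log_softmax(logits[mask_positions], dim=-1).tolist()

    # Assumption: each label character is a single token in the vocabulary.
    label_ids = {
        label: tokenizer.convert_tokens_to_ids(list(label)) for label in labels
    }
    return pick_label(log_probs, label_ids)
```

With `tokenizer` and `model` loaded as above, calling `classify(some_text, ["体育", "财经", "科技"], tokenizer, model)` would return the highest-scoring label together with the per-label scores. Summing log-probabilities over the mask slots treats the slots as independent, which is a simplification.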

If you use our model in your work, you can cite our website:

@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}