

简介 Brief Introduction


The champion solution of AIWIN 2022, Chinese UBERT-Base (110M) adopts a unified framework to handle multiple information-extraction tasks.

模型分类 Model Taxonomy

| 需求 Demand | 任务 Task | 系列 Series | 模型 Model | 参数 Parameter | 额外 Extra |
| :----: | :----: | :----: | :----: | :----: | :----: |
| 通用 General | 自然语言理解 NLU | 二郎神 Erlangshen | UBERT | 110M | 中文 Chinese |

模型信息 Model Information

参考论文 Reference paper: Unified BERT for Few-shot Natural Language Understanding


UBERT was the winning solution in the 2022 AIWIN World Artificial Intelligence Innovation Competition: Chinese Insurance Small Sample Multi-Task track, taking first place on both leaderboard A and leaderboard B. We developed a unified framework based on a BERT-like backbone to handle multiple tasks and objectives. Since the competition datasets are not publicly available, we carefully collected over 70 datasets (1,065,069 samples in total) covering a variety of tasks to train the open-source UBERT, with MacBERT-Base as the backbone. Beyond its out-of-the-box functionality, UBERT can be applied in various scenarios such as NLI, entity recognition, and reading comprehension. Example code can be found on GitHub.

使用 Usage

Install fengshen from source:

git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
cd Fengshenbang-LM
pip install --editable ./

Run the code:

import argparse
from fengshen import UbertPiplines

total_parser = argparse.ArgumentParser("TASK NAME")
total_parser = UbertPiplines.piplines_args(total_parser)
args = total_parser.parse_args()

args.pretrained_model_path = "IDEA-CCNL/Erlangshen-Ubert-110M-Chinese"

        "task_type": "抽取任务", 
        "subtask_type": "实体识别", 
        "text": "这也让很多业主据此认为,雅清苑是政府公务员挤对了国家的经适房政策。", 
        "choices": [ 
            {"entity_type": "小区名字"}, 
            {"entity_type": "岗位职责"}
        "id": 0}

model = UbertPiplines(args)
result = model.predict(test_data)
for line in result:
    print(line)
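When extracting entities from many texts, the request dicts shown above can be assembled programmatically. The sketch below is plain Python with no fengshen dependency; the field names mirror the usage example above, and `build_request` is a hypothetical helper of ours, not part of the fengshen library:

```python
def build_request(text, entity_types, task_id=0,
                  task_type="抽取任务", subtask_type="实体识别"):
    """Assemble one UBERT entity-extraction request dict.

    Field names follow the usage example above: each candidate
    entity type becomes one entry in the "choices" list.
    """
    return {
        "task_type": task_type,
        "subtask_type": subtask_type,
        "text": text,
        "choices": [{"entity_type": t} for t in entity_types],
        "id": task_id,
    }

# Rebuild the single-sample batch from the example above
test_data = [
    build_request(
        "这也让很多业主据此认为,雅清苑是政府公务员挤对了国家的经适房政策。",
        ["小区名字", "岗位职责"],
    )
]
```

The resulting list can be passed to `model.predict(test_data)` exactly as in the example above.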

引用 Citation


If you are using the resource for your work, please cite our paper for this model:

@article{DBLP:journals/corr/abs-2206-12094,
  author    = {JunYu Lu and
               Ping Yang and
               Jiaxing Zhang and
               Ruyi Gan and
               Jing Yang},
  title     = {Unified {BERT} for Few-shot Natural Language Understanding},
  journal   = {CoRR},
  volume    = {abs/2206.12094},
  year      = {2022}
}


If you are using the resource for your work, please cite our overview paper:

@article{DBLP:journals/corr/abs-2209-02970,
  author    = {Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen and Ruyi Gan and Jiaxing Zhang},
  title     = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
  journal   = {CoRR},
  volume    = {abs/2209.02970},
  year      = {2022}
}


You can also cite our website:

@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
