# ernie-health-zh

## Introduction
ERNIE-health is a Chinese biomedical language model pre-trained on in-domain text drawn from de-identified online doctor-patient dialogues, electronic medical records, and textbooks.

More details:

- https://github.com/PaddlePaddle/Research/tree/master/KG/eHealth
- https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-health
- https://arxiv.org/pdf/2110.07244.pdf

## Released Model Info

|Model Name|Language|Model Structure|
|:---:|:---:|:---:|
|ernie-health-zh| Chinese |Layer:12, Hidden:768, Heads:12|
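
As a quick sanity check on the structure listed above, the sketch below records the three numbers from the table and derives the per-head dimension from them. The `EncoderShape` class and the `ERNIE_HEALTH_ZH` constant are hypothetical names introduced only for this illustration; they are not part of `transformers` or of the released checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderShape:
    """Illustrative container for the values in the table (hypothetical helper)."""

    layers: int
    hidden: int
    heads: int

    @property
    def head_dim(self) -> int:
        # In a standard multi-head attention layer the hidden size is split
        # evenly across heads, so it must be divisible by the head count.
        if self.hidden % self.heads != 0:
            raise ValueError("hidden size must be divisible by the number of heads")
        return self.hidden // self.heads


# Values copied from the table: Layer:12, Hidden:768, Heads:12.
ERNIE_HEALTH_ZH = EncoderShape(layers=12, hidden=768, heads=12)

print(ERNIE_HEALTH_ZH.head_dim)  # 768 // 12 = 64
```

Each attention head therefore works on a 64-dimensional slice of the 768-dimensional hidden state, assuming the usual even split across heads.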

This PyTorch model was converted from the officially released PaddlePaddle ERNIE model, and a series of experiments was run to check the accuracy of the conversion.

- Official PaddlePaddle ERNIE repo: https://github.com/PaddlePaddle/Research/tree/master/KG/eHealth
- PyTorch conversion repo: https://github.com/nghuyong/ERNIE-Pytorch

## How to use
```Python
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-health-zh")
model = AutoModel.from_pretrained("nghuyong/ernie-health-zh")
```
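
Once the tokenizer and model are loaded, one common way to use them is to turn sentences into fixed-size vectors. The following is a minimal sketch, assuming `torch` is installed and the checkpoint can be downloaded. Mean pooling over non-padding tokens is a choice made for this illustration, not a method prescribed by the model authors, and `mean_pool` and `embed` are hypothetical helper names.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padding positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so it broadcasts over the hidden dim.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


def embed(texts, model_name="nghuyong/ernie-health-zh"):
    """Encode a list of strings into one vector per string (downloads the model)."""
    # Imported lazily so that mean_pool can be used without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for deterministic inference

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])
```

Calling `embed(["患者主诉头痛三天。", "高血压"])` should return a tensor with one row per input sentence, whose second dimension matches the hidden size listed in the model table.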

## Citation

```bibtex
@article{wang2021building,
  title={Building Chinese Biomedical Language Models via Multi-Level Text Discrimination},
  author={Wang, Quan and Dai, Songtai and Xu, Benfeng and Lyu, Yajuan and Zhu, Yong and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2110.07244},
  year={2021}
}
```