This model repository presents "TinyPubMedBERT", a distilled version of PubMedBERT (Gu et al., 2021).
The model has 4 layers and was distilled following the methods introduced in the TinyBERT paper (Jiao et al., 2020).
- For the framework, please visit https://github.com/AstraZeneca/KAZU
- For the demo, please visit http://kazu.korea.ac.kr
- For details about the model, please see our paper "Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework" (EMNLP 2022 Industry Track).
TinyPubMedBERT provides the initial weights for training dmis-lab/KAZU-NER-module-distil-v1.0, the NER module of the KAZU (Korea University and AstraZeneca) framework.
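Since TinyPubMedBERT is a standard BERT-style checkpoint, it can typically be loaded with the Hugging Face transformers library. The sketch below assumes the repository id `dmis-lab/TinyPubMedBERT-v1.0` and a PyTorch backend; adjust the id to match this repository if it differs.

```python
# Minimal sketch: loading TinyPubMedBERT with Hugging Face transformers.
# The repository id "dmis-lab/TinyPubMedBERT-v1.0" is an assumption; use the id of this repo.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/TinyPubMedBERT-v1.0")
model = AutoModel.from_pretrained("dmis-lab/TinyPubMedBERT-v1.0")

# Encode a biomedical sentence and obtain contextual embeddings
# from the 4-layer distilled encoder.
inputs = tokenizer(
    "BRCA1 mutations are associated with breast cancer.",
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```

These weights are intended as a starting point for downstream fine-tuning (e.g., token classification for NER), rather than as a ready-to-use NER model.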
Citation info
The paper is under the joint first authorship of Richard Jackson (AstraZeneca) and WonJin Yoon (Korea University).
Please cite the paper using the BibTeX entry below, or find the full citation information on the ACL Anthology page listed in the entry.
@inproceedings{YoonAndJackson2022BiomedicalNER,
    title = "Biomedical {NER} for the Enterprise with Distillated {BERN}2 and the Kazu Framework",
    author = "Yoon, Wonjin and Jackson, Richard and Ford, Elliot and Poroshin, Vladimir and Kang, Jaewoo",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-industry.63",
    pages = "619--626",
}
This model used resources from the PubMedBERT paper and the TinyBERT paper:
- Gu, Yu, et al. "Domain-specific language model pretraining for biomedical natural language processing." ACM Transactions on Computing for Healthcare (HEALTH) 3.1 (2021): 1-23.
- Jiao, Xiaoqi, et al. "TinyBERT: Distilling BERT for Natural Language Understanding." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
Contact Information
For help or issues using the code or model (the NER module of KAZU) in this repository, please contact WonJin Yoon (wonjin.info (at) gmail.com) or submit a GitHub issue.