BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA Abstract The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models. This model is fine-tuned on the SQuAD2.0 dataset. Fine-tuning the biomedical language model on the SQuAD dataset helps improve the score on the BioASQ challenge. If you plan to work with BioASQ or biomedical QA tasks, it's better to use this model over BioM-ALBERT-xxlarge. Huggingface library doesn't implement the Layer-Wise decay feature, which affects the performance on the SQuAD task. The reported result of BioM-ALBERT-xxlarge-SQuAD in our paper is 87.00 (F1) since we use ALBERT open-source code with TF checkpoint, which uses Layer-Wise decay. Citation ```bibtex @inproceedings{alrowili-shanker-2021-biom, title = "{B}io{M}-Transformers: Building Large Biomedical Language Models with {BERT}, {ALBERT} and {ELECTRA}", author = "Alrowili, Sultan and Shanker, Vijay", booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing", month = jun, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://www.aclweb.org/anthology/2021.bionlp-1.24", pages = "221--227", abstract = "The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.", } ```