sultan committed on
Commit
b8ff32d
1 Parent(s): 8c7e5f6

Create README.md

  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
1
+ # BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA
2
+
3
+ # Abstract
4
+
5
+
6
+ The impact of design choices on the performance
7
+ of biomedical language models recently
8
+ has been a subject for investigation. In
9
+ this paper, we empirically study biomedical
10
+ domain adaptation with large transformer models
11
+ using different design choices. We evaluate
12
+ the performance of our pretrained models
13
+ against other existing biomedical language
14
+ models in the literature. Our results show that
15
+ we achieve state-of-the-art results on several
16
+ biomedical domain tasks despite using similar
17
+ or less computational cost compared to other
18
+ models in the literature. Our findings highlight
19
+ the significant effect of design choices on
20
+ improving the performance of biomedical language
21
+ models.
22
+
23
+ # Model Description
24
+
25
+ This model was pre-trained with the ELECTRA implementation of BERT, which omits Next Sentence Prediction and introduces a dynamic masking loss function in place of the ELECTRA function. Because the model uses the ELECTRA implementation of BERT, its architecture in the Hugging Face library is ELECTRA. The model was pre-trained on a TPUv3-512 for 690K steps with a batch size of 4,192 on both PubMed abstracts and PMC full-text articles, together with a general-domain vocabulary (EN Wiki + Books). These design choices help the model achieve state-of-the-art results on certain biomedical text classification tasks such as ChemProt.
26
+ To help researchers with limited resources fine-tune larger models, we created an example with PyTorch XLA. PyTorch XLA (https://github.com/pytorch/xla) is a library that lets you run PyTorch on TPUs, which Google Colab and Kaggle provide for free. Follow this example to work with PyTorch/XLA: [Link](https://github.com/salrowili/BioM-Transformers/blob/main/examples/Fine_Tuning_Biomedical_Models_on_Text_Classification_Task_With_HuggingFace_Transformers_and_PyTorch_XLA.ipynb). In this example, we achieve a micro F1 score of 80.74 on the ChemProt task with BioM-ALBERT-xxlarge. Fine-tuning takes 43 minutes for 5 epochs.
27
+
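For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-averaged F1 score like the one reported above can be computed. It is not taken from the linked notebook. The label names (`CPR:3`, `CPR:4`, `CPR:5`, `CPR:6`, `CPR:9`, `false`) and the choice to exclude the negative class from the average are assumptions for illustration, and `micro_f1` is a hypothetical helper name.

```python
def micro_f1(y_true, y_pred, positive_labels):
    """Micro-averaged F1 over `positive_labels`.

    True positives, false positives, and false negatives are pooled
    across all positive classes before precision and recall are computed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    positive = set(positive_labels)
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        if pred in positive:
            if pred == gold:
                tp += 1  # correct positive prediction
            else:
                fp += 1  # predicted a positive class that is not the gold label
        if gold in positive and pred != gold:
            fn += 1      # gold positive class was missed

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Assumed label set: the relation classes are scored, "false" (no relation) is not.
POSITIVE = ["CPR:3", "CPR:4", "CPR:5", "CPR:6", "CPR:9"]

# Toy data, invented for this example only.
gold = ["CPR:3", "CPR:4", "false", "CPR:9", "CPR:4"]
pred = ["CPR:3", "CPR:9", "CPR:4", "CPR:9", "CPR:4"]

print(f"micro F1 = {micro_f1(gold, pred, POSITIVE):.4f}")
```

On the toy data there are 3 true positives, 2 false positives, and 1 false negative, so precision is 0.6, recall is 0.75, and the micro F1 is about 0.6667.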
28
+ Check our GitHub repo at https://github.com/salrowili/BioM-Transformers for TensorFlow and GluonNLP checkpoints. We also updated this repo with a couple of examples showing how to fine-tune LMs on text classification and question answering tasks such as ChemProt, SQuAD, and BioASQ.
29
+
30
+ # Colab Notebook Examples
31
+
32
+
33
+ BioM-ELECTRA-Large on NER and ChemProt tasks [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/BioM-Transformers/blob/main/examples/Example_of_NER_and_ChemProt_Task_on_TPU.ipynb)
34
+
35
+ BioM-ELECTRA-Large on SQuAD2.0 and BioASQ7B Factoid tasks [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/BioM-Transformers/blob/main/examples/Example_of_SQuAD2_0_and_BioASQ7B_tasks_with_BioM_ELECTRA_Large_on_TPU.ipynb)
36
+
37
+ BioM-ALBERT-xxlarge on SQuAD2.0 and BioASQ7B Factoid tasks [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/BioM-Transformers/blob/main/examples/Example_of_SQuAD2_0_and_BioASQ7B_tasks_with_BioM_ALBERT_xxlarge_on_TPU.ipynb)
38
+
39
+ Text Classification Task with HuggingFace Transformers and PyTorch XLA on Free TPU [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/BioM-Transformers/blob/main/examples/Fine_Tuning_Biomedical_Models_on_Text_Classification_Task_With_HuggingFace_Transformers_and_PyTorch_XLA.ipynb)
40
+
41
+
42
+
43
+ [COLAB]: https://colab.research.google.com/assets/colab-badge.svg
44
+ # Acknowledgment
45
+
46
+ We would like to acknowledge the support of the TensorFlow Research Cloud (TFRC) team, which granted us access to TPUv3 units.
47
+
48
+ # Citation
49
+
50
+ ```bibtex
51
+ @inproceedings{alrowili-shanker-2021-biom,
52
+ title = "{B}io{M}-Transformers: Building Large Biomedical Language Models with {BERT}, {ALBERT} and {ELECTRA}",
53
+ author = "Alrowili, Sultan and
54
+ Shanker, Vijay",
55
+ booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing",
56
+ month = jun,
57
+ year = "2021",
58
+ address = "Online",
59
+ publisher = "Association for Computational Linguistics",
60
+ url = "https://www.aclweb.org/anthology/2021.bionlp-1.24",
61
+ pages = "221--227",
62
+ abstract = "The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.",
63
+ }
64
+ ```