---
language: "en"
tags:
- buy-intent
- sell-intent
- consumer-intent
widget:
- text: "Coronavirus disease (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus."
---

Chemical-domain language model fine-tuned on 13K chemical and 14K pharmaceutical Wikipedia articles, broken into paragraphs.

# Chemical vs Pharmaceutical Domain Document Classifier

| Train Loss | Validation Acc. | Test Acc. |
| ------------- |:-------------: | -----: |
| 0.17 | 0.928 | 0.927 |

# Dataset

The dataset, including its splits, is available at [https://www.kaggle.com/shahrukhkhan/pharma-vs-chemicals-domain-classification](https://www.kaggle.com/shahrukhkhan/pharma-vs-chemicals-domain-classification)

# Label Mappings

LABEL_0 => **"CHEMICAL"**
LABEL_1 => **"PHARMACEUTICAL"**

## Usage in Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")
model = AutoModelForSequenceClassification.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")
```
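
Loading the tokenizer and model is only the first step; to get a prediction, the logits still have to be turned into one of the two labels. The following is a minimal sketch of that step, not an official recipe. The helper names `softmax`, `decode`, and `classify` are hypothetical, the `ID2LABEL` dictionary simply restates the label mappings listed in this card, and `classify` assumes that PyTorch is installed alongside `transformers` and that the model can be downloaded from the Hub.

```python
import math

# Label mapping as stated in the "Label Mappings" section of this card.
ID2LABEL = {0: "CHEMICAL", 1: "PHARMACEUTICAL"}

MODEL_ID = "recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Map one row of classifier logits to (label, probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(text, model_id=MODEL_ID):
    """Classify one paragraph (hypothetical helper; downloads the model on first use)."""
    # Imported lazily so that decode() can be used without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate long paragraphs to the tokenizer's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode(logits)
```

Calling `classify("Aspirin is used to reduce pain, fever, or inflammation.")` would return a `(label, probability)` pair. No expected output is shown here because it depends on the downloaded weights.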