shahrukhx01's picture
Update README.md
3d43a5f
metadata
language: en
tags:
  - buy-intent
  - sell-intent
  - consumer-intent
widget:
  - text: >-
      Coronavirus disease (COVID-19) is an infectious disease caused by the
      SARS-CoV-2 virus.

Chemical domain language model finetuned on 13K Chemical, and 14K Pharma Wikipedia articles broken into paragraphs.

Chemical vs Pharmaceutical Domain Document Classifier

Train Loss Validation Acc. Test Acc.
0.17 0.928 0.927

Dataset

Dataset with splits can be found @ https://www.kaggle.com/shahrukhkhan/pharma-vs-chemicals-domain-classification

Label Mappings

LABEL_0 => "CHEMICAL"
LABEL_1 => "PHARMACEUTICAL"

Usage in Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
  
tokenizer = AutoTokenizer.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")

model = AutoModelForSequenceClassification.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")