yirmibesogluz/t2t-ner-ade-balanced

t2t-ner-ade-balanced

t2t-ner-ade-balanced is a text-to-text (t2t) model for adverse drug event (ADE) extraction, framed as named entity recognition (NER). It was trained on English tweets reporting adverse drug events, with the data balanced by over- and undersampling. The model was developed as part of the BOUN-TABI system for the Social Media Mining for Health (SMM4H) 2022 shared task. The system description paper has been accepted for publication in the Proceedings of the Seventh Social Media Mining for Health (#SMM4H) Workshop and Shared Task and will be available soon. The source code has been released on GitHub at https://github.com/gokceuludogan/boun-tabi-smm4h22.

The model builds on T5 and its text-to-text formulation. Each input consists of the task prefix "ner ade:" followed by a sentence or tweet. The model returns either the extracted adverse event span or the string "none".
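The input and output conventions described above can be sketched with two small helpers. This is only an illustration: the names `format_input` and `parse_output` are hypothetical and not part of the released code, and treating "none" case-insensitively and stripping whitespace are assumptions, not documented behavior.

```python
from typing import Optional

# Task prefix stated in this model card.
TASK_PREFIX = "ner ade:"


def format_input(text: str) -> str:
    """Prepend the task prefix to a sentence or tweet (hypothetical helper)."""
    return f"{TASK_PREFIX} {text.strip()}"


def parse_output(generated: str) -> Optional[str]:
    """Map the generated string to an ADE span, or None for "none".

    Assumption: surrounding whitespace and letter case of "none" are ignored.
    """
    cleaned = generated.strip()
    if cleaned.lower() == "none":
        return None
    return cleaned
```

With these helpers, `format_input("i'm so irritable when my vyvanse wears off")` yields the prefixed string the model expects, and `parse_output` turns the generated text into either a span or `None`.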

Requirements

sentencepiece
transformers

Usage

from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the T5-based seq2seq model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("yirmibesogluz/t2t-ner-ade-balanced")
model = AutoModelForSeq2SeqLM.from_pretrained("yirmibesogluz/t2t-ner-ade-balanced")

# Wrap both in a text-to-text generation pipeline
predictor = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

# The input must start with the task prefix "ner ade:"
predictor("ner ade: i'm so irritable when my vyvanse wears off")
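To process several tweets at once, a thin wrapper around the pipeline may be convenient. The sketch below is an assumption-laden illustration: `extract_ades` is a hypothetical helper, not part of the released code, and it only assumes that the predictor accepts a list of strings and returns one result per input, each carrying a `generated_text` field, as the `text2text-generation` pipeline does. Treating "none" case-insensitively is also an assumption.

```python
from typing import Callable, List, Optional

# Task prefix stated in this model card.
TASK_PREFIX = "ner ade:"


def extract_ades(predictor: Callable, tweets: List[str]) -> List[Optional[str]]:
    """Run the predictor on prefixed tweets and return one span (or None) per tweet.

    `predictor` is expected to behave like a text2text-generation pipeline.
    """
    inputs = [f"{TASK_PREFIX} {tweet.strip()}" for tweet in tweets]
    outputs = predictor(inputs)

    spans: List[Optional[str]] = []
    for out in outputs:
        # Some configurations wrap each result in a list; take the first candidate.
        if isinstance(out, list):
            out = out[0]
        text = out["generated_text"].strip()
        # Assumption: "none" (any letter case) means no adverse event was extracted.
        spans.append(None if text.lower() == "none" else text)
    return spans
```

Called as `extract_ades(predictor, ["i'm so irritable when my vyvanse wears off"])`, it returns a list with the extracted span or `None` for each tweet, in input order.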

Citation

@inproceedings{uludogan-gokce-yirmibesoglu-zeynep-2022-boun-tabi-smm4h22,
    title = "{BOUN}-{TABI}@{SMM4H}'22: Text-to-{T}ext {A}dverse {D}rug {E}vent {E}xtraction with {D}ata {B}alancing and {P}rompting",
    author = "Uludo{\u{g}}an, G{\"{o}}k{\c{c}}e  and Yirmibe{\c{s}}o{\u{g}}lu, Zeynep",
    booktitle = "Proceedings of the Seventh Social Media Mining for Health ({\#}SMM4H) Workshop and Shared Task",
    year = "2022",
}