This model is a fine-tuned version of dmis-lab/biobert-base-cased-v1.2 for binary sentence classification: does a given sentence describe an adverse drug effect (ADE)?
It was fine-tuned on the ADE Corpus V2 dataset and compared against a classical TF-IDF + Logistic Regression baseline as part of a broader project benchmarking classical vs. transformer approaches on imbalanced biomedical text.
Project Repo: GitHub
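For readers unfamiliar with the classical baseline mentioned above, here is a minimal sketch of a TF-IDF + Logistic Regression pipeline in scikit-learn. It is not the project's actual baseline code: the n-gram range, `class_weight="balanced"` (one common way to handle class imbalance), and `max_iter` are assumptions, and the four training sentences are made-up toy examples rather than rows from ADE Corpus V2.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up sentences for illustration only (not taken from ADE Corpus V2).
texts = [
    "The patient developed severe nausea after taking the drug.",
    "The drug was administered twice daily for two weeks.",
    "Treatment with the antibiotic caused an acute skin rash.",
    "Blood pressure was measured at every follow-up visit.",
]
labels = [1, 0, 1, 0]  # 1 = ADE, 0 = not ADE

# Hyperparameters below are assumptions, not the project's settings.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
baseline.fit(texts, labels)

preds = baseline.predict(["The patient developed a rash after starting the drug."])
print(preds)
```

On real data the pipeline would be fit on the training split of the corpus and scored on a held-out split.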
| Model | Weighted F1 | ADE Class F1 | Accuracy | Total Errors |
|---|---|---|---|---|
| TF-IDF + Logistic Regression | 0.90 | 0.84 | 90% | 349 |
| BioBERT (this model) | 0.96 | 0.93 | 96% | 145 |
BioBERT reduced misclassifications by 58% (349 → 145 errors) compared to the classical baseline.
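The 58% figure follows directly from the error counts in the table. As a quick check, the helper below (a hypothetical name, used here for illustration only) computes the relative reduction:

```python
def relative_error_reduction(baseline_errors: int, model_errors: int) -> float:
    """Fraction of the baseline's errors that the new model eliminates."""
    return (baseline_errors - model_errors) / baseline_errors

# Error counts taken from the results table above.
reduction = relative_error_reduction(349, 145)
print(f"{reduction:.1%}")  # 58.5%
```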
Base model: dmis-lab/biobert-base-cased-v1.2 (110M parameters)

| Epoch | Train Loss | Val F1 | Val Accuracy |
|---|---|---|---|
| 1 | 0.175 | 0.943 | 0.943 |
| 2 | 0.114 | 0.952 | 0.952 |
| 3 | 0.043 | 0.952 | 0.952 |
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hub.
model_id = "scheun/biobert-ade-classifier"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Patient developed severe nausea after taking the medication.", return_tensors="pt")
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
prediction = outputs.logits.argmax(-1).item()
print(prediction)  # 0 = not ADE, 1 = ADE
```
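The snippet above returns only the predicted class index. If a confidence score is also useful, the raw logits can be passed through a softmax. The sketch below does this in plain Python; `logits_to_prediction` is a hypothetical helper, the example logits are made up, and the label mapping is the one stated in the usage snippet (0 = not ADE, 1 = ADE).

```python
import math

# Label mapping as stated in the usage snippet: 0 = not ADE, 1 = ADE.
LABELS = {0: "not ADE", 1: "ADE"}

def logits_to_prediction(logits):
    """Turn a list of raw class logits into (label, probability) via softmax."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Made-up logits for illustration; real values come from the model.
label, prob = logits_to_prediction([-1.2, 2.3])
print(f"{label} ({prob:.2f})")
```

With the model loaded as above, the input would be `outputs.logits[0].tolist()`. Note that softmax scores from a fine-tuned classifier are not necessarily well calibrated, so they are best read as a relative indication of confidence.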