medclassify-ai

DistilBERT fine-tuned on the PubMed 200k RCT dataset for structural classification of medical abstract sentences.

Given a sentence from a clinical abstract, the model predicts one of five structural roles: BACKGROUND, OBJECTIVE, METHODS, RESULTS, or CONCLUSIONS.

Model details


Base model	`distilbert-base-uncased`
Task	5-class text classification
Parameters	67M
Max input length	128 tokens
Dataset	PubMed 200k RCT
Training framework	HuggingFace Transformers `Trainer` API
Author	Mohammed Suhail Ahmed Khan — GitHub
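
For readers who prefer not to use the `pipeline` helper, the sketch below shows one way the details in the table might translate into a manual forward pass. It assumes that the repository also ships the tokenizer files, that PyTorch is installed, and that the 128-token limit is applied as `max_length` at inference time. `MODEL_ID`, `MAX_LENGTH`, `softmax`, and `predict_probabilities` are illustrative names, not part of the model's API.

```python
import math

MODEL_ID = "SuhailKhan06/medclassify-ai"
MAX_LENGTH = 128  # "Max input length" from the table above


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probabilities(text):
    """Return one probability per label ID (0-4) for a single sentence."""
    # Imported lazily so softmax() stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(
        text, truncation=True, max_length=MAX_LENGTH, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return softmax(logits)
```

The position of each probability in the returned list corresponds to the label ID of that class.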

Label mapping

ID	Label
0	BACKGROUND
1	CONCLUSIONS
2	METHODS
3	OBJECTIVE
4	RESULTS
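
Whether the hosted configuration embeds these names is not stated here, so a prediction may come back either as a role name or as a generic `LABEL_<id>` string. A minimal sketch of the mapping in code, with `readable_label` as a hypothetical helper that handles both cases:

```python
# Label mapping copied from the table above.
ID2LABEL = {
    0: "BACKGROUND",
    1: "CONCLUSIONS",
    2: "METHODS",
    3: "OBJECTIVE",
    4: "RESULTS",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def readable_label(raw_label: str) -> str:
    """Map "LABEL_<id>" to its role name; return any other string unchanged."""
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit():
        # Unknown IDs fall back to the raw string instead of raising.
        return ID2LABEL.get(int(suffix), raw_label)
    return raw_label
```

Note that the IDs follow alphabetical order of the label names, not the order in which the sections appear in an abstract.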

Quick start

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="SuhailKhan06/medclassify-ai"
)

sentences = [
    "Patients were randomly assigned to two treatment groups.",
    "The aim of this study was to evaluate the safety of drug X.",
    "These findings suggest the intervention is effective.",
    "Cardiovascular disease is a leading cause of death.",
    "The treatment significantly improved 30-day survival rates."
]

for s in sentences:
    print(classifier(s))
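
Each prediction is a dict with a `label` and a `score`. When a whole abstract is classified at once, it can be handy to collect the sentences under their predicted roles. The sketch below is one possible way to do that; `group_by_role` and `classify_abstract` are hypothetical helper names, and the code assumes the pipeline is called with a list of strings and returns one dict per sentence.

```python
from collections import defaultdict


def group_by_role(sentences, predictions):
    """Collect (sentence, score) pairs under each predicted label."""
    grouped = defaultdict(list)
    for sentence, prediction in zip(sentences, predictions):
        grouped[prediction["label"]].append((sentence, prediction["score"]))
    return dict(grouped)


def classify_abstract(classifier, sentences):
    """Classify all sentences in one call, then group them by role.

    `classifier` is the pipeline object from the quick start.
    """
    predictions = classifier(sentences)
    return group_by_role(sentences, predictions)
```

Calling `classify_abstract(classifier, sentences)` with the objects from the quick start would return a dict keyed by predicted label.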

Training data

PubMed 200k RCT: sentences extracted from PubMed abstracts of randomized controlled trials, each labeled with its structural role.

Split	Sentences
Train	176,642
Validation	29,672
Test	29,578

Baseline comparison

Before fine-tuning, a TF-IDF (50k features, unigram + bigram) + Logistic Regression baseline was trained and evaluated on the same splits.

Model	Test accuracy	Weighted F1
TF-IDF + Logistic Regression	77.55%	77.10%
DistilBERT (this model)	pending (checkpoint saved)	pending

DistilBERT training was interrupted before full convergence. The saved checkpoint is available and full evaluation metrics will be added once training completes.
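
For reference, a baseline of the kind described above could be set up roughly as follows with scikit-learn. This is a sketch under stated assumptions, not the original training script: only the 50k feature cap and the unigram + bigram range come from the description, while `max_iter=1000` and all other defaults are guesses. `build_baseline` and `evaluate` are illustrative names.

```python
# Baseline settings taken from the description above.
MAX_FEATURES = 50_000
NGRAM_RANGE = (1, 2)  # unigrams and bigrams


def build_baseline():
    """Return an unfitted TF-IDF + logistic regression pipeline."""
    # Imported here so the constants above can be read without scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    return make_pipeline(
        TfidfVectorizer(max_features=MAX_FEATURES, ngram_range=NGRAM_RANGE),
        # max_iter is an assumption; the original solver settings are not stated.
        LogisticRegression(max_iter=1000),
    )


def evaluate(model, texts, labels):
    """Return (accuracy, weighted F1) for a fitted model on held-out data."""
    from sklearn.metrics import accuracy_score, f1_score

    predicted = model.predict(texts)
    accuracy = accuracy_score(labels, predicted)
    weighted_f1 = f1_score(labels, predicted, average="weighted")
    return accuracy, weighted_f1
```

A typical run would fit `build_baseline()` on the training sentences and labels, then pass the fitted model and the test split to `evaluate`. Weighted F1 averages the per-class F1 scores in proportion to each class's share of the test set.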

Limitations

Trained on PubMed abstracts from randomized controlled trials. Performance on other abstract types (observational studies, case reports, reviews) is untested and likely lower.
English-only.
Intended for sentences of up to 128 tokens. Longer inputs are truncated, so any text past the limit is ignored.
BACKGROUND and OBJECTIVE are likely to be the most confused classes because they are structurally and lexically similar. The TF-IDF baseline shows this clearly, with F1 scores of 0.56 and 0.55 respectively; per-class results for DistilBERT are not yet available.
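
To make the length limit explicit at inference time, truncation settings can be passed along with the input, and the token count can be checked first so that truncated sentences are flagged. The sketch below assumes `classifier` is the pipeline object from the quick start and that the 128-token limit includes the special tokens added by the tokenizer; `classify_with_truncation` is a hypothetical helper.

```python
MAX_TOKENS = 128  # limit stated in the model details


def classify_with_truncation(classifier, text):
    """Classify one sentence and report whether it exceeded the token limit."""
    # Count tokens without truncation, special tokens included.
    n_tokens = len(classifier.tokenizer(text)["input_ids"])

    # Extra keyword arguments are forwarded to the tokenizer by the pipeline.
    prediction = classifier(text, truncation=True, max_length=MAX_TOKENS)[0]

    return {
        "label": prediction["label"],
        "score": prediction["score"],
        "n_tokens": n_tokens,
        "truncated": n_tokens > MAX_TOKENS,
    }
```

Predictions with `truncated` set to `True` were made on an incomplete sentence and may deserve a second look.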

Citation

If you use this model or the PubMed 200k RCT dataset, please cite the original dataset paper:

Dernoncourt, F., & Lee, J. Y. (2017).
PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts.
arXiv:1710.06071
