Instructions to use AmareshHebbar/medical-ai-model-suite with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Local Apps Settings
- Unsloth Studio
How to use AmareshHebbar/medical-ai-model-suite with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AmareshHebbar/medical-ai-model-suite to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AmareshHebbar/medical-ai-model-suite to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for AmareshHebbar/medical-ai-model-suite to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="AmareshHebbar/medical-ai-model-suite", max_seq_length=2048, )
Medical AI Fine-tuned Model Suite
A collection of 16 Qwen2.5 models fine-tuned with QLoRA, one per medical/healthcare task β ICD-10 coding, billing, clinical documentation, India's PM-JAY scheme, and more. Built by Amaresh Hebbar.
Collection page: [link to your HF collection here]
Every model uses the same approach: a real public data source (CMS, NHA India, peer-reviewed biomedical corpora β no synthetic or LLM-generated training data), QLoRA fine-tuning on Qwen2.5, and a narrow, well-defined task with a strict system prompt. The goal is a set of small, deployable specialists rather than one large general-purpose medical model.
Why specialist models instead of one big model
A single general medical LLM has to be evaluated, monitored, and trusted across every task it might be asked to do. These models are scoped to one job each β ICD-10 coding, PM-JAY classification, radiology report coding β so each one is small enough to self-host cheaply, easy to evaluate against a clear ground truth, and safe to swap out independently if a better version ships later.
The 16 models
| Model | Task | Base | Rows |
|---|---|---|---|
| icd10-coder-qwen25-7b | ICD-10-CM medical coding | Qwen2.5-7B | 74,719 |
| snomed-mapper-qwen25-7b | Clinical terminology mapping | Qwen2.5-7B | 74,719 |
| clinical-summarizer-qwen25-7b | SOAP note summarization | Qwen2.5-7B | 30,000 |
| symptom-diagnoser-qwen25-7b | Symptom β differential diagnosis | Qwen2.5-7B | 119,467 |
| discharge-qa-qwen25-3b | Discharge summary Q&A | Qwen2.5-3B | 30,000 |
| radiology-coder-qwen25-3b | Radiology report coding | Qwen2.5-3B | 25,090 |
| medical-ner-qwen25-3b | Clinical named entity recognition | Qwen2.5-3B | 16,671 |
| hindi-medical-qwen25-3b | Hindi medical reasoning | Qwen2.5-3B | 19,704 |
| cpt-coder-qwen25-3b | CPT/HCPCS procedure coding | Qwen2.5-3B | 17,029 |
| medical-billing-qwen25-3b | Medical billing assistant | Qwen2.5-3B | 17,029 |
| pmjay-classifier-qwen25-3b | India PM-JAY package classification | Qwen2.5-3B | 11,140 |
| pharmacy-ner-qwen25-1b | Drug entity extraction | Qwen2.5-1.5B | 3,500 |
| ayurveda-icd-qwen25-1b | Ayurveda to ICD-10 bridge | Qwen2.5-1.5B | 3,002 |
| insurance-classifier-qwen25-1b | Stark Law DHS compliance | Qwen2.5-1.5B | 1,601 |
| icd10-to-drg-qwen25-1b | ICD-10 β MS-DRG reimbursement | Qwen2.5-1.5B | 5,385 |
| loinc-coder-qwen25-1b | Lab test CPT coding | Qwen2.5-1.5B | 2,179 |
Training method
All 16 models share the same recipe:
| Fine-tuning method | QLoRA, 4-bit NF4 quantization, rank 16, alpha 32 |
| Training framework | Unsloth + TRL SFTTrainer |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Hardware | Single NVIDIA A40 (48GB) |
| Optimizer | paged_adamw_8bit, cosine LR schedule |
| Experiment tracking | Weights & Biases |
Model size is matched to dataset size β large datasets (>50k rows) get Qwen2.5-7B, mid-size (10kβ50k) get Qwen2.5-3B, and smaller specialist datasets (<10k) get Qwen2.5-1.5B. This keeps inference cost proportional to task complexity instead of running every task through the same large model.
Data sources
Every dataset behind these models is built from real authoritative public data β CMS (ICD-10-CM, MS-DRG, Physician Fee Schedule, HCPCS), NHA India (PM-JAY HBP 2022, PM RAHAT), and peer-reviewed biomedical corpora (chat_doctor, augmented-clinical-notes, drugprot). No synthetic or LLM-generated training data. Full extraction pipelines and column-level provenance are documented on each dataset card.
How to use any model in this suite
Each model is a LoRA adapter on top of its base Qwen2.5 model. Load with PEFT:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base_model = "unsloth/Qwen2.5-7B-Instruct" # match the base size for the model you're using
adapter = "AmareshHebbar/icd10-coder-qwen25-7b"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
See each model's individual card for its exact system prompt, example input/output, and recommended serving setup (Transformers, Unsloth, or vLLM).
Limitations
These are narrow specialist models, not general medical assistants. Each model only performs the single task it was trained on β using it outside that task will produce unreliable output. None of these models are a substitute for a licensed medical or billing professional; all output should be reviewed by a qualified person before being used in a clinical, billing, or compliance decision.
Citation
@misc{medicalai2026,
author = {Hebbar, Amaresh},
title = {Medical AI Fine-tuning Suite},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/AmareshHebbar}
}
Contact
- GitHub: amareshhebbar
- LinkedIn: gvamaresh
- HuggingFace: AmareshHebbar