---
language: en
license: mit
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
- medical
- india
- healthcare
- llama
- text-generation
- indian-healthcare
- mental-health
- merged
pipeline_tag: text-generation
---
# MedQuery-India-v1 (Merged)
**No Meta approval or Hugging Face login required to use this model!**
This is the **merged, standalone version** of the original `MedQuery-India-v1` QLoRA adapter. It is a fine-tuned version of **Llama-3.2-1B-Instruct** for Indian medical question answering, covering AIIMS/NEET clinical protocols, Indian drug brands (Crocin, Dolo, Combiflam), regional diseases (dengue, typhoid, TB/DOTS, chikungunya), national health programs (NTEP, NVBDCP, RSSDI, IAP), and mental health support with cultural sensitivity.
> *Why this exists:* Most open-source medical AI models are trained on PubMed and USMLE data and are optimized for Western clinical contexts. Indian patients ask about Dolo 650, not acetaminophen. They ask about DOTS, not generic TB regimens. This model is trained to understand that gap.
---
## ⚡ Quick Start: One Cell, Any Notebook
Open in **Google Colab** (Runtime → Change runtime type → **T4 GPU**) or any Kaggle notebook and paste this single cell.
Since this is a merged model, it loads natively with standard `transformers`. Just change `QUESTION` to anything you want to ask!
```python
# ============================================================
# MedQuery-India-v1 (Merged): Direct Inference
# Works on Google Colab / Kaggle / any notebook with a T4 GPU
# ============================================================

# --- Step 1: Install basic dependencies ---
import subprocess
import sys

# Use the notebook's own interpreter so packages land in the active environment
subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "transformers", "torch", "accelerate"],
    check=True,
)

# --- Step 2: Load the model directly ---
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "kanha98/medquery-india-v1-merged"

print("Downloading and loading the model (approx 2.5 GB)...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token

# Loading in float16: easily fits in a free Colab T4 (15GB VRAM)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
print("✅ Model loaded successfully!")

# --- Step 3: Ask your question (change the line below) ---
QUESTION = "What are the warning signs of severe dengue?"
# -------------------------------------------------------

SYSTEM = (
    "You are MedQuery-India, a medical AI assistant trained on Indian healthcare context "
    "including AIIMS/NEET clinical protocols, Indian drug brands, regional diseases, "
    "Indian procedural guidelines (NTEP, NVBDCP, RSSDI, IAP), and mental health support. "
    "Answer accurately, safely, and with cultural sensitivity."
)

prompt = (
    f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n{SYSTEM}<|eot_id|>"
    f"<|start_header_id|>user<|end_header_id|>\n{QUESTION}<|eot_id|>"
    f"<|start_header_id|>assistant<|end_header_id|>\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=250,
        temperature=0.3,
        do_sample=True,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, so the answer is not cut short
# if the word "assistant" happens to appear inside it
prompt_length = inputs["input_ids"].shape[1]
answer = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)

print("\n🩺 --- MedQuery-India Response ---")
print(answer.strip())
```
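
To ask several questions without editing the cell each time, the prompt layout from the Quick Start can be wrapped in a small helper. This is a minimal sketch: `build_prompt` and the shortened system message are hypothetical names introduced here for illustration, and the header layout simply mirrors the cell above.

```python
def build_prompt(system: str, question: str) -> str:
    """Assemble a single-turn prompt in the same layout as the Quick Start cell."""
    return (
        f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n{question}<|eot_id|>"
        f"<|start_header_id|>assistant<|end_header_id|>\n"
    )


# Placeholder system message; in practice reuse the full SYSTEM string from the Quick Start.
SHORT_SYSTEM = "You are MedQuery-India, a medical AI assistant."

questions = [
    "What are the warning signs of severe dengue?",
    "What is the DOTS program for tuberculosis in India?",
]

# One prompt per question, ready to pass to the tokenizer.
prompts = [build_prompt(SHORT_SYSTEM, q) for q in questions]
print(prompts[0])
```

Each string in `prompts` can then be tokenized and passed to `model.generate` exactly as in the Quick Start.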
## Model Details
| Property | Value |
|---|---|
| Model Format | Merged Standalone (FP16/FP32) |
| Base model | meta-llama/Llama-3.2-1B-Instruct |
| Parameters | 1,235,814,400 (1.24B) |
| Original Fine-tuning | QLoRA (4-bit NF4 quantization) |
| Original LoRA rank | r = 64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (7 modules) |
| Training hardware | Tesla T4 (Kaggle, 14.5GB VRAM) |
| Final training loss | 1.5468 |
(Note: The adapter weights from the original fine-tuning have been permanently merged into the base model's weights for ease of use.)
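
For readers unfamiliar with what "merging" an adapter means, the sketch below shows the standard LoRA update on a toy matrix: the low-rank product `B @ A`, scaled by `alpha / rank`, is added into the frozen weight once, so no adapter is needed at inference time. It is an illustration only, not the script used for this model. The matrix sizes and the `ALPHA` value are assumptions (the card does not state `lora_alpha`), and in a QLoRA setup the 4-bit base weights would first be dequantized to a higher precision before the addition.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy sizes: a 3x3 frozen weight with a rank-2 adapter (the real adapter used r = 64).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]    # shape (rank, in_features)
B = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]  # shape (out_features, rank)
ALPHA = 4.0  # hypothetical scaling value, not taken from the model card

merged = merge_lora(W, A, B, alpha=ALPHA, rank=2)
for row in merged:
    print(row)
```

Libraries such as PEFT expose this operation as `merge_and_unload()`; after merging, the model is a plain `transformers` checkpoint, which is why the Quick Start needs no adapter-loading code.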
## Why These Architecture Decisions
### Why Llama-3.2-1B-Instruct?
- Tokenizer efficiency on medical vocabulary. Llama-3's 128k BPE vocabulary encodes medical terms like "acetaminophen", "thrombocytopenia", and "leptospirosis" as 1 to 2 tokens. GPT-2's 50k vocabulary splits the same terms into 4 to 6 tokens. Fewer tokens per medical term means the model sees more semantic context within the 512-token window.
- Grouped Query Attention (GQA). Llama-3.2-1B uses GQA with 32 query heads sharing 8 key-value heads (a 4:1 ratio). This shrinks the KV cache to a quarter of what standard multi-head attention would need, enabling longer context at the same VRAM cost.
- The 1B sweet spot. Larger than SmolLM2-360M (better reasoning, longer coherent answers), smaller than 3B+ (fits T4 easily). Every architectural decision in this model is explainable.
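
The memory claims above can be checked with back-of-the-envelope arithmetic. The sketch below assumes the published Llama-3.2-1B configuration (16 layers, 32 query heads, 8 key-value heads, head dimension 64) and FP16 storage; these values are assumptions taken from the base model's config rather than from this card, and the function names are illustrative.

```python
BYTES_FP16 = 2


def weights_gb(num_params, bytes_per_param=BYTES_FP16):
    """Size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=BYTES_FP16):
    """Bytes of KV cache one token occupies: a key and a value vector per layer and KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


NUM_PARAMS = 1_235_814_400          # from the Model Details table
NUM_LAYERS, NUM_Q_HEADS, NUM_KV_HEADS, HEAD_DIM = 16, 32, 8, 64  # assumed base config
CONTEXT = 512                       # token window mentioned above

gqa = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM)
mha = kv_cache_bytes_per_token(NUM_LAYERS, NUM_Q_HEADS, HEAD_DIM)  # if every query head had its own KV

print(f"FP16 weights:        {weights_gb(NUM_PARAMS):.2f} GB")
print(f"KV cache per token:  {gqa} bytes with GQA vs {mha} bytes with full MHA")
print(f"KV cache at {CONTEXT} tokens: {gqa * CONTEXT / 2**20:.0f} MiB vs {mha * CONTEXT / 2**20:.0f} MiB")
```

Under these assumptions the weights come to roughly 2.5 GB, and the KV cache for a 512-token sequence is 16 MiB instead of 64 MiB, both far below a T4's memory.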
## Dataset Overview
Total training samples: 6,569 | Val: 780 | Test: 780
| Source | Samples | % | Why included |
|---|---|---|---|
| MedMCQA (Indian) | 3,613 | 55.0% | AIIMS/NEET exam questions: directly Indian clinical context |
| ChatDoctor | 1,588 | 24.2% | Real patient-doctor conversations: teaches conversational tone |
| MedQuAD | 802 | 12.2% | NIH structured QA: adds reliable factual grounding |
| PubMedQA | 237 | 3.6% | Expert-annotated research QA: adds clinical reasoning |
| Synthetic Indian (general) | 144 | 2.2% | Indian drug names, regional disease context |
| Synthetic Indian (edge cases) | 135 | 2.1% | Drug safety edge cases specific to India |
| Synthetic Mental Health | 50 | 0.8% | NEET stress, exam pressure, Indian mental health context |
Why MedMCQA at 55%? It is the only large-scale dataset sourced directly from AIIMS and NEET PG entrance exams, containing real Indian clinical questions with expert explanations.

Why synthetic data? Public medical datasets have near-zero coverage of Indian drug brands, DOTS protocol specifics, or mental health in the Indian exam context. Hand-crafted synthetic samples fill this gap directly.
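
The percentages in the table follow directly from the sample counts. The short sketch below recomputes them and also reports the combined share of the three synthetic sources; the counts are copied from the table, while the variable names are illustrative.

```python
# Training-sample counts per source, as listed in the table above.
sources = {
    "MedMCQA (Indian)": 3613,
    "ChatDoctor": 1588,
    "MedQuAD": 802,
    "PubMedQA": 237,
    "Synthetic Indian (general)": 144,
    "Synthetic Indian (edge cases)": 135,
    "Synthetic Mental Health": 50,
}

total = sum(sources.values())
shares = {name: round(100 * count / total, 1) for name, count in sources.items()}

# Combined weight of all hand-crafted synthetic samples.
synthetic = sum(count for name, count in sources.items() if name.startswith("Synthetic"))
synthetic_share = round(100 * synthetic / total, 1)

for name, pct in shares.items():
    print(f"{name:32s} {sources[name]:>5d}  {pct:>5.1f}%")
print(f"{'Total':32s} {total:>5d}")
print(f"Synthetic sources combined: {synthetic} samples ({synthetic_share}%)")
```

The synthetic sources together make up about 5% of the training set, so the Indian-specific coverage they add comes from a small, targeted slice rather than from bulk data.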
## Sample Outputs
**Q: What are the warning signs of severe dengue?**

> Warning signs of severe dengue require immediate hospitalization. These include: extreme lethargy or inability to talk; persistent vomiting; severe abdominal pain; bleeding from the nose, gums, or mouth; very high fever above 40°C; black spots on the skin (petechiae); swelling in the face, neck, or groin; extremely low blood pressure. If you have any one of these symptoms, call your local hospital immediately.

**Q: Patient took Combiflam for dengue fever. Is this dangerous?**

> Combiflam contains ibuprofen (an NSAID) and paracetamol. Taking NSAIDs during dengue is dangerous because they increase the risk of bleeding by inhibiting platelet function, which is already compromised in dengue. For dengue fever, only paracetamol (like Crocin or Dolo 650) is safe. NSAIDs including Combiflam, Brufen, and Aspirin must be strictly avoided.

**Q: What is the DOTS program for tuberculosis in India?**

> DOTS (Directly Observed Treatment Short-course) is India's national TB control program under the National Tuberculosis Elimination Programme (NTEP), launched in 1992. Core components: sputum microscopy for diagnosis, a standardized short-course drug regimen, direct supervision of treatment, case detection at facility and community levels, and free treatment under the PM-JAY scheme. DOTS is implemented by state and district health departments under the Central TB Division, Ministry of Health and Family Welfare.
## Limitations
- Not a substitute for medical advice. This model is for research and educational purposes. Do not use for clinical diagnosis or treatment decisions.
- English only (v1). Hindi, Marathi, and Bengali support are planned for v2.
- 1B parameter ceiling. Complex multi-step clinical reasoning may produce errors. Hallucination risk exists on rare diseases.
- Training data cutoff. Drug approvals, protocol updates, or guideline changes after the training data may not be reflected.
## Citation
If you use this model in research, please cite:
@misc{gupta2025medqueryindia,
  author = {Kanhayya Gupta},
  title = {MedQuery-India-v1: Fine-Tuning of Llama-3.2-1B for Indian Medical QA},
  year = {2026},
  url = {https://huggingface.co/kanha98/medquery-india-v1-merged}
}
## Author
Kanhayya Gupta
- GitHub: kanhaiya-98
- LinkedIn: kanhayya-gupta