MadriMed-VL-2B-enc
A 2B-parameter medical vision-language model, trained for medical image understanding, radiology report assistance, clinical visual question answering, and medical text reasoning.
This release introduces Dynamic LoRA Scaling, a lightweight calibration technique that reduces adapter dominance while preserving the medical knowledge learned during fine-tuning. The objective is to improve reliability, reduce hallucinations, and mitigate common diagnostic confusion patterns observed in earlier releases of madrisight/MadriMed-VL-2B.
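The exact calibration rule behind Dynamic LoRA Scaling is not spelled out in this card. As a rough mental model only, "reducing adapter dominance" can be pictured as damping the low-rank update before it is added to the frozen base weight. The sketch below assumes the standard LoRA merge `W = W0 + (alpha / r) * B @ A` and adds a hypothetical scalar `gate` in [0, 1]. The function names and the `gate` parameter are illustrative assumptions, not the released implementation.

```python
# Illustrative only: `gate`, `matmul`, and `merge_lora` are hypothetical names,
# and a single scalar gate is an assumption, not the published procedure.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w0, lora_b, lora_a, alpha, rank, gate=1.0):
    """Return w0 + gate * (alpha / rank) * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)   # low-rank update, same shape as w0
    scale = gate * alpha / rank      # gate < 1 shrinks the adapter's contribution
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w0, delta)
    ]


w0 = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (2x2)
lora_b = [[1.0], [2.0]]         # adapter matrix B (2x1)
lora_a = [[0.5, 0.5]]           # adapter matrix A (1x2)

full = merge_lora(w0, lora_b, lora_a, alpha=2, rank=1, gate=1.0)
damped = merge_lora(w0, lora_b, lora_a, alpha=2, rank=1, gate=0.5)
print(full)    # adapter applied at full strength
print(damped)  # adapter contribution halved
```

With `gate=0.0` the merged weight equals the base weight, and with `gate=1.0` it equals the ordinary LoRA merge; values in between trade adapter influence against base-model behavior.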
Important: Use bfloat16 for Inference
This model was trained and calibrated in bfloat16 (BF16) precision on PyTorch MPS. For best performance and reproducibility, run inference in bfloat16 as well.
🚀 Quick Start
Installation
pip install transformers torch Pillow accelerate
Example image
Run the model directly
import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration
from PIL import Image

BASE_MODEL_ID = "madrisight/MadriMed-VL-2B-enc"

# Load the weights in bfloat16, the precision the model was trained and calibrated in
model = Qwen3VLForConditionalGeneration.from_pretrained(
    BASE_MODEL_ID,
    dtype=torch.bfloat16,
    device_map="cuda",
    trust_remote_code=True,
)
model.eval()

processor = AutoProcessor.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)

def load_direct_image(path: str) -> Image.Image:
    """Open an image file and return it as an RGB PIL image."""
    with Image.open(path) as raw:
        img = raw.convert("RGB")
    return img

# Formulate the query
prompt = """Choose the correct option for the question
Instructions:
- Analyze ONLY the provided image.
- Do NOT use external medical knowledge.
- Briefly explain the visual evidence relevant to the question.
Question:
Examine the mammogram image shown above. Which of the following findings is most evident?
Options
A. Well-circumscribed round mass with benign features
B. Clustered microcalcifications within an area of irregular density
C. Fat-containing lesion consistent with lipoma
D. Diffuse bilateral breast edema
"""
img = load_direct_image("/content/MM-1-a.png")
messages = [
    {
        "role": "system",
        "content": "You are an expert medical AI. You must deeply analyze the question and provide the final answer."
    },
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": prompt}
        ]
    }
]
stop_token_id = processor.tokenizer.convert_tokens_to_ids("<|im_end|>")

with torch.inference_mode():
    text = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = processor(
        text=text,
        images=[img],
        return_tensors="pt",
    ).to(model.device)
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=1024,  # upper bound on reasoning plus answer length
        do_sample=False,  # greedy decoding for deterministic output
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=stop_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt
output_text = processor.batch_decode(
    generated_ids[:, inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)[0]
print(output_text.strip())
So, let's analyze the mammogram. The image shows a breast with some irregularities.
Looking at the options: A is about a well-circumscribed round mass, but the image doesn't show a clear mass.
B mentions clustered microcalcifications in an irregular density area. In mammograms, microcalcifications are often seen as small white spots, and irregular density might be a pattern.
C is a fat-containing lesion like a lipoma, but the image doesn't show fat density.
D is bilateral breast edema, which isn't visible here.
So B seems to fit because microcalcifications are a key finding in mammograms, especially when clustered.
</think>
B. Clustered microcalcifications within an area of irregular density
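As the sample output shows, the decoded text contains the reasoning trace, a closing `</think>` tag, and then the final answer. A minimal sketch for separating the two and pulling out the chosen option letter follows. The helper names `split_reasoning` and `extract_option` are hypothetical and not part of the model or of transformers, and the sketch assumes the answer starts with a letter from A to D followed by `.`, `)`, or `:`.

```python
import re


def split_reasoning(output_text: str):
    """Split decoded output into (reasoning, answer) around the closing think tag."""
    reasoning, sep, answer = output_text.partition("</think>")
    if not sep:  # no tag found: treat the whole text as the answer
        return "", output_text.strip()
    return reasoning.strip(), answer.strip()


def extract_option(answer: str):
    """Return the leading option letter (A-D), or None if the answer has none."""
    match = re.match(r"\s*([A-D])[.):]", answer)
    return match.group(1) if match else None


sample = (
    "So B seems to fit ...\n"
    "</think>\n"
    "B. Clustered microcalcifications within an area of irregular density"
)
reasoning, answer = split_reasoning(sample)
print(extract_option(answer))
```

Keeping the reasoning separate makes it possible to log it for review while scoring only the option letter.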
🔬 Technical Details
Training Configuration
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-VL-2B-Thinking |
| Training data | medmax |
| Fine-tuning type | LoRA SFT + GRPO |
| Precision | bfloat16 |
| Hardware | Single Mac Mini (M4 Pro) with TrlMPS (https://github.com/krrish-v/trlmps) |
🙏 Acknowledgments
- Base model: Qwen3-VL-2B-Thinking by Alibaba Cloud
- Training data: mint-medmax/medmax_data
📄 Citation
If you use this model in research, please cite:
@software{madrimedvl2b,
title = {MadriMed-VL-2B: A Compact Multimodal Medical Vision-Language Model},
author = {Madrisight},
year = {2026},
url = {https://huggingface.co/madrisight/MadriMed-VL-2B}
}
Disclaimer: This model is provided for research and educational purposes only. It is not FDA-approved, not clinically validated, and must not be used for patient care without expert human oversight. The authors assume no liability for clinical use.
