EvidenceAIResearch/VReason-QwenVL
VReason-QwenVL model checkpoint for chest X-ray visual reasoning and report generation.
What is included
- Model weights (safetensors shards)
- Tokenizer and config files
- generation_config.json
- Built-in model.visual_reason(...) method, available via trust_remote_code=True
Installation
pip install -r requirements.txt
pip install cxas-vreason
If cxas-vreason is not yet available in your environment, install from this repo:
pip install "git+https://huggingface.co/EvidenceAIResearch/VReason-QwenVL#subdirectory=cxas_vreason"
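Before loading the model, it can help to confirm that the packages from the steps above are importable. The following is a minimal sketch using only the standard library. The import name `cxas_vreason` is an assumption inferred from the `subdirectory=cxas_vreason` install path, and `missing_modules` is a hypothetical helper, not part of this repo.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "cxas_vreason" is an assumed import name; adjust it if the package differs.
required = ["torch", "transformers", "PIL", "cxas_vreason"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported missing, rerun the matching pip command above.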
Quick start (Transformers)
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

repo_id = "EvidenceAIResearch/VReason-QwenVL"
processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,
).eval().cuda()

image = Image.open("frontal.jpg").convert("RGB")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {
                "type": "text",
                "text": "Based on the provided chest radiograph, explain your diagnosis procedure and write a report.",
            },
        ],
    }
]

prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[[image]], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
text = processor.batch_decode(output_ids, skip_special_tokens=False)[0]
print(text)
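For decoder-only models, `generate` returns the prompt token ids followed by the newly generated ids, so the decoded text above also contains the prompt. The sketch below shows one way to keep only the new tokens. `new_tokens_only` is a hypothetical helper, demonstrated here on a toy list rather than real model output.

```python
def new_tokens_only(output_ids, prompt_len):
    """Drop the first `prompt_len` token ids from every sequence in the batch."""
    return [seq[prompt_len:] for seq in output_ids]


# Toy stand-in for generate() output: 3 prompt ids followed by 2 new ids.
toy_output = [[101, 102, 103, 7, 8]]
print(new_tokens_only(toy_output, prompt_len=3))  # [[7, 8]]
```

With the tensors from the quick start, the equivalent slice is `output_ids[:, inputs["input_ids"].shape[-1]:]`, which can then be passed to `processor.batch_decode`.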
Visual reasoning method
After loading with trust_remote_code=True, the model exposes:
model.visual_reason(...)
This method can:
- Produce a reasoning.json with regions, sub-regions, and extracted reasoning text
- Generate ROI image artifacts for anatomy/pathology tool calls (blur/crop/blurcrop)
Example:
out = model.visual_reason(
    processor=processor,
    image="frontal.jpg",
    generate_roi=True,
    output_dir="./visual_reason_out",
    viz_mode="blurcrop",
)
print(out["report"])
Notes:
- trust_remote_code=True is required to enable model.visual_reason(...).
- Pass generate_roi=False when you only need structured text parsing.
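To inspect the structured output afterwards, a small loader like the one below may be enough. This is a sketch under assumptions: it supposes that reasoning.json is written directly inside `output_dir`, and it makes no claim about the file's schema beyond printing its top-level keys. `load_reasoning` is a hypothetical helper, not part of the model's API.

```python
import json
from pathlib import Path


def load_reasoning(output_dir, filename="reasoning.json"):
    """Return the parsed JSON from output_dir/filename, or None if it is absent."""
    path = Path(output_dir) / filename
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


# Assumed location: the output_dir used in the example above.
data = load_reasoning("./visual_reason_out")
if data is None:
    print("no reasoning.json found")
elif isinstance(data, dict):
    print("top-level keys:", sorted(data))
else:
    print("top-level type:", type(data).__name__)
```

If the file lives elsewhere in your run, pass the actual directory and filename instead.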
Limitations
- Intended for research use only.
- Not a medical device; outputs must not be used as sole clinical evidence.
- Performance can vary by data source and imaging protocol.
Citation
@unpublished{ye2026visual,
  title={Visual Reasoning Enables Evidence-Grounded Radiology {AI}},
  author={Ye, Shuchang and Robertson, Harry and Moghadam, Alireza
          and Shu, Matthew and Harb, Nathan and Li, Jennifer
          and Mogdil, Aadhar and Raythatha, Jineel and Shen, Yujia
          and Song, Xinyun and Tan, Xinchen and Fu, Xiaolong
          and Meng, Mingyuan and Bi, Lei and Yang, Jean YH
          and Kim, Jinman},
  year={2026},
}
Model tree for EvidenceAIResearch/VReason-QwenVL
Base model: Qwen/Qwen2.5-VL-7B-Instruct