
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

📃 Paper | 🤗 MedReason-8B | 📚 MedReason Data

⚡Introduction

MedReason is a large-scale, high-quality medical reasoning dataset designed to enable faithful and explainable medical problem-solving in large language models (LLMs).

  • We utilize a structured medical knowledge graph (KG) to convert clinical QA pairs into logical chains of reasoning, or “thinking paths”.
  • Our pipeline generates detailed reasoning for various medical questions from 7 medical datasets, resulting in a dataset of 32,682 question-answer pairs, each with detailed, step-by-step explanations.
  • By fine-tuning on the proposed MedReason dataset, our best model, MedReason-8B, achieves state-of-the-art performance.
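
To make the idea of a KG-derived "thinking path" concrete, here is a minimal toy sketch. It is not the MedReason pipeline: the triples, the relation names, and the helper functions `build_adjacency`, `find_path`, and `render_path` are all hypothetical, and a plain breadth-first search stands in for whatever path retrieval the actual pipeline uses. It only illustrates how a question entity and an answer entity might be linked by a chain of KG relations.

```python
from collections import deque

# Hypothetical toy knowledge graph as (head, relation, tail) triples.
# These are illustrative only and are NOT taken from the MedReason KG.
TRIPLES = [
    ("persistent cough", "symptom_of", "asthma"),
    ("asthma", "treated_by", "inhaled corticosteroid"),
    ("persistent cough", "symptom_of", "GERD"),
    ("GERD", "treated_by", "proton pump inhibitor"),
]


def build_adjacency(triples):
    """Index the directed triples by head entity."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((rel, tail))
        adj.setdefault(tail, [])  # make sure leaf entities are present too
    return adj


def find_path(adj, start, goal):
    """Breadth-first search for a shortest directed path from start to goal.

    Returns a list of (head, relation, tail) steps, or None if unreachable.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adj.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


def render_path(path):
    """Format a path as a single readable chain."""
    if not path:
        return ""
    parts = [path[0][0]]
    for _, rel, tail in path:
        parts.append(f"-[{rel}]-> {tail}")
    return " ".join(parts)


adj = build_adjacency(TRIPLES)
# Question entity: "persistent cough"; answer entity: "inhaled corticosteroid".
path = find_path(adj, "persistent cough", "inhaled corticosteroid")
print(render_path(path))
```

In a sketch like this, the rendered chain is the kind of structured evidence that could then be turned into a step-by-step explanation for the QA pair.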

We open-sourced our model here.

👨‍⚕️ Model

  • Model Access
| Model | Base Model | Link |
| --- | --- | --- |
| MedReason-8B | HuatuoGPT-o1-8B | Link |
| MedReason-Llama | Llama-3.1-8B-Instruct | Link |
| MedReason-Mistral | Mistral-7B-Instruct-v0.2 | Link |
  • Deploy: we provide example code for direct inference with MedReason-8B.

    MedReason-8B can also be deployed with tools such as vLLM or SGLang; code for deployment with SGLang is provided in ./src/evaluation/eval.py.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the safetensors weights and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "UCSC-VLAA/MedReason-8B",
    torch_dtype="auto",
    device_map="auto",
    use_safetensors=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(
    "UCSC-VLAA/MedReason-8B", trust_remote_code=True, padding_side="left"
)

input_text = "How to stop a cough?"
messages = [{"role": "user", "content": input_text}]

# Render the chat template to a prompt string, then tokenize it.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
# The decoded text contains the prompt followed by the model's answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
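
For the server-style deployment mentioned above, the following is a minimal client sketch using only the Python standard library. It is not the code in ./src/evaluation/eval.py. It assumes a server has already been started separately (for example with `vllm serve UCSC-VLAA/MedReason-8B` or `python -m sglang.launch_server --model-path UCSC-VLAA/MedReason-8B --port 30000`) and that it exposes an OpenAI-compatible chat completions endpoint. `BASE_URL`, `build_chat_request`, and `ask` are hypothetical names chosen for this example.

```python
import json
import urllib.request

# Assumption: hypothetical local address of an already-running vLLM or SGLang
# server with an OpenAI-compatible API. Adjust host and port to your setup.
BASE_URL = "http://localhost:30000/v1"
MODEL_NAME = "UCSC-VLAA/MedReason-8B"


def build_chat_request(question, max_tokens=2048):
    """Build the JSON body for a single-turn chat completion request."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
    }


def ask(question, base_url=BASE_URL, timeout=300):
    """Send the question to the server and return the generated text."""
    payload = json.dumps(build_chat_request(question)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the answer in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Once a server is running, `ask("How to stop a cough?")` would return the model's response as a string. Nothing is sent over the network until `ask` is called.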

🙏🏼 Acknowledgement

We gratefully acknowledge the inspiring work of HuatuoGPT-o1, which laid important groundwork for this research. We also thank the developers of the excellent tools curator, trl, and sglang for making this work possible.

📖 Citation

@misc{wu2025medreasonelicitingfactualmedical,
      title={MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs}, 
      author={Juncheng Wu and Wenlong Deng and Xingxuan Li and Sheng Liu and Taomian Mi and Yifan Peng and Ziyang Xu and Yi Liu and Hyunjin Cho and Chang-In Choi and Yihan Cao and Hui Ren and Xiang Li and Xiaoxiao Li and Yuyin Zhou},
      year={2025},
      eprint={2504.00993},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.00993}, 
}

Model size: 8.03B params · Tensor type: BF16 · Format: Safetensors