You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

DentVLM: A Multimodal Vision Language Model for Comprehensive Dental Diagnosis and Enhanced Clinical Practice

DentVLM is a multimodal vision-language model designed for dental image understanding and diagnosis-oriented question answering. It supports dental image-question inputs for tasks such as malocclusion recognition, dental disease recognition, and region-aware dental image analysis.

This model is released for non-commercial academic and research use only. Model outputs should be interpreted as research-oriented assistance and should not be used as the sole basis for clinical diagnosis, treatment planning, or other medical decisions.

Model Access

The DentVLM model weights are provided through Hugging Face gated access.

To request access, please submit an access request on this model page and agree to the terms of use. Access requests are reviewed manually by the authors.

For access-related questions, please contact Z.L. (zuozhuliu@intl.zju.edu.cn).

Intended Use

DentVLM is intended for:

Academic research on dental multimodal vision-language modeling
Dental image understanding and question answering research
Reproduction and extension of the DentVLM training, inference, and evaluation pipeline
Benchmarking on dental multimodal tasks

Out-of-Scope Use

DentVLM is not intended for:

Use as the sole basis for clinical diagnosis or treatment decisions
Emergency medical or dental decision-making
Commercial use without separate written permission
Redistribution, re-hosting, mirroring, resale, or sublicensing of the model weights
Unlawful, harmful, privacy-invasive, or unethical applications

GitHub Repository

The source code, training scripts, inference scripts, evaluation scripts, and example data format are available at:

https://github.com/ZJUI-AI4H/DentVLM

Limitations

DentVLM is developed as a research model to support dental image understanding and diagnosis-oriented question answering. Its outputs should be interpreted in the context of professional expertise and task-specific evaluation.
Model performance may vary with image quality, imaging modality, acquisition conditions, and prompt formulation.
For applications involving new clinical environments, imaging protocols, or patient populations, users are encouraged to conduct appropriate validation before downstream use.
DentVLM is intended to assist research and evaluation workflows, not to replace professional dental or medical judgment.

Ethical Considerations

Users should ensure that all dental images and associated data are collected, processed, and used in compliance with applicable privacy, consent, institutional review, and data protection requirements.

The model should not be used for automated clinical decision-making without appropriate validation, oversight, and regulatory approval.

Acknowledgements

This project builds on:

LLaMA-Factory
Qwen2-VL
vLLM

We thank the authors and contributors of these projects.

Downloads last month: -

Safetensors

Model size

8B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ZJU-AI4H/DentVLM

Base model

Qwen/Qwen2-VL-7B

Finetuned

Qwen/Qwen2-VL-7B-Instruct

Finetuned

(602)

this model