This model is trained from the model: MonteXiaofeng/CareBot_Medical_multi-llama3-8b-base, training data is: BAAI/IndustryInstruction_Health-Medicine, To enhance the model's ability to follow medical instructions and better adapt to specific medical scenarios, we conduct the supervised fine-tuning. This process involves using conversational-style data (comprising both queries and responses) to finetune the pretrained LLM. In the following sections, we will explore the details of data construction and training methods.

Data Construction

Our SFT dataset comprises a diverse array of question types, including multiple-choice questions from medical exams, single-turn disease diagnoses, and multi-turn health consultations. It integrates data from seven publicly available sources: Chinese Medical Dialogue Data\footnote{https://github.com/Toyhom/Chinese-medical-dialogue-data}, Huatuo26M , MedDialog , ChatMed Consult Dataset , ChatDoctor , CMB\footnote{https://github.com/FreedomIntelligence/CMB}, and MedQA . We preserve portions of authentic doctor-patient conversations and augment the dataset by rewriting the remaining content. For these rewrites, we use real-world medical scenarios as prompts and generate responses via GPT-4. We believe this ensures the diversity of the SFT dataset, which can help the CareBot better adapt to different types of medical problems and patient situations, thereby improving its performance in a variety of scenarios.

evaluation

evaluation on benchmark is bellow. image/png

image/png

gsb result with other medical LLMS image/png

Downloads last month
17
Safetensors
Model size
8.03B params
Tensor type
BF16
·
Inference API
Unable to determine this model's library. Check the docs .

Model tree for MonteXiaofeng/CareBot_Medical_multi-llama3-8b-instruct

Finetuned
(1)
this model
Finetunes
1 model

Datasets used to train MonteXiaofeng/CareBot_Medical_multi-llama3-8b-instruct