Introducing JiviMed-8B_v1: The Cutting-Edge Biomedical Language Model
JiviMed-8B is a language model built specifically for the biomedical sector. Developed by Jivi AI, it incorporates recent advances in fine-tuning to deliver strong performance across a wide range of biomedical applications.
Tailored for Medicine: JiviMed-8B is meticulously designed to cater to the specialized language and knowledge requirements of the medical and life sciences industries. It has been fine-tuned using an extensive collection of high-quality biomedical data, enhancing its ability to accurately comprehend and generate domain-specific text.
Unmatched Performance: With 8 billion parameters, JiviMed-8B outperforms other open-source biomedical language models of similar size. It demonstrates superior results over larger models, both proprietary and open-source, such as GPT-3.5, Meditron-70B, and Gemini 1.0, in various biomedical benchmarks.
Enhanced Training Methodologies: JiviMed-8B builds on the Meta-Llama-3-8B model, combining an ORPO (odds ratio preference optimization) dataset with a refined fine-tuning strategy and a specially curated, diverse medical instruction dataset. Key elements of the training process include:
1. Intensive Data Preparation: Over 100,000 data points have been meticulously curated to ensure the model is well-versed in the nuances of biomedical language.
2. Hyperparameter Tuning: Hyperparameters are tuned to improve learning efficiency while avoiding catastrophic forgetting, so that performance stays robust across tasks.
JiviMed-8B redefines what's possible in biomedical language modeling, setting new standards for accuracy, versatility, and performance in the medical domain.
Model Comparison
Model Name | Average | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA |
---|---|---|---|---|---|---|---|---|---|---|
Jivi_medium_med_v1 | 75.53 | 60.1 | 60.04 | 77.04 | 82.26 | 86.81 | 73.41 | 86 | 80.08 | 72.6 |
Flan-PaLM | 74.7 | 57.6 | 67.6 | 63.7 | 80.4 | 88.9 | 76.3 | 75 | 83.8 | 79 |
winninghealth/WiNGPT2-Llama-3-8B-Base | 72.1 | 55.65 | 67.87 | 69.63 | 75.09 | 78.47 | 65.9 | 84 | 78.68 | 73.6 |
meta-llama/Meta-Llama-3-8B | 69.9 | 57.47 | 59.7 | 68.89 | 74.72 | 78.47 | 61.85 | 83 | 70.22 | 74.8 |
meta-llama/Meta-Llama-3-8B | 69.81 | 57.69 | 60.02 | 68.89 | 74.72 | 78.47 | 60.12 | 83 | 70.22 | 75.2 |
unsloth/gemma-7b | 64.18 | 48.96 | 47.21 | 59.26 | 69.81 | 79.86 | 60.12 | 70 | 66.18 | 76.2 |
mistralai/Mistral-7B-v0.1 | 62.85 | 48.2 | 50.82 | 55.56 | 68.68 | 68.06 | 59.54 | 71 | 68.38 | 75.4 |
BioMistral/BioMistral-7B-Zephyr-Beta-SLeRP | 61.52 | 46.52 | 50.2 | 55.56 | 63.02 | 65.28 | 61.27 | 72 | 63.24 | 76.6 |
BioMistral/BioMistral-7B-SLERP | 59.58 | 44.13 | 47.29 | 51.85 | 66.42 | 65.28 | 58.96 | 69 | 55.88 | 77.4 |
BioMistral/BioMistral-7B-DARE | 59.45 | 44.66 | 47.37 | 53.33 | 66.42 | 62.5 | 58.96 | 68 | 56.25 | 77.6 |
OpenModels4all/gemma-1-7b-it | 58.37 | 44.56 | 45.01 | 52.59 | 62.64 | 68.75 | 57.23 | 67 | 55.15 | 72.4 |
medalpaca/medalpaca-7b | 58.03 | 37.51 | 41.71 | 57.04 | 57.36 | 65.28 | 54.34 | 69 | 67.28 | 72.8 |
BioMistral/BioMistral-7B | 56.36 | 41.48 | 46.11 | 51.11 | 63.77 | 61.11 | 53.76 | 66 | 52.94 | 71 |
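The Average column appears to be the unweighted mean of the nine benchmark columns, rounded to two decimals. The minimal sketch below checks that reading against two rows copied from the table; the helper name `average_score` is hypothetical and used only for illustration.

```python
from statistics import mean

# Scores copied from the table above, in column order:
# MedMCQA, MedQA, MMLU Anatomy, MMLU Clinical Knowledge, MMLU College Biology,
# MMLU College Medicine, MMLU Medical Genetics, MMLU Professional Medicine, PubMedQA
ROWS = {
    "winninghealth/WiNGPT2-Llama-3-8B-Base": (
        [55.65, 67.87, 69.63, 75.09, 78.47, 65.9, 84, 78.68, 73.6],
        72.1,  # Average reported in the table
    ),
    "unsloth/gemma-7b": (
        [48.96, 47.21, 59.26, 69.81, 79.86, 60.12, 70, 66.18, 76.2],
        64.18,  # Average reported in the table
    ),
}


def average_score(scores):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(mean(scores), 2)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {average_score(scores)}, reported {reported}")
```

Under this assumption every benchmark carries equal weight, so a small suite such as MMLU Medical Genetics influences the average as much as a large one such as MedMCQA.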
Hyperparameters:
Peft
- lora_r: 64
- lora_alpha: 128
- lora_dropout: 0.05
- lora_target_linear: true
Target_Modules
- q_proj
- v_proj
- k_proj
- o_proj
- gate_proj
- down_proj
- up_proj
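To give a sense of what these settings imply, the sketch below estimates the LoRA scaling factor and the number of trainable adapter parameters. It is an illustration rather than the training code. The layer shapes and layer count are assumptions taken from the public Meta-Llama-3-8B configuration, not from this card. The helper `lora_params` is hypothetical, and dropout is ignored because it adds no parameters.

```python
# LoRA settings copied from the hyperparameter list above.
LORA_R = 64
LORA_ALPHA = 128

# Assumption: (in_features, out_features) of each targeted projection in
# Meta-Llama-3-8B (hidden size 4096, MLP size 14336, 8 KV heads of dim 128).
LLAMA3_8B_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
NUM_LAYERS = 32  # Assumption: decoder layers in Meta-Llama-3-8B


def lora_params(in_features, out_features, r):
    """Parameters in one adapter: A is (r x in_features), B is (out_features x r)."""
    return r * (in_features + out_features)


# LoRA scales the low-rank update by alpha / r.
scaling = LORA_ALPHA / LORA_R

per_layer = sum(lora_params(i, o, LORA_R) for i, o in LLAMA3_8B_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS

print(f"scaling factor: {scaling}")
print(f"adapter parameters per layer: {per_layer:,}")
print(f"total adapter parameters: {total_trainable:,}")
```

Under these assumptions the adapters add roughly 168 million trainable parameters, about 2% of the 8 billion parameters in the base model, while the base weights stay frozen.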