# Introducing JiviMed-8B_v1: A Cutting-Edge Biomedical Language Model

JiviMed-8B is a language model tailored specifically for the biomedical sector. Developed by Jivi AI, it incorporates recent advances in training and fine-tuning to deliver strong performance across a wide range of biomedical applications.

**Tailored for Medicine:** JiviMed-8B is designed around the specialized language and knowledge requirements of the medical and life sciences industries. It has been fine-tuned on an extensive collection of high-quality biomedical data, improving its ability to accurately comprehend and generate domain-specific text.
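
As a quick illustration of this domain-specific usage, here is a minimal inference sketch using the Hugging Face `transformers` pipeline. The repository id `jiviai/JiviMed-8B_v1` is an assumption; substitute the actual id from the Hugging Face Hub.

```python
# Minimal inference sketch with Hugging Face transformers.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="jiviai/JiviMed-8B_v1",  # hypothetical repo id -- substitute the real one
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Summarize the first-line treatment options for type 2 diabetes."
result = generator(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])
```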

**Unmatched Performance:** With 8 billion parameters, JiviMed-8B outperforms other open-source biomedical language models of similar size. It also surpasses larger proprietary and open-source models, such as GPT-3.5, Meditron-70B, and Gemini 1.0, on several biomedical benchmarks.

**Enhanced Training Methodologies:** JiviMed-8B builds on the Meta-Llama-3-8B base model, combining ORPO (Odds Ratio Preference Optimization) preference tuning, a refined fine-tuning strategy, and a specially curated, diverse medical instruction dataset; a sketch of this setup follows the list below. Key elements of the training process include:

1. **Intensive Data Preparation:** Over 100,000 data points were meticulously curated to ensure the model is well versed in the nuances of biomedical language.
2. **Hyperparameter Tuning:** Hyperparameters were carefully optimized to improve learning efficiency while avoiding catastrophic forgetting, maintaining robust performance across tasks.
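
The sketch below shows what such an ORPO fine-tuning stage could look like with the TRL library. The dataset id, output path, and hyperparameter values are illustrative assumptions, not the configuration actually used to train JiviMed-8B.

```python
# Hypothetical ORPO fine-tuning sketch using Hugging Face TRL.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "meta-llama/Meta-Llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

# ORPO expects preference data: each row has "prompt", "chosen", "rejected".
dataset = load_dataset("my-org/medical-preference-pairs", split="train")  # hypothetical id

config = ORPOConfig(
    output_dir="jivimed-8b-orpo",      # illustrative path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=8e-6,
    num_train_epochs=1,
    beta=0.1,                          # weight of the odds-ratio preference term
    max_length=2048,
    max_prompt_length=1024,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```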

JiviMed-8B redefines what's possible in biomedical language modeling, setting new standards for accuracy, versatility, and performance in the medical domain.

## Model Comparison

| Model Name | Average | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA |
|---|---|---|---|---|---|---|---|---|---|---|
| **Jivi_medium_med_v1** | 75.53 | 60.1 | 60.04 | 77.04 | 82.26 | 86.81 | 73.41 | 86 | 80.08 | 72.6 |
| Flan-PaLM | 74.7 | 57.6 | 67.6 | 63.7 | 80.4 | 88.9 | 76.3 | 75 | 83.8 | 79 |
| winninghealth/WiNGPT2-Llama-3-8B-Base | 72.1 | 55.65 | 67.87 | 69.63 | 75.09 | 78.47 | 65.9 | 84 | 78.68 | 73.6 |
| meta-llama/Meta-Llama-3-8B | 69.9 | 57.47 | 59.7 | 68.89 | 74.72 | 78.47 | 61.85 | 83 | 70.22 | 74.8 |
| meta-llama/Meta-Llama-3-8B | 69.81 | 57.69 | 60.02 | 68.89 | 74.72 | 78.47 | 60.12 | 83 | 70.22 | 75.2 |
| unsloth/gemma-7b | 64.18 | 48.96 | 47.21 | 59.26 | 69.81 | 79.86 | 60.12 | 70 | 66.18 | 76.2 |
| mistralai/Mistral-7B-v0.1 | 62.85 | 48.2 | 50.82 | 55.56 | 68.68 | 68.06 | 59.54 | 71 | 68.38 | 75.4 |
| BioMistral/BioMistral-7B-Zephyr-Beta-SLeRP | 61.52 | 46.52 | 50.2 | 55.56 | 63.02 | 65.28 | 61.27 | 72 | 63.24 | 76.6 |
| BioMistral/BioMistral-7B-SLERP | 59.58 | 44.13 | 47.29 | 51.85 | 66.42 | 65.28 | 58.96 | 69 | 55.88 | 77.4 |
| BioMistral/BioMistral-7B-DARE | 59.45 | 44.66 | 47.37 | 53.33 | 66.42 | 62.5 | 58.96 | 68 | 56.25 | 77.6 |
| OpenModels4all/gemma-1-7b-it | 58.37 | 44.56 | 45.01 | 52.59 | 62.64 | 68.75 | 57.23 | 67 | 55.15 | 72.4 |
| medalpaca/medalpaca-7b | 58.03 | 37.51 | 41.71 | 57.04 | 57.36 | 65.28 | 54.34 | 69 | 67.28 | 72.8 |
| BioMistral/BioMistral-7B | 56.36 | 41.48 | 46.11 | 51.11 | 63.77 | 61.11 | 53.76 | 66 | 52.94 | 71 |
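
Scores like these are typically reproduced with EleutherAI's lm-evaluation-harness. The sketch below uses its v0.4 Python API; the task names and the model id are assumptions, since the card does not state the exact evaluation configuration.

```python
# Hypothetical benchmark-reproduction sketch with lm-evaluation-harness (v0.4 API).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B,dtype=bfloat16",  # substitute the model under test
    tasks=[
        "medmcqa",
        "medqa_4options",
        "mmlu_anatomy",
        "mmlu_clinical_knowledge",
        "mmlu_college_biology",
        "mmlu_college_medicine",
        "mmlu_medical_genetics",
        "mmlu_professional_medicine",
        "pubmedqa",
    ],
    batch_size=8,
)
print(results["results"])
```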

![model_accuracy](https://cdn-uploads.huggingface.co/production/uploads/65d31285220242a508a30523/sBHSX5Z0n0V1jTUpAxzX8.png)

<details>

<summary>Hyperparameters</summary>

PEFT (LoRA)

* lora_r: 64
* lora_alpha: 128
* lora_dropout: 0.05
* lora_target_linear: true

Target modules

* q_proj
* v_proj
* k_proj
* o_proj
* gate_proj
* down_proj
* up_proj

</details>
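
These values map directly onto a `peft` `LoraConfig`. The block below is an illustrative reconstruction of that configuration, not the original training script.

```python
# Illustrative LoraConfig mirroring the hyperparameters listed above.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj",
    ],
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the fraction of trainable weights
```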