library_name: transformers
tags:
- biology
- medical
zebra-Llama/zebra-Llama-v0.2
Zebra-Llama v0.2 is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.
The model is trained using a specialized approach called "context-aware training," where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.
What is new in this version of zebra-Llama?
Compared to the previous version (zebraLLAMA/zebra-Llama-v0.1), the latest Zebra-Llama model delivers more comprehensive and in-depth explanations for questions about the rare disease Ehlers-Danlos Syndrome.
The latest version has a greater ability to provide citations consistently compared to the previous version.
In addition to improved citation ability, it has also been benchmarked against the base model (meta-llama/Llama-3.1-8B-Instruct) and demonstrates superior text generation capabilities in terms of thoroughness, accuracy, and clarity, based on expert evaluation.
From a modeling perspective, the latest version utilizes "meta-llama/Llama-3.1-8B-Instruct" as its base model, while the earlier version (v0.1) was built on "meta-llama/Meta-Llama-3-8B-Instruct".
Model Details
Base model : meta-llama/Llama-3.1-8B-Instruct
Model Sources
Repository: https://github.com/karthiksoman/zebra-Llama
Custom built RAG API for rare diseases (focused on EDS):
• Base URL: https://zebra-llama-rag.onrender.com
• Endpoint: /search
Jupyter Notebook Demo of Zebra-Llama:
https://github.com/karthiksoman/zebra-Llama/blob/main/code/notebook/zebra_llama_v0.2_demo.ipynb
Uses
Zebra-Llama can be used to generate answers related to EDS questions.
Out-of-Scope Use
This Language Model is intended for academic and research purposes only. It is not for clinical use or medical decision-making. Consult a healthcare professional for medical advice.
Training Details
Fine tuning method : LoRA
LoRA rank : 16
LoRA alpha : 16
LORA dropout : 0.01
LORA target modules : ["q_proj", "k_proj", "v_proj"]
Train epochs : 2
Learning rate : 1e-4
LR scheduler type : constant
Max grad norm : 1
BATCH_SIZE_PER_GPU_FOR_TRAINING : 2
GRADIENT_ACCUMULATION_STEPS : 1
Citation
@misc{soman2024zebrallamacontextawarelargelanguage,
title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge},
author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},
year={2024},
eprint={2411.02657},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2411.02657},
}
Contact
Dr. Karthik Soman - karthi.soman@gmail.com
Andrew Langdon - andrewlngdn@gmail.com
Chinmay Agrawal - chag7212@colorado.edu
Catalina Villouta - catalina.villouta.r@gmail.com
Dr. Orion Buske - orion@phenotips.com
Lashaw Salta - lashawsalta@gmail.com