
Fine-tuned on a 10K stratified sample of instruction-question-answer triplets gathered from the following sources:

  • Medical meadow flashcards
  • Medical meadow wikidocs
  • HealthcareMagic dataset
  • Medical meadow MedQA MCQs
  • Medinstruct dataset
  • Medquad dataset
  • iCliniq dataset
  • Medical meadow patient info dataset
  • GenMedGPT dataset
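
The card does not say how the 10K sample was stratified. As a minimal sketch, assuming each record carries a `source` field naming one of the datasets above and that the strata are those sources, a proportional stratified draw could look like the following. The function name, the `source` key, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, n, seed=0):
    """Draw n records with per-stratum counts proportional to stratum size."""
    if n > len(records):
        raise ValueError("n exceeds the number of records")
    rng = random.Random(seed)

    # Group records by stratum (ASSUMPTION: the stratum is the source dataset).
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    total = len(records)
    # Ideal fractional quota per stratum, then its integer part.
    exact = {name: n * len(rows) / total for name, rows in strata.items()}
    quota = {name: int(q) for name, q in exact.items()}

    # Give the leftover slots to the largest fractional remainders.
    leftover = n - sum(quota.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - quota[name], reverse=True)
    for name in by_remainder[:leftover]:
        quota[name] += 1

    sample = []
    for name, rows in strata.items():
        sample.extend(rng.sample(rows, quota[name]))
    rng.shuffle(sample)
    return sample

# Toy data standing in for the real triplets.
toy = (
    [{"source": "medquad", "id": i} for i in range(600)]
    + [{"source": "icliniq", "id": i} for i in range(300)]
    + [{"source": "medqa", "id": i} for i in range(100)]
)
subset = stratified_sample(toy, key="source", n=100)
counts = defaultdict(int)
for rec in subset:
    counts[rec["source"]] += 1
print(dict(counts))
```

Proportional allocation keeps each source's share in the sample equal to its share in the pool; an equal-per-source allocation would be an equally valid reading of "stratified".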

Fine-tuned using LoRA for 1 epoch, with rank=64, alpha=16
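
To make the two hyperparameters concrete: in standard LoRA the low-rank update is scaled by alpha / rank, and each adapted weight matrix gains rank × (d_in + d_out) trainable parameters. The sketch below only does that arithmetic. The 4096 × 4096 matrix shape is an assumption for illustration; the card does not list which modules were adapted.

```python
# Values stated in the model card.
rank = 64
alpha = 16

# Standard LoRA scales the low-rank update B @ A by alpha / rank.
scaling = alpha / rank

def lora_param_count(d_out, d_in, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)

# ASSUMPTION: a 4096 x 4096 projection, chosen purely for illustration.
d = 4096
added = lora_param_count(d, d, rank)
full = d * d

print(f"scaling = {scaling}")
print(f"adapter params per matrix = {added:,} of {full:,} in the full matrix")
```

With alpha smaller than rank, the adapter's contribution is scaled down (here by a factor of 0.25) relative to an unscaled update.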

Usage:

Load using vLLM as follows:

from vllm import LLM, SamplingParams

llm = LLM(model="jiviadmin/biomistral-ft-10k")

sampling_params = SamplingParams(
    max_tokens=1,  # set this to the max_seq_length used in the SFT Trainer
    temperature=0.1,
    skip_special_tokens=True,
    repetition_penalty=1.5,
)

input_data = ["<YOUR-INPUT-PROMPT>"]  # replace with your list of input prompts
prompts = []
outputs_ls = []

# Use the same prompt as in training, just without the output part.
# You can add special tokens such as [INST] if needed.
TEMPLATE = """{}"""

def add_prompt(sample):
    prompt = TEMPLATE.format(sample)
    return prompt

for sample in input_data:
    text = add_prompt(sample)
    prompts.append(text)

outputs = llm.generate(prompts, sampling_params) # Batch inference

for output in outputs:
    generated_text = output.outputs[0].text
    outputs_ls.append(generated_text.strip())
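
The snippet above leaves TEMPLATE as a bare `{}` and mentions that special tokens like [INST] can be added. A minimal sketch of that variation follows, assuming Mistral-style `[INST] ... [/INST]` markers. The exact training prompt is not given in this card, so the template string, the helper name, and the sample question are illustrative assumptions only.

```python
# ASSUMPTION: Mistral-style instruction markers. Replace this with the
# exact prompt format that was used during fine-tuning.
INST_TEMPLATE = "[INST] {} [/INST]"

def build_prompts(samples, template=INST_TEMPLATE):
    """Wrap each raw input string in the prompt template."""
    return [template.format(sample.strip()) for sample in samples]

example_prompts = build_prompts(["What are common symptoms of anemia? "])
print(example_prompts[0])
```

The resulting list can be passed to `llm.generate` in place of `prompts`.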

Benchmarks:

| Model | Prompt type | Temperature | Repetition penalty | Overall accuracy | Pubmed accuracy | MedQA accuracy | MedMCQA accuracy | Pubmed questions | MedQA questions | MedMCQA questions |
|---|---|---|---|---|---|---|---|---|---|---|
| Biomistral - FT 10K | No RAG | 0.1 | 1.5 | 44.19% | 51.36% | 38.29% | 43.58% | 847 | 935 | 888 |
| Biomistral - FT 10K | RAG: highest-scoring chunk selected | 0.1 | 1.5 | 82.08% | 95.89% | 67.68% | 82.48% | 998 | 984 | 959 |
| Biomistral - FT 10K | RAG: reranker (BGE V2 M3) used to select chunk | 0.1 | 1.5 | 86.44% | 97.50% | 73.28% | 88.47% | 999 | 988 | 971 |
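
The card does not state how the overall accuracy is derived. The reported figures are consistent with pooling the correct answers across the three benchmarks and dividing by the total question count, which the following sketch checks. The pooling rule is an assumption, and the short row labels are shorthand for the prompt types in the benchmark results.

```python
# (accuracy in %, question count) per benchmark, copied from the benchmark results.
rows = {
    "No RAG": {"Pubmed": (51.36, 847), "MedQA": (38.29, 935), "MedMCQA": (43.58, 888)},
    "RAG, highest-scoring chunk": {"Pubmed": (95.89, 998), "MedQA": (67.68, 984), "MedMCQA": (82.48, 959)},
    "RAG, BGE V2 M3 reranker": {"Pubmed": (97.50, 999), "MedQA": (73.28, 988), "MedMCQA": (88.47, 971)},
}

def overall_accuracy(benchmarks):
    """Pooled accuracy in %: total correct answers over total questions."""
    # Recover the integer number of correct answers from each rounded percentage.
    correct = sum(round(acc / 100 * n) for acc, n in benchmarks.values())
    total = sum(n for _, n in benchmarks.values())
    return 100 * correct / total

for label, benchmarks in rows.items():
    print(f"{label}: {overall_accuracy(benchmarks):.2f}%")
```

Because the question counts differ between rows, a plain average of the three per-benchmark accuracies would not reproduce the reported overall numbers.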

Model size: 7.24B params (Safetensors). Tensor type: F32.