Tags: Text Generation · Transformers · Safetensors · mistral · conversational · Inference Endpoints · text-generation-inference

Fine-tuned on a 10K stratified sample of instruction-question-answer triplets gathered from the following sources (a sketch of the sampling step follows the list):

  • Medical meadow flashcards
  • Medical meadow wikidocs
  • HealthcareMagic dataset
  • Medical meadow MedQA MCQs
  • Medinstruct dataset
  • Medquad dataset
  • iCliniq dataset
  • Medical meadow patient info dataset
  • GenMedGPT dataset
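
The card does not describe the stratification in detail; below is a minimal sketch of one way to draw a source-stratified 10K sample, assuming the sources have been merged into a single pandas DataFrame with a `source` column (the column name and the proportional-allocation scheme are assumptions, not taken from the card):

import pandas as pd

def stratified_sample(df: pd.DataFrame, total: int = 10_000, seed: int = 42) -> pd.DataFrame:
    # Allocate the sample budget proportionally to each source's share of the pool,
    # then sample without replacement within each source and shuffle the result.
    parts = []
    for _, group in df.groupby("source"):
        n = max(1, round(total * len(group) / len(df)))
        parts.append(group.sample(n=min(n, len(group)), random_state=seed))
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)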

Fine-tuned using LoRA for 1 epoch, with rank = 64 and alpha = 16.
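
For reference, this corresponds roughly to the following PEFT configuration (a minimal sketch; the dropout, bias setting, and target modules are assumptions, not stated in the card):

from peft import LoraConfig

lora_config = LoraConfig(
    r=64,               # LoRA rank, as stated above
    lora_alpha=16,      # LoRA alpha, as stated above
    lora_dropout=0.05,  # assumption: not specified in the card
    bias="none",
    task_type="CAUSAL_LM",
    # assumption: typical attention projections for a Mistral-style model
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)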

Usage:

Load using vLLM as follows:

from vllm import LLM, SamplingParams

llm = LLM(model="jiviadmin/biomistral-ft-10k")

sampling_params = SamplingParams(
    max_tokens=1,  # placeholder: set this to the max_seq_length used in the SFT Trainer
    temperature=0.1,
    skip_special_tokens=True,
    repetition_penalty=1.5,
)

input_data = <YOUR-INPUT-PROMPTS-AS-A-LIST>
prompts = []
outputs_ls = []

TEMPLATE = """{}"""  # same prompt format as used in training, without the output part; add special tokens like [INST] if needed

def add_prompt(sample):
    prompt = TEMPLATE.format(sample)
    return prompt

for sample in input_data:
    text = add_prompt(sample)
    prompts.append(text)

outputs = llm.generate(prompts, sampling_params) # Batch inference

for output in outputs:
    generated_text = output.outputs[0].text
    outputs_ls.append(generated_text.strip())
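
If the training prompts used Mistral-style instruction tags, the TEMPLATE can wrap each input accordingly (a sketch under that assumption; match it to the exact prompt format used during fine-tuning):

TEMPLATE = """[INST] {} [/INST]"""  # assumption: Mistral-style instruction wrapping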

Benchmarks:

| Model | Prompt type | Temp | Repetition penalty | Overall accuracy | PubMed | MedQA | MedMCQA | PubMed questions | MedQA questions | MedMCQA questions |
|---|---|---|---|---|---|---|---|---|---|---|
| Biomistral-FT-10K | No RAG | 0.1 | 1.5 | 44.19% | 51.36% | 38.29% | 43.58% | 847 | 935 | 888 |
| Biomistral-FT-10K | RAG: highest-scoring chunk selected | 0.1 | 1.5 | 82.08% | 95.89% | 67.68% | 82.48% | 998 | 984 | 959 |
| Biomistral-FT-10K | RAG: reranker (BGE V2 M3) used to select chunk | 0.1 | 1.5 | 86.44% | 97.50% | 73.28% | 88.47% | 999 | 988 | 971 |
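
The reranker row refers to scoring the retrieved chunks with a cross-encoder and keeping the best one before prompting. A minimal sketch of that selection step, assuming the BAAI/bge-reranker-v2-m3 model via the FlagEmbedding library (the retrieval step itself is not shown):

from FlagEmbedding import FlagReranker

reranker = FlagReranker("BAAI/bge-reranker-v2-m3", use_fp16=True)

def select_best_chunk(question: str, chunks: list[str]) -> str:
    # Score each (question, chunk) pair and keep the highest-scoring chunk.
    scores = reranker.compute_score([[question, chunk] for chunk in chunks])
    return chunks[max(range(len(chunks)), key=lambda i: scores[i])]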
Model size: 7.24B parameters (F32 tensors, Safetensors format)
