
abhishek-ch/biomistral-7b-synthetic-ehr


This model was converted to MLX format from BioMistral/BioMistral-7B-DARE. Refer to the original model card for more details on the model.

Use with mlx

pip install mlx-lm
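
As a quick smoke test, the mlx-lm command-line generator can run the model straight from the Hub (a minimal sketch; flag names can vary slightly across mlx-lm versions):

python -m mlx_lm.generate --model abhishek-ch/biomistral-7b-synthetic-ehr \
    --prompt "Hello" --max-tokens 64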

The model was LoRA fine-tuned on the health_facts dataset and a synthetic EHR dataset inspired by MIMIC-IV, using the prompt format below, for 1000 steps (~1M tokens) with mlx.

def format_prompt(prompt: str, question: str) -> str:
    """Wrap a system prompt and a user question in the [INST] format used for fine-tuning."""
    return """<s>[INST]
## Instructions
{}
## User Question
{}.
[/INST]</s>
""".format(prompt, question)

Example for Synthetic EHR Diagnosis System Prompt

You are an expert in providing diagnosis summaries based on clinical notes inspired by the MIMIC-IV-Note dataset.
These notes encompass the Chief Complaint along with the Patient Summary and medical admission details.

Example for Healthfacts Check System Prompt

You are a Public Health AI Assistant. You can fact-check public health claims.
Each answer is labelled true, false, unproven, or mixture.
Please provide the reason behind the answer.
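
For reference, runs like this can be reproduced with the mlx-lm LoRA entry point. The sketch below assumes a recent mlx-lm and a hypothetical data/ directory of train/valid JSONL files in the prompt format above; exact flags may differ across versions:

python -m mlx_lm.lora --model BioMistral/BioMistral-7B-DARE \
    --train --data data/ --iters 1000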

Loading the model using mlx

from mlx_lm import generate, load

model, tokenizer = load("abhishek-ch/biomistral-7b-synthetic-ehr")
response = generate(
    model,
    tokenizer,
    prompt=format_prompt(prompt, question),  # system prompt and question as in the examples above
    verbose=True,  # set to True to see the prompt and response
    temp=0.0,
    max_tokens=512,
)
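
With temp=0.0 decoding is greedy, so a given prompt always yields the same output; raise the temperature for more varied generations. Note that prompt and question must be defined before the call (for example, one of the system prompts above plus a user question), and that newer mlx-lm releases may expose temperature through a sampler argument instead of temp.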

Loading the model using transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "abhishek-ch/biomistral-7b-synthetic-ehr"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.to("mps")  # Apple Silicon GPU; use "cuda" or "cpu" as appropriate

input_text = format_prompt(system_prompt, question)  # system prompt and question as above
input_ids = tokenizer(input_text, return_tensors="pt").to("mps")
outputs = model.generate(
    **input_ids,
    max_new_tokens=512,
)
print(tokenizer.decode(outputs[0]))
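
The decode above prints the prompt together with the completion. To see only the generated continuation, you can slice off the prompt tokens (a small convenience, not part of the original card):

prompt_len = input_ids["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))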