
Medichat-Llama3-8B

Built on the Llama 3 architecture and fine-tuned on an extensive dataset of health information, this model is designed to give clear, comprehensive answers to medical questions.

This model was produced with mergekit using the following YAML configuration:


models:
  - model: Undi95/Llama-3-Unholy-8B
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
  - model: Locutusque/llama-3-neural-chat-v1-8b
  - model: ruslanmv/Medical-Llama3-8B-16bit
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: Locutusque/llama-3-neural-chat-v1-8b
parameters:
  int8_mask: true
dtype: bfloat16
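
To make the configuration more concrete, here is a minimal NumPy sketch of the two ideas it relies on: list-valued `weight` and `density` settings that vary across layers, and the `dare_ties` merge itself. It is an illustration under stated assumptions, not mergekit's actual implementation. It assumes the five-element lists are linearly interpolated over the 32 transformer layers of an 8B Llama 3 model, that DARE drops each delta element with probability `1 - density` and rescales survivors by `1 / density`, and that the TIES step keeps only deltas whose sign matches the sign of the weighted sum. The function names (`layer_values`, `dare_prune`, `dare_ties_merge`) and the toy tensors are hypothetical, and optional weight normalization is omitted.

```python
import numpy as np


def layer_values(anchors, num_layers):
    """Linearly interpolate a short list of anchor values across all layers."""
    anchor_pos = np.linspace(0.0, 1.0, num=len(anchors))
    layer_pos = np.linspace(0.0, 1.0, num=num_layers)
    return np.interp(layer_pos, anchor_pos, anchors)


def dare_prune(delta, density, rng):
    """DARE step: keep each element with probability `density`, rescale by 1/density."""
    keep = rng.random(delta.shape) < density
    return np.where(keep, delta / density, 0.0)


def dare_ties_merge(base, finetuned, weights, densities, rng):
    """Merge fine-tuned tensors into `base` with DARE pruning and TIES sign election."""
    # Weighted, pruned task vectors (fine-tuned minus base), one row per model.
    deltas = np.stack([
        w * dare_prune(ft - base, d, rng)
        for ft, w, d in zip(finetuned, weights, densities)
    ])
    # Sign election: per element, the sign of the summed weighted deltas wins.
    elected = np.sign(deltas.sum(axis=0))
    agree = (np.sign(deltas) == elected) & (elected != 0)
    # Only deltas that agree with the elected sign contribute to the result.
    return base + np.where(agree, deltas, 0.0).sum(axis=0)


# Per-layer settings for the two non-base models, using the lists from the config.
NUM_LAYERS = 32  # assumption: layer count of an 8B Llama 3 model
unholy_weight = layer_values([0.25, 0.35, 0.45, 0.35, 0.25], NUM_LAYERS)
medical_weight = layer_values([0.55, 0.45, 0.35, 0.45, 0.55], NUM_LAYERS)
density = layer_values([0.1, 0.25, 0.5, 0.25, 0.1], NUM_LAYERS)

# Toy stand-ins for one weight tensor of the base and the two merged models.
rng = np.random.default_rng(0)
base = rng.normal(size=8)
model_a = base + rng.normal(scale=0.1, size=8)
model_b = base + rng.normal(scale=0.1, size=8)

layer = 16  # a middle layer, where density is close to its maximum
merged = dare_ties_merge(
    base,
    [model_a, model_b],
    weights=[unholy_weight[layer], medical_weight[layer]],
    densities=[density[layer], density[layer]],
    rng=rng,
)
print(merged)
```

Under this reading, the medical model dominates the first and last layers (weight 0.55 against 0.25), while the middle layers retain the largest share of delta parameters (density near 0.5).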

Usage:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Medichat-Llama3-8B")
model = AutoModelForCausalLM.from_pretrained("sethuiyer/Medichat-Llama3-8B", torch_dtype="auto").to("cuda")  # torch_dtype="auto" keeps the checkpoint's BF16 weights instead of upcasting

# Function to format and generate response with prompt engineering using a chat template
def askme(question):
    sys_message = ''' 
    You are an AI Medical Assistant trained on a vast dataset of health information. Please be thorough and
    provide an informative answer. If you don't know the answer to a specific medical inquiry, advise seeking professional help.
    '''

    # Create messages structured for the chat template
    messages = [{"role": "system", "content": sys_message}, {"role": "user", "content": question}]

    # Applying chat template
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)  # Adjust max_new_tokens for longer responses

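    # Decode only the newly generated tokens, dropping the prompt and special tokens
    generated = outputs[0][inputs["input_ids"].shape[-1]:]
    answer = tokenizer.decode(generated, skip_special_tokens=True).strip()
    return answer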

# Example usage
question = '''
Symptoms:
Dizziness, headache and nausea.

What is the differential diagnosis?
'''
print(askme(question))
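
The example above sends a single question. For follow-up questions, the same message structure can carry earlier turns. The sketch below shows one way to build that list before passing it to `tokenizer.apply_chat_template`. The helper name `build_messages` and its `history` parameter are hypothetical and not part of this model card or of Transformers; only the `role`/`content` dictionary format comes from the usage code above.

```python
def build_messages(question, system_message, history=None):
    """Build a chat-template message list with optional earlier turns.

    `history` is a list of (user_text, assistant_text) pairs from previous
    exchanges; this helper is illustrative only.
    """
    messages = [{"role": "system", "content": system_message.strip()}]
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question.strip()})
    return messages


# Example: a follow-up question that depends on an earlier exchange.
messages = build_messages(
    "Which of these causes would be most urgent to rule out?",
    "You are an AI Medical Assistant trained on a vast dataset of health information.",
    history=[("Symptoms: dizziness, headache and nausea.", "Possible causes include ...")],
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can replace the `messages` variable inside `askme`, leaving the rest of the function unchanged.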

Model size: 8.03B params (Safetensors); tensor type: BF16
