sethuiyer committed
Commit e6ccaa3
1 Parent(s): 5094eea

Code sample added

Files changed (1)
  1. README.md +41 -16
README.md CHANGED
@@ -7,24 +7,12 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
-# merge
-
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
-## Merge Details
-### Merge Method
-
-This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [Locutusque/llama-3-neural-chat-v1-8b](https://huggingface.co/Locutusque/llama-3-neural-chat-v1-8b) as a base.
 
-### Models Merged
-
-The following models were included in the merge:
-* [Undi95/Llama-3-Unholy-8B](https://huggingface.co/Undi95/Llama-3-Unholy-8B)
-* [ruslanmv/Medical-Llama3-8B-16bit](https://huggingface.co/ruslanmv/Medical-Llama3-8B-16bit)
-
-### Configuration
 
 The following YAML configuration was used to produce this model:
 
@@ -47,3 +35,40 @@ parameters:
 dtype: bfloat16
 
 ```
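The DARE step mentioned in the deleted description above works on "task vectors" (fine-tuned weights minus base weights): each delta entry is dropped at random with probability p, and the survivors are rescaled by 1/(1-p) so the sparsified delta is unchanged in expectation. A minimal pure-Python sketch of that idea — illustrative only, not mergekit's implementation, and all values below are made up:

```python
import random

def dare_drop(delta, p, rng):
    """Drop each delta entry with probability p; rescale survivors by 1/(1-p)
    so the expected value of the sparsified delta matches the original."""
    scale = 1.0 / (1.0 - p)
    return [d * scale if rng.random() >= p else 0.0 for d in delta]

# Toy task vector: fine-tuned weights minus base weights (made-up numbers)
base      = [0.10, -0.20, 0.30, 0.40]
finetuned = [0.15, -0.10, 0.30, 0.80]
delta = [f - b for b, f in zip(base, finetuned)]

rng = random.Random(0)
sparse = dare_drop(delta, p=0.5, rng=rng)

# Add the sparsified delta back onto the base to get the merged weights
merged = [b + d for b, d in zip(base, sparse)]
print(merged)
```

In the real merge, TIES-style sign consensus is then applied across the sparsified deltas of all models before they are summed onto the base.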
 tags:
 - mergekit
 - merge
+license: llama2
+language:
+- en
 ---
 
+### Medichat-Llama3-8B
 
 The following YAML configuration was used to produce this model:
 
@@ -47,3 +35,40 @@ parameters:
 dtype: bfloat16
 
 ```
+
+### Usage:
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# Load tokenizer and model
+tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Medichat-Llama3-8B")
+model = AutoModelForCausalLM.from_pretrained("sethuiyer/Medichat-Llama3-8B").to("cuda")
+
+# Format the question with the chat template and generate a response
+def askme(question):
+    sys_message = '''
+    You are an AI Medical Assistant trained on a vast dataset of health information. Please be thorough and
+    provide an informative answer. If you don't know the answer to a specific medical inquiry, advise seeking professional help.
+    '''
+
+    # Create messages structured for the chat template
+    messages = [{"role": "system", "content": sys_message}, {"role": "user", "content": question}]
+
+    # Apply the chat template
+    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)  # Adjust max_new_tokens for longer responses
+
+    # Extract and return the generated text
+    answer = tokenizer.batch_decode(outputs)[0].strip()
+    return answer
+
+# Example usage
+question = '''
+Symptoms:
+Dizziness, headache and nausea.
+
+What is the differential diagnosis?
+'''
+print(askme(question))
+```
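One caveat with the usage snippet above: `model.generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs` wholesale includes the question and system prompt in `answer`. Slicing off the prompt length (`outputs[0][inputs["input_ids"].shape[-1]:]`) before decoding keeps only the reply. The indexing is the same as for plain lists; a toy stand-in with made-up token ids, so it runs without the model:

```python
# Made-up token ids standing in for real tensors
prompt_ids = [101, 3209, 4025]               # tokenizer(prompt) -> input ids
generated  = prompt_ids + [5804, 119, 102]   # generate() returns prompt + continuation

# Keep only the tokens produced after the prompt, mirroring
# outputs[0][inputs["input_ids"].shape[-1]:] in the transformers snippet
new_ids = generated[len(prompt_ids):]
print(new_ids)  # → [5804, 119, 102]
```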