---
license: mit
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- medical
- clinical
---

# Model Card for medical-falcon-7b

## Model Details

### Model Description

This model is a fine-tuned version of [TheBloke/samantha-falcon-7B-GPTQ](https://huggingface.co/TheBloke/samantha-falcon-7B-GPTQ) for text generation tasks in the medical domain.

- **Developed by:** Pradhaph
- **Model type:** Fine-tuned samantha-falcon-7B-GPTQ based model
- **Language(s) (NLP):** English
- **License:** MIT

### Model Sources

- **Repository:** [👉Click here👈](https://huggingface.co/pradhaph/medical-falcon-7b)
- **Demo:** Available soon

## Uses

### Direct Use

This model can be used for text generation tasks in the medical domain, such as generating medical reports and answering medical queries.

### Downstream Use

This model can be fine-tuned for specific medical text generation tasks or integrated into larger healthcare systems.

### Out-of-Scope Use

This model may not perform well on tasks outside the medical domain.

## Bias, Risks, and Limitations

Running this model requires more than 7 GB of GPU VRAM and 12 GB of CPU RAM.

## How to Get Started with the Model

```python
# Install dependencies (in a notebook cell; drop the leading "!" in a shell):
# !pip install transformers==4.31.0 sentence_transformers==2.2.2

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Load the model
loaded_model_path = r"path_to_downloaded_model"
model = AutoModelForCausalLM.from_pretrained(loaded_model_path)

# 2. Initialize the tokenizer
tokenizer = AutoTokenizer.from_pretrained(loaded_model_path)

# 3. Prepare the input
context = "The context you want to provide to the model."
question = "The question you want to ask the model."
input_text = f"{context}\nQuestion: {question}\n"

# 4. Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt")

# 5. Run inference
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=512,           # Adjust max_length as needed
        do_sample=True,           # Enable sampling so temperature/top_p take effect
        temperature=0.7,          # Adjust temperature for randomness in sampling
        top_p=0.9,                # Adjust top_p for nucleus sampling
        num_return_sequences=1,   # Number of sequences to generate
    )

# 6. Decode and print the output
generated_texts = [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]
print("Generated Texts:")
for text in generated_texts:
    print(text)
```
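
Given the VRAM and RAM requirements noted above, loading the weights in half precision with automatic device placement may help fit the model on a single GPU. The following is a minimal sketch, not part of the original instructions: it assumes the `accelerate` package is installed (needed for `device_map="auto"`) and that the checkpoint at `pradhaph/medical-falcon-7b` loads with the standard `transformers` API; older `transformers` releases may additionally require `trust_remote_code=True` for Falcon-based architectures.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pradhaph/medical-falcon-7b"  # or a local download path

tokenizer = AutoTokenizer.from_pretrained(model_id)

# float16 roughly halves memory use versus float32; device_map="auto"
# (requires `pip install accelerate`) spreads layers across the available GPU/CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Hypothetical prompt for illustration only
prompt = "Patient presents with a persistent cough and fever.\nQuestion: Which initial investigations are appropriate?\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,   # bounds only the generated continuation
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Unlike `max_length` in the snippet above, `max_new_tokens` limits only the generated continuation rather than the combined prompt-plus-output length.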