Update README.md
README.md
CHANGED

To use this model for generating captions for medical images, you can load it using the Hugging Face Transformers library. Below is an example of how to load the model and generate a caption for an image:

```python
from transformers import AutoProcessor, AutoModelForVision2Seq, BitsAndBytesConfig
import torch
from PIL import Image
# Configure 4-bit quantization to reduce memory usage
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
# Load the processor and model from Hugging Face
processor = AutoProcessor.from_pretrained("Rishi1708/LLaMA3.2-11B-VisionInstruct-MedicalXray-ScanAnalysis-v1.0")
model = AutoModelForVision2Seq.from_pretrained(
    "Rishi1708/LLaMA3.2-11B-VisionInstruct-MedicalXray-ScanAnalysis-v1.0",
    quantization_config=quantization_config,
    device_map="auto"
)
# Prepare the input
# Load an X-ray image from a local path (replace with your image path)
image_path = "xray.jpg"
image = Image.open(image_path).convert("RGB")
# Define the text prompt
prompt = "Analyze this X-ray image and describe any abnormalities."
# Process the inputs (text and image) into a format the model expects
inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda")
# Generate the output
with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        attention_mask=inputs["attention_mask"],
        aspect_ratio_ids=inputs["aspect_ratio_ids"],
        aspect_ratio_mask=inputs["aspect_ratio_mask"],
        max_new_tokens=100,
        do_sample=False  # Set to True for sampling-based generation if needed
    )
# Decode the generated output into readable text
generated_text = processor.decode(outputs[0], skip_special_tokens=True)
# Print the result
print("Model Output:", generated_text)
```
**Note:** The exact method to prepare inputs and generate outputs may depend on the specific model architecture. Please refer to the base model's documentation for detailed usage instructions.
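
In particular, Llama 3.2 Vision (Mllama) processors typically expect an `<|image|>` placeholder in the prompt; building the prompt through the processor's chat template inserts it automatically. The following is a minimal sketch of that alternative input preparation, assuming this checkpoint ships the standard Mllama processor and chat template (not verified here); it reuses the `processor`, `model`, and `image` objects from the example above.

```python
# Alternative input preparation via the chat template (sketch).
# Assumes a standard Llama 3.2 Vision / Mllama processor; adjust if this checkpoint differs.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Analyze this X-ray image and describe any abnormalities."},
        ],
    }
]
# The template adds the <|image|> placeholder and the generation prompt for us
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(images=image, text=input_text, add_special_tokens=False, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(outputs[0], skip_special_tokens=True))
```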

- `transformers`
- `torch`
- `Pillow` (for image handling)
- `bitsandbytes`
- `accelerate`

Install these using:
```bash
pip install transformers torch Pillow bitsandbytes accelerate
```
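
As a quick sanity check of the environment, the short sketch below prints the installed package versions and whether a CUDA GPU is visible (4-bit loading with `bitsandbytes` generally expects one):

```python
# Quick environment check: package versions and GPU visibility
import importlib
import torch

for pkg in ("transformers", "torch", "PIL", "bitsandbytes", "accelerate"):
    print(pkg, importlib.import_module(pkg).__version__)
print("CUDA available:", torch.cuda.is_available())
```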

## Evaluation