Model Card for PaliGemma Dermatology Model
Model Details
Model Description
This model, based on the PaliGemma-3B architecture, has been fine-tuned for dermatology-related image and text processing tasks. The model is designed to assist in the identification of various skin conditions using a combination of image analysis and natural language processing.
- Developed by: Bruce_Wayne
- Model type: Vision-language model
- Finetuned from model: https://huggingface.co/google/paligemma-3b-pt-224
- LoRA adapters used: Yes
- Intended use: Medical image analysis, specifically for dermatology
Please let me know how the model works for you via this form: https://forms.gle/cBA6apSevTyiEbp46. Thank you!
Uses
Direct Use
The model can be used directly to analyze dermatology images and suggest potential skin conditions; see the quickstart example under "How to Get Started with the Model" below.
Bias, Risks, and Limitations
- Skin tone bias: The model may have been trained on a dataset that does not adequately represent all skin tones, potentially leading to biased results.
- Geographic bias: The model's performance may vary depending on the prevalence of certain conditions in different geographic regions.
How to Get Started with the Model
```python
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image

# Load the fine-tuned model and processor
model_id = "brucewayne0459/paligemma_derm"
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to(device)
model.eval()

# Load a sample image and text prompt
input_text = "Identify the skin condition?"
input_image_path = "path/to/your/image.jpg"  # Replace with your actual image path
input_image = Image.open(input_image_path).convert("RGB")

# Process the input
inputs = processor(text=input_text, images=input_image, return_tensors="pt", padding="longest").to(device)

# Run inference
max_new_tokens = 50  # Maximum number of tokens to generate
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

# Decode the output
decoded_output = processor.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
```
Training Details
Training Data
The model was fine-tuned on a dataset of dermatological images paired with their disease names.
Training Procedure
The model was fine-tuned using LoRA (Low-Rank Adaptation) for parameter-efficient training, with mixed precision (bfloat16) to speed up training and reduce memory usage; a configuration sketch follows the hyperparameters below.
Training Hyperparameters
- Training regime: Mixed precision (bfloat16)
- Epochs: 10
- Learning rate: 2e-5
- Batch size: 6
- Gradient accumulation steps: 4
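The fine-tuning script itself is not published with this card. The sketch below shows one plausible way to reproduce the setup with the Hugging Face `transformers` and `peft` libraries: the epochs, learning rate, batch size, gradient accumulation, and bfloat16 settings mirror the hyperparameters listed above, while the LoRA rank, alpha, and target modules are illustrative assumptions not stated in the card.

```python
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_model_id = "google/paligemma-3b-pt-224"
processor = AutoProcessor.from_pretrained(base_model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id, torch_dtype=torch.bfloat16)

# Hypothetical LoRA settings: rank, alpha, and target modules are placeholders,
# not values reported in this model card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Hyperparameters below match the values listed in this card.
training_args = TrainingArguments(
    output_dir="paligemma_derm_lora",
    num_train_epochs=10,
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,
    bf16=True,
)

# The training dataset and data collator are not released with this card,
# so they are left as placeholders here:
# trainer = Trainer(model=model, args=training_args, train_dataset=..., data_collator=...)
# trainer.train()
```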
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was evaluated on a separate validation set of dermatological images and disease names, distinct from the training data.
Metrics
- Validation Loss: The loss was tracked throughout the training process to evaluate model performance.
- Accuracy: The primary metric for assessing model predictions.
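The card does not describe how accuracy was computed. As a minimal sketch, assuming exact-match comparison between the generated text and the reference disease name (and the same prompt as in the quickstart above), evaluation could look like the following; `val_samples` is a hypothetical list of (image path, disease name) pairs.

```python
import torch
from PIL import Image

def exact_match_accuracy(model, processor, val_samples, device):
    """Compute exact-match accuracy over (image_path, disease_name) pairs."""
    model.eval()
    correct = 0
    for image_path, disease_name in val_samples:
        image = Image.open(image_path).convert("RGB")
        inputs = processor(
            text="Identify the skin condition?", images=image, return_tensors="pt"
        ).to(device)
        with torch.no_grad():
            output = model.generate(**inputs, max_new_tokens=50)
        # Decode only the newly generated tokens, skipping the prompt portion
        prediction = processor.decode(
            output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
        ).strip()
        correct += int(prediction.lower() == disease_name.lower())
    return correct / len(val_samples)
```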
Results
The model achieved a final validation loss of approximately 0.2214, indicating reasonable performance in predicting skin conditions based on the dataset used.
Environmental Impact
- Hardware Type: 1 x L4 GPU
- Hours used: ~22 hours
- Cloud Provider: Lightning AI
- Compute Region: USA
- Carbon Emitted: 0.9 kg CO2 eq.
Technical Specifications
Model Architecture and Objective
- Architecture: Vision-Language model based on PaliGemma-3B
- Objective: To classify and diagnose dermatological conditions from images and text
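As a quick sanity check on the architecture, the base checkpoint's configuration exposes the vision encoder and Gemma decoder settings separately. The snippet below uses the base model id referenced in this card and standard `transformers` config fields.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/paligemma-3b-pt-224")
print(config.vision_config)  # SigLIP-style vision encoder settings (224x224 input)
print(config.text_config)    # Gemma language decoder settings
```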
Compute Infrastructure
Hardware
- GPU: 1 x L4 GPU
Model Card Authors
Bruce_Wayne