Rishi1708 commited on
Commit
f8629fc
·
verified ·
1 Parent(s): c6478f5

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +37 -19
README.md CHANGED
@@ -39,25 +39,41 @@ The model was fine-tuned using the Hugging Face Transformers library, leveraging
39
  To use this model for generating captions for medical images, you can load it using the Hugging Face Transformers library. Below is an example of how to load the model and generate a caption for an image:
40
 
41
  ```python
42
- from transformers import AutoModel, AutoTokenizer
43
- from PIL import Image
44
  import torch
45
-
46
- # Load the model and tokenizer
47
- model_name = "Rishi1708/LLaMA3.2-11B-VisionInstruct-MedicalXray-ScanAnalysis-v1.0"
48
- model = AutoModel.from_pretrained(model_name)
49
- tokenizer = AutoTokenizer.from_pretrained(model_name)
50
-
51
- # Load and preprocess the image
52
- image = Image.open("path/to/medical_image.jpg")
53
- # Preprocess the image according to the model's requirements
54
-
55
- # Generate caption
56
- # Note: The exact input format may vary; refer to the model's documentation
57
- inputs = tokenizer(image, return_tensors="pt")
58
- outputs = model.generate(**inputs)
59
- caption = tokenizer.decode(outputs[0], skip_special_tokens=True)
60
- print(caption)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
61
  ```
62
 
63
  **Note:** The exact method to prepare inputs and generate outputs may depend on the specific model architecture. Please refer to the base model's documentation for detailed usage instructions.
@@ -66,10 +82,12 @@ print(caption)
66
  - `transformers`
67
  - `torch`
68
  - `Pillow` (for image handling)
 
 
69
 
70
  Install these using:
71
  ```bash
72
- pip install transformers torch Pillow
73
  ```
74
 
75
  ## Evaluation
 
39
  To use this model for generating captions for medical images, you can load it using the Hugging Face Transformers library. Below is an example of how to load the model and generate a caption for an image:
40
 
41
  ```python
42
+ from transformers import AutoProcessor, AutoModelForVision2Seq, BitsAndBytesConfig
 
43
  import torch
44
+ from PIL import Image
45
+ # Configure 4-bit quantization to reduce memory usage
46
+ quantization_config = BitsAndBytesConfig(load_in_4bit=True)
47
+ # Load the processor and model from Hugging Face
48
+ processor = AutoProcessor.from_pretrained("Rishi1708/LLaMA3.2-11B-VisionInstruct-MedicalXray-ScanAnalysis-v1.0")
49
+ model = AutoModelForVision2Seq.from_pretrained(
50
+ "Rishi1708/LLaMA3.2-11B-VisionInstruct-MedicalXray-ScanAnalysis-v1.0",
51
+ quantization_config=quantization_config,
52
+ device_map="auto"
53
+ )
54
+ # Prepare the input
55
+ # Load an X-ray image from a local path (replace with your image path)
56
+ image_path = "xray.jpg"
57
+ image = Image.open(image_path).convert("RGB")
58
+ # Define the text prompt
59
+ prompt = "Analyze this X-ray image and describe any abnormalities."
60
+ # Process the inputs (text and image) into a format the model expects
61
+ inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda")
62
+ # Generate the output
63
+ with torch.no_grad():
64
+ outputs = model.generate(
65
+ input_ids=inputs["input_ids"],
66
+ pixel_values=inputs["pixel_values"],
67
+ attention_mask=inputs["attention_mask"],
68
+ aspect_ratio_ids=inputs["aspect_ratio_ids"],
69
+ aspect_ratio_mask=inputs["aspect_ratio_mask"],
70
+ max_new_tokens=100,
71
+ do_sample=False # Set to True for sampling-based generation if needed
72
+ )
73
+ # Decode the generated output into readable text
74
+ generated_text = processor.decode(outputs[0], skip_special_tokens=True)
75
+ # Print the result
76
+ print("Model Output:", generated_text)
77
  ```
78
 
79
  **Note:** The exact method to prepare inputs and generate outputs may depend on the specific model architecture. Please refer to the base model's documentation for detailed usage instructions.
 
82
  - `transformers`
83
  - `torch`
84
  - `Pillow` (for image handling)
85
+ - `bitsandbytes`
86
+ - `accelerate`
87
 
88
  Install these using:
89
  ```bash
90
+ pip install transformers torch Pillow bitsandbytes accelerate
91
  ```
92
 
93
  ## Evaluation