AgriVision BLIP2
AgriVision BLIP2 is a fine-tuned multimodal Vision-Language Model developed for intelligent crop disease diagnosis from plant leaf images.
Built on top of Salesforce BLIP2 OPT 2.7B and adapted using LoRA (Low-Rank Adaptation), the model generates structured agricultural diagnosis reports containing disease identification, symptom interpretation, pathogen categorization, and agronomic recommendations.
Unlike traditional image classification models that only predict disease labels, AgriVision BLIP2 is designed to provide explainable and human-readable diagnostic outputs.
Model Overview
Objective
The primary objective of this project is to transform a general-purpose Vision-Language Model into a domain-specialized agricultural assistant capable of understanding crop diseases and generating structured diagnostic reports.
The model takes a leaf image as input, analyzes its visual patterns, and generates a natural-language agricultural explanation.
Key Capabilities
The model can generate:
- Crop disease identification
- Symptom descriptions
- Pathogen category analysis
- Structured agricultural diagnosis reports
- Clinical-style disease explanations
Example output:
This is a diseased Maize leaf. Disease identified: Northern Leaf Blight. Visible symptoms include canoe-shaped lesions with gray-green margins that turn tan with dark fungal sporulation, starting on lower leaves and spreading upwards. Pathogen category: Fungal.
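Because the example report follows a fixed sentence pattern, it can be split back into fields for downstream use. The following is a minimal sketch, assuming the model's output matches the example above word for word in its framing phrases; the pattern and the `parse_report` helper are illustrative and are not part of the released model.

```python
import re

# Pattern inferred from the example report above (an assumption);
# outputs phrased differently will not match and return None.
REPORT_PATTERN = re.compile(
    r"This is a diseased (?P<crop>.+?) leaf\. "
    r"Disease identified: (?P<disease>.+?)\. "
    r"Visible symptoms include (?P<symptoms>.+?)\. "
    r"Pathogen category: (?P<pathogen>.+?)\.?$"
)


def parse_report(text):
    """Return the report fields as a dict, or None if the text does not match."""
    match = REPORT_PATTERN.match(text.strip())
    return match.groupdict() if match else None


example = (
    "This is a diseased Maize leaf. Disease identified: Northern Leaf Blight. "
    "Visible symptoms include canoe-shaped lesions with gray-green margins that "
    "turn tan with dark fungal sporulation, starting on lower leaves and "
    "spreading upwards. Pathogen category: Fungal."
)
fields = parse_report(example)
print(fields["crop"], "|", fields["disease"], "|", fields["pathogen"])
```

Returning `None` for non-matching text lets a caller route unexpected outputs to manual review instead of silently accepting a partial parse.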
Model Details
| Attribute | Value |
|---|---|
| Model Name | AgriVision BLIP2 |
| Base Model | Salesforce/blip2-opt-2.7b |
| Architecture | Vision-Language Model (VLM) |
| Fine-Tuning Method | LoRA |
| Framework | Transformers + PEFT + PyTorch |
| Domain | Agricultural Disease Diagnosis |
| Language | English |
| License | Apache 2.0 |
Dataset
LeafNet
This model was fine-tuned using the LeafNet dataset.
Dataset: https://huggingface.co/datasets/enalis/LeafNet
The dataset contains:
- Multiple crop species
- Healthy and diseased leaves
- Fungal diseases
- Bacterial diseases
- Pest and insect damage
- Symptom-rich annotations
Training samples were converted into multiple instruction-style agricultural reasoning tasks.
Training Strategy
Fine-Tuning Approach
The model was adapted using:
- LoRA (Low-Rank Adaptation)
- 8-bit Quantization
- Mixed Precision Training
- Multimodal Agricultural Instruction Tuning
This approach enabled efficient domain adaptation while preserving the general multimodal capabilities of BLIP2.
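To see why LoRA makes adaptation cheap: it keeps a dense weight matrix frozen and trains only two low-rank factors, so the trainable parameter count for one adapted layer is rank × (d_in + d_out) rather than d_in × d_out. The sketch below works through that arithmetic. The layer size (2560 × 2560) and rank (16) are assumed values for illustration; the model card does not state the rank or which modules were adapted.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in, d_out):
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_in * d_out


# Assumed values for illustration only: a square 2560 x 2560 projection, rank 16.
# The actual rank and target modules used for AgriVision BLIP2 are not documented.
d_model, rank = 2560, 16
lora = lora_trainable_params(d_model, d_model, rank)
dense = full_params(d_model, d_model)
print(f"LoRA: {lora:,} trainable vs {dense:,} frozen ({lora / dense:.2%})")
```

Under these assumptions, the adapter adds about 1.25% of the layer's parameters as trainable weights.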
Training Tasks
The model was trained on multiple agricultural reasoning tasks:
Disease Identification
Example:
Input: Plant leaf image
Output:
This is a Potato leaf affected by Early blight.
Symptom Explanation
Example:
Input: Plant leaf image
Output:
Small, dark, papery flecks growing into brown-black circular lesions.
Pathogen Classification
Example:
Input: Plant leaf image
Output:
This plant shows a Fungal condition.
Comprehensive Agricultural Diagnosis
Example:
Input: Plant leaf image
Output:
This is a diseased Coffee leaf. Disease identified: Phoma. Visible symptoms include dark-colored zoned patches starting at the edges with small black pycnidia on the lesions. Pathogen category: Fungal.
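The four example outputs above share a small set of sentence templates, so one annotated record can plausibly yield one target text per task. The following is a rough sketch of that conversion, not the author's actual preprocessing code. The record keys (`crop`, `disease`, `symptoms`, `pathogen`) are hypothetical and do not reflect the real LeafNet schema, and healthy leaves are not handled.

```python
def build_task_targets(record):
    """Turn one annotated record into target texts for the four training tasks.

    The keys 'crop', 'disease', 'symptoms', and 'pathogen' are hypothetical
    field names, not the actual LeafNet schema.
    """
    crop, disease = record["crop"], record["disease"]
    symptoms, pathogen = record["symptoms"], record["pathogen"]
    return {
        "disease_identification": f"This is a {crop} leaf affected by {disease}.",
        # Capitalize the first letter so the symptom text reads as a sentence.
        "symptom_explanation": f"{symptoms[0].upper()}{symptoms[1:]}.",
        "pathogen_classification": f"This plant shows a {pathogen} condition.",
        "comprehensive_diagnosis": (
            f"This is a diseased {crop} leaf. Disease identified: {disease}. "
            f"Visible symptoms include {symptoms}. Pathogen category: {pathogen}."
        ),
    }


# Record values taken from the Coffee example above.
record = {
    "crop": "Coffee",
    "disease": "Phoma",
    "symptoms": (
        "dark-colored zoned patches starting at the edges "
        "with small black pycnidia on the lesions"
    ),
    "pathogen": "Fungal",
}
targets = build_task_targets(record)
for task, text in targets.items():
    print(f"{task}: {text}")
```

Each target would be paired with the same leaf image and a task-specific prompt during fine-tuning; the prompts themselves are not listed in this card.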
Intended Use
Direct Use
The model is intended for:
- Agricultural disease diagnosis
- Plant pathology research
- Agritech applications
- Educational demonstrations
- AI-assisted crop monitoring systems
Downstream Applications
Potential downstream use cases include:
- Smart farming platforms
- Mobile crop diagnosis applications
- Precision agriculture systems
- Agricultural advisory tools
- Farmer support systems
Limitations
Users should be aware of the following limitations:
- Performance depends on image quality.
- The model may struggle with crop species not represented in the training data.
- Predictions should not replace professional agronomic advice.
- Generated outputs may occasionally contain hallucinated information.
- The model should be treated as a decision-support system rather than a definitive diagnosis tool.
Example Usage
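from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Repository ID as listed on this model page
model_id = "AnhadMahajan/agrivision-blip2-model"
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(model_id)

# Convert to RGB so grayscale or RGBA files are handled consistently
image = Image.open("leaf.jpg").convert("RGB")

# Prepare the image and the text prompt as PyTorch tensors
inputs = processor(
    images=image,
    text="Analyze this crop leaf comprehensively.",
    return_tensors="pt",
)

# Generate up to 80 new tokens for the diagnosis report
outputs = model.generate(**inputs, max_new_tokens=80)

# Decode the first (and only) sequence back to text
print(processor.tokenizer.decode(outputs[0], skip_special_tokens=True))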
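Depending on the library version, the decoded text may begin with a copy of the prompt before the generated report. This card does not say whether that happens for this model, so the helper below is only a defensive sketch. `strip_prompt` is a hypothetical name and is not part of Transformers.

```python
def strip_prompt(decoded, prompt):
    """Remove a leading copy of the prompt from decoded text, if present."""
    text = decoded.strip()
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


prompt = "Analyze this crop leaf comprehensively."
# Simulated decoder output that echoes the prompt (illustrative string only).
decoded = prompt + " This is a diseased Maize leaf."
print(strip_prompt(decoded, prompt))
```

If the decoded text does not start with the prompt, the function returns it unchanged apart from trimmed whitespace.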
Future Work
Planned improvements include:
- Larger agricultural datasets
- Better confidence estimation
- Disease-specific remedy generation
- Multilingual support
- Mobile deployment
- Advanced explainability techniques
- Hugging Face Spaces integration
Project Components
The complete AgriVision ecosystem includes:
- Fine-tuned BLIP2 model
- Structured diagnosis engine
- Symptom extraction pipeline
- Remedy recommendation layer
- Explainability heatmaps
- Interactive Gradio application
Acknowledgements
This work builds upon:
- Salesforce BLIP2
- Hugging Face Transformers
- PEFT (Parameter-Efficient Fine-Tuning)
- LeafNet Dataset
- PyTorch
Author
Anhad Mahajan
Computer Science Engineering (Artificial Intelligence)
Interests:
- Multimodal AI
- Computer Vision
- Generative AI
- MLOps
- Agricultural AI
GitHub: https://github.com/AnhadMahajan
LinkedIn: https://www.linkedin.com/in/anhadmahajan/
Citation
If you use this model in research or applications, please cite:
@misc{mahajan2026agrivision,
  title={AgriVision BLIP2: Intelligent Crop Disease Diagnosis using Vision-Language Models},
  author={Anhad Mahajan},
  year={2026},
  publisher={Hugging Face}
}