AgriVision BLIP2

AgriVision BLIP2 is a fine-tuned multimodal Vision-Language Model developed for intelligent crop disease diagnosis from plant leaf images.

Built on top of Salesforce BLIP2 OPT 2.7B and adapted using LoRA (Low-Rank Adaptation), the model generates structured agricultural diagnosis reports containing disease identification, symptom interpretation, pathogen categorization, and agronomic recommendations.

Unlike traditional image classification models that only predict disease labels, AgriVision BLIP2 is designed to provide explainable and human-readable diagnostic outputs.


Model Overview

Objective

The primary objective of this project is to transform a general-purpose Vision-Language Model into a domain-specialized agricultural assistant capable of understanding crop diseases and generating structured diagnostic reports.

The model performs multimodal reasoning: it analyzes visual leaf patterns and, conditioned on them, generates natural-language agricultural explanations.


Key Capabilities

The model can generate:

  • Crop disease identification
  • Symptom descriptions
  • Pathogen category analysis
  • Structured agricultural diagnosis reports
  • Clinical-style disease explanations

Example output:

This is a diseased Maize leaf. Disease identified: Northern Leaf Blight. Visible symptoms include canoe-shaped lesions with gray-green margins that turn tan with dark fungal sporulation, starting on lower leaves and spreading upwards. Pathogen category: Fungal.
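
Because the report follows a fixed sentence template, downstream code can split it into fields. The following is a minimal sketch, assuming generated reports keep the exact wording and order shown in the example above; the `parse_report` helper and its field names are hypothetical and not part of the model. Reports for healthy leaves or free-form outputs will not match and return `None`.

```python
import re
from typing import Dict, Optional

# Pattern mirroring the report template seen in the example output.
# Assumption: generated reports keep this sentence order and wording.
REPORT_PATTERN = re.compile(
    r"This is a diseased (?P<crop>.+?) leaf\.\s*"
    r"Disease identified: (?P<disease>.+?)\.\s*"
    r"Visible symptoms include (?P<symptoms>.+?)\.\s*"
    r"Pathogen category: (?P<pathogen>.+?)\.?\s*$",
    re.DOTALL,
)


def parse_report(text: str) -> Optional[Dict[str, str]]:
    """Return the report fields as a dict, or None if the text does not match."""
    match = REPORT_PATTERN.match(text.strip())
    return match.groupdict() if match else None


example = (
    "This is a diseased Maize leaf. Disease identified: Northern Leaf Blight. "
    "Visible symptoms include canoe-shaped lesions with gray-green margins that "
    "turn tan with dark fungal sporulation, starting on lower leaves and "
    "spreading upwards. Pathogen category: Fungal."
)

report = parse_report(example)
print(report["crop"], "|", report["disease"], "|", report["pathogen"])
```

A `None` result is a useful signal in its own right: it indicates the output drifted from the template and may need manual inspection.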


Model Details

| Attribute | Value |
| --- | --- |
| Model Name | AgriVision BLIP2 |
| Base Model | Salesforce/blip2-opt-2.7b |
| Architecture | Vision-Language Model (VLM) |
| Fine-Tuning Method | LoRA |
| Framework | Transformers + PEFT + PyTorch |
| Domain | Agricultural Disease Diagnosis |
| Language | English |
| License | Apache 2.0 |

Dataset

LeafNet

This model was fine-tuned using the LeafNet dataset.

Dataset:

https://huggingface.co/datasets/enalis/LeafNet

The dataset contains:

  • Multiple crop species
  • Healthy and diseased leaves
  • Fungal diseases
  • Bacterial diseases
  • Pest and insect damage
  • Symptom-rich annotations

Training samples were converted into multiple instruction-style agricultural reasoning tasks.
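
The conversion of one annotated sample into several instruction-style tasks might look like the sketch below. The `LeafRecord` fields and the first three prompts are hypothetical (the actual LeafNet column names and training prompts are not documented here); the target sentences follow the output templates shown in this card.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LeafRecord:
    # Hypothetical annotation fields; the real LeafNet column names may differ.
    crop: str
    disease: str
    symptoms: str
    pathogen: str


def build_tasks(record: LeafRecord) -> List[Tuple[str, str]]:
    """Expand one annotated image into several (prompt, target) text pairs."""
    # Capitalize the first letter when the symptoms stand alone as a sentence.
    symptom_sentence = record.symptoms[:1].upper() + record.symptoms[1:] + "."
    return [
        # The first three prompts are illustrative placeholders.
        ("What disease affects this leaf?",
         f"This is a {record.crop} leaf affected by {record.disease}."),
        ("Describe the visible symptoms.", symptom_sentence),
        ("What type of pathogen is involved?",
         f"This plant shows a {record.pathogen} condition."),
        ("Analyze this crop leaf comprehensively.",
         f"This is a diseased {record.crop} leaf. "
         f"Disease identified: {record.disease}. "
         f"Visible symptoms include {record.symptoms}. "
         f"Pathogen category: {record.pathogen}."),
    ]


record = LeafRecord(
    crop="Coffee",
    disease="Phoma",
    symptoms=("dark-colored zoned patches starting at the edges "
              "with small black pycnidia on the lesions"),
    pathogen="Fungal",
)
tasks = build_tasks(record)
for prompt, target in tasks:
    print(f"{prompt} -> {target}")
```

Each pair would be combined with the same leaf image, so a single annotated sample yields several training examples.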


Training Strategy

Fine-Tuning Approach

The model was adapted using:

  • LoRA (Low-Rank Adaptation)
  • 8-bit Quantization
  • Mixed Precision Training
  • Multimodal Agricultural Instruction Tuning

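This approach enabled efficient domain adaptation: LoRA trains only small low-rank adapter matrices while the pretrained BLIP2 weights stay frozen, which helps preserve the model's general multimodal capabilities.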
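
To give a sense of why LoRA is efficient, the sketch below counts parameters for a single projection matrix. LoRA leaves the dense weight frozen and learns two small factors whose product forms the update. The layer size (2560 x 2560, assumed to match the hidden size of OPT 2.7B) and the rank of 8 are illustrative assumptions, not the settings used for this model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed sizes for illustration only.
hidden = 2560
rank = 8

frozen = full_params(hidden, hidden)
trainable = lora_trainable_params(hidden, hidden, rank)

print(f"frozen weights:    {frozen:,}")
print(f"trainable weights: {trainable:,}")
print(f"trainable share:   {trainable / frozen:.4%}")
```

Under these assumptions the adapter adds well under one percent of the layer's parameters, which is what makes fine-tuning a model of this size feasible on modest hardware.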


Training Tasks

The model was trained on multiple agricultural reasoning tasks:

Disease Identification

Example:

Input: Plant leaf image

Output:

This is a Potato leaf affected by Early blight.


Symptom Explanation

Example:

Input: Plant leaf image

Output:

Small, dark, papery flecks growing into brown-black circular lesions.


Pathogen Classification

Example:

Input: Plant leaf image

Output:

This plant shows a Fungal condition.


Comprehensive Agricultural Diagnosis

Example:

Input: Plant leaf image

Output:

This is a diseased Coffee leaf. Disease identified: Phoma. Visible symptoms include dark-colored zoned patches starting at the edges with small black pycnidia on the lesions. Pathogen category: Fungal.


Intended Use

Direct Use

The model is intended for:

  • Agricultural disease diagnosis
  • Plant pathology research
  • Agritech applications
  • Educational demonstrations
  • AI-assisted crop monitoring systems

Downstream Applications

Potential downstream use cases include:

  • Smart farming platforms
  • Mobile crop diagnosis applications
  • Precision agriculture systems
  • Agricultural advisory tools
  • Farmer support systems

Limitations

Users should be aware of the following limitations:

  • Performance depends on image quality.
  • The model may struggle with crop species not represented in the training data.
  • Predictions should not replace professional agronomic advice.
  • Generated outputs may occasionally contain hallucinated information.
  • The model should be treated as a decision-support system rather than a definitive diagnosis tool.
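
One way to act on these limitations is to route suspicious outputs to a person before they reach end users. The following is a minimal sketch of such a check, not a feature of the model: the `review_flags` helper is hypothetical, and the category vocabulary is an assumption loosely based on the dataset contents described in this card.

```python
from typing import List

# Assumed vocabulary; adjust it to the labels the deployed model actually emits.
KNOWN_PATHOGEN_CATEGORIES = {"fungal", "bacterial", "pest"}

REQUIRED_MARKERS = (
    "Disease identified:",
    "Visible symptoms include",
    "Pathogen category:",
)


def review_flags(report: str) -> List[str]:
    """Return reasons a generated report should be checked by a person."""
    flags = []
    for marker in REQUIRED_MARKERS:
        if marker not in report:
            flags.append(f"missing section: {marker}")
    if "Pathogen category:" in report:
        # Take the text after the marker and normalize it for comparison.
        category = report.split("Pathogen category:", 1)[1]
        category = category.strip().rstrip(".").lower()
        if category not in KNOWN_PATHOGEN_CATEGORIES:
            flags.append(f"unexpected pathogen category: {category}")
    return flags


good = (
    "This is a diseased Coffee leaf. Disease identified: Phoma. "
    "Visible symptoms include dark-colored zoned patches. "
    "Pathogen category: Fungal."
)
odd = "This is a diseased Coffee leaf. Pathogen category: Unknown."

print(review_flags(good))
print(review_flags(odd))
```

An empty list means the report passed these structural checks only; it says nothing about whether the diagnosis itself is correct.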

Example Usage

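import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Replace the placeholder repository ID with the actual model location.
processor = Blip2Processor.from_pretrained("YOUR_USERNAME/AgriVision-BLIP2")
model = Blip2ForConditionalGeneration.from_pretrained("YOUR_USERNAME/AgriVision-BLIP2")
model.eval()

# Convert to RGB so grayscale or RGBA files are handled consistently.
image = Image.open("leaf.jpg").convert("RGB")

inputs = processor(
    images=image,
    text="Analyze this crop leaf comprehensively.",
    return_tensors="pt"
)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=80
    )

print(
    processor.tokenizer.decode(
        outputs[0],
        skip_special_tokens=True
    ).strip()
)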
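
Since the model was adapted with LoRA through PEFT, the published weights may consist of adapter files rather than a full checkpoint. In that case, a sketch along the following lines could attach the adapter to the base model explicitly. The `load_agrivision` helper is hypothetical, and it assumes the repository passed as `adapter_id` holds PEFT adapter files; the base model ID is the one listed under Model Details.

```python
def load_agrivision(adapter_id: str, base_id: str = "Salesforce/blip2-opt-2.7b"):
    """Load the BLIP2 base model and attach LoRA adapter weights with PEFT.

    Assumption: `adapter_id` points at a repository holding PEFT adapter files.
    """
    # Imported inside the function so the heavy dependencies load only on use.
    import torch
    from peft import PeftModel
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    use_gpu = torch.cuda.is_available()
    device = "cuda" if use_gpu else "cpu"
    # Half precision on GPU to save memory, full precision on CPU.
    dtype = torch.float16 if use_gpu else torch.float32

    processor = Blip2Processor.from_pretrained(base_id)
    base = Blip2ForConditionalGeneration.from_pretrained(base_id, torch_dtype=dtype)

    # Wrap the frozen base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.to(device)
    model.eval()
    return processor, model
```

A call such as `processor, model = load_agrivision("YOUR_USERNAME/AgriVision-BLIP2")` would then return objects usable with the generation code above, provided the processed inputs are moved to the same device and dtype as the model.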

Future Work

Planned improvements include:

  • Larger agricultural datasets
  • Better confidence estimation
  • Disease-specific remedy generation
  • Multilingual support
  • Mobile deployment
  • Advanced explainability techniques
  • Hugging Face Spaces integration

Project Components

The complete AgriVision ecosystem includes:

  • Fine-tuned BLIP2 model
  • Structured diagnosis engine
  • Symptom extraction pipeline
  • Remedy recommendation layer
  • Explainability heatmaps
  • Interactive Gradio application

Acknowledgements

This work builds upon:

  • Salesforce BLIP2
  • Hugging Face Transformers
  • PEFT (Parameter-Efficient Fine-Tuning)
  • LeafNet Dataset
  • PyTorch

Author

Anhad Mahajan

Computer Science Engineering (Artificial Intelligence)

Interests:

  • Multimodal AI
  • Computer Vision
  • Generative AI
  • MLOps
  • Agricultural AI

GitHub: https://github.com/AnhadMahajan

LinkedIn: https://www.linkedin.com/in/anhadmahajan/


Citation

If you use this model in research or applications, please cite:

@misc{mahajan2026agrivision,
  title={AgriVision BLIP2: Intelligent Crop Disease Diagnosis using Vision-Language Models},
  author={Anhad Mahajan},
  year={2026},
  publisher={Hugging Face}
}