---
title: KB-VQA
emoji: 🔥
colorFrom: gray
colorTo: blue
sdk: streamlit
sdk_version: 1.29.0
app_file: app.py
pinned: false
license: apache-2.0
---
Project File Structure
KB-VQA
├── Files: Various files required for the demo, such as sample images, the dissertation report, etc.
├── models
│   ├── deformable-detr-detic: DETIC Object Detection Model.
│   └── yolov5: YOLOv5 Object Detection Model (baseline).
├── my_model
│   ├── KBVQA.py: This module is the central component for implementing the designed model architecture for the Knowledge-Based Visual Question Answering (KB-VQA) project.
│   ├── state_manager.py: Manages the user interface and session state to facilitate the Run Inference tool of the Streamlit demo app.
│   ├── LLAMA2
│   │   └── LLAMA2_model.py: Used for loading the LLaMA-2 model to be fine-tuned.
│   ├── captioner
│   │   └── image_captioning.py: Provides functionality for generating captions for images.
│   ├── detector
│   │   └── object_detection.py: Used to detect objects in images using object detection models.
│   ├── fine_tuner
│   │   ├── fine_tuner.py: Main fine-tuning script for LLaMA-2 Chat models.
│   │   ├── fine_tuning_data_handler.py: Handles and prepares the data for fine-tuning LLaMA-2 Chat models.
│   │   └── fine_tuning_data
│   │       ├── fine_tuning_data_detic.csv: Fine-tuning data prepared by the prompt engineering module using the DETIC detector.
│   │       └── fine_tuning_data_yolov5.csv: Fine-tuning data prepared by the prompt engineering module using the YOLOv5 detector.
│   ├── results
│   │   ├── Demo_Images: Contains a pool of images used for the demo app.
│   │   ├── evaluation.py: Provides a comprehensive framework for evaluating the KB-VQA model.
│   │   ├── demo.py: Provides a comprehensive framework for visualizing and demonstrating the results of the KB-VQA evaluation.
│   │   └── evaluation_results.xlsx: This file contains all the evaluation results based on the evaluation data.
│   ├── tabs
│   │   ├── home.py: Displays an introduction to the application with a brief background and a description of the demo tools.
│   │   ├── results.py: Manages the interactive Streamlit demo for visualizing model evaluation results and analysis.
│   │   ├── run_inference.py: Responsible for the Run Inference tool used to test and use the fine-tuned models.
│   │   ├── model_arch.py: Displays the model architecture along with the accompanying abstract and design details.
│   │   └── dataset_analysis.py: Provides tools for visualizing dataset analyses.
│   ├── utilities
│   │   ├── ui_manager.py: Manages the user interface for the Streamlit application, handling the creation and navigation of the various tabs.
│   │   └── gen_utilities.py: Provides a collection of utility functions and classes commonly used across various parts of the project.
│   └── config (All configuration files are kept separate and stored as plain ".py" modules for easy reading; this will change after the project submission. See the config sketch below the tree.)
│       ├── kbvqa_config.py: Configuration parameters for the main KB-VQA model.
│       ├── LLAMA2_config.py: Configuration parameters for the LLaMA-2 model.
│       ├── captioning_config.py: Configuration parameters for the captioning model (InstructBLIP).
│       ├── dataset_config.py: Configuration parameters for the dataset processing.
│       ├── evaluation_config.py: Configuration parameters for the KB-VQA model evaluation.
│       ├── fine_tuning_config.py: Configurable parameters for the fine-tuning module.
│       └── inference_config.py: Configurable parameters for the Run Inference tool in the demo app.
├── app.py: Main entry point for the Streamlit demo; the first page loaded in the Streamlit app (see the entry-point sketch below the tree).
├── README.md: This readme file.
└── requirements.txt: Requirements file for the whole project, including everything needed to run the demo app in the Hugging Face Space environment.
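
For orientation, the snippet below is a minimal sketch of how `app.py` could wire the Streamlit entry point to the tab system managed by `my_model/utilities/ui_manager.py`. The class name `UIManager`, its `display_ui()` method, and the exact import path are assumptions made for illustration only; they are not the project's confirmed API.

```python
# Hypothetical sketch of app.py (names below are assumed, not the actual API):
# the entry point configures the Streamlit page once, then delegates tab
# creation and navigation to the UI manager in my_model/utilities/ui_manager.py.
import streamlit as st

from my_model.utilities.ui_manager import UIManager  # assumed class name and path


def main() -> None:
    # One-time page setup, then hand control to the UI manager, which renders
    # the Home / Run Inference / Results / Model Architecture / Dataset Analysis tabs.
    st.set_page_config(page_title="KB-VQA Demo", layout="wide")
    UIManager().display_ui()


if __name__ == "__main__":
    main()
```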
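
The note on the `config` folder above says the configuration files are kept as plain `.py` modules. The sketch below illustrates what that convention typically looks like; the parameter names and values are illustrative assumptions, not the project's actual settings.

```python
# Illustrative contents of a config module such as my_model/config/kbvqa_config.py.
# Because settings live in ordinary Python files, no parsing step is needed and
# the values are easy to read in any editor. All names and values here are assumed.
MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"  # base LLM (assumed identifier)
MAX_NEW_TOKENS = 200                          # generation length cap (assumed)
DETECTION_BACKEND = "yolov5"                  # "yolov5" or "deformable-detr-detic" (assumed)

# A consumer module would then import the settings directly, e.g.:
#   from my_model.config import kbvqa_config
#   model = load_model(kbvqa_config.MODEL_NAME)
```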