---
title: KB-VQA
emoji: 🔥
colorFrom: gray
colorTo: blue
sdk: streamlit
sdk_version: 1.29.0
app_file: app.py
pinned: false
license: apache-2.0
---

# Demonstration Environment

The project demo app can be accessed from the [HF Space](https://huggingface.co/spaces/m7mdal7aj/KB-VQA), and the entire code can be accessed from [here](https://huggingface.co/spaces/m7mdal7aj/KB-VQA/tree/main).

To run the demo app locally, run `streamlit run app.py` from the root of the local code repository. This launches the whole app; however, a GPU is required to use the `Run Inference` tool.
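
Below is a minimal sketch of a local setup, assuming `git`, `pip`, and a compatible Python environment are available (the Space repository URL is the one linked above):

```bash
# Clone the Space repository (the demo app and the full code live in the same repo)
git clone https://huggingface.co/spaces/m7mdal7aj/KB-VQA
cd KB-VQA

# Install the project dependencies
pip install -r requirements.txt

# Launch the Streamlit demo app (the Run Inference tool still requires a GPU)
streamlit run app.py
```

Streamlit will print a local URL (typically `http://localhost:8501`) where the app can be opened in the browser.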

## Project File Structure

Each main Python module of the project is extensively documented to guide the reader on how to use the module and its corresponding classes and functions.

Below is the overall file structure of the project:

<pre>
KB-VQA
├── Files: Various files required for the demo, such as sample images and the dissertation report.
├── models
│   ├── deformable-detr-detic: DETIC Object Detection Model.
│   └── yolov5: YOLOv5 Object Detection Model (baseline).
├── my_model
│   ├── KBVQA.py: The central component implementing the designed model architecture for the Knowledge-Based Visual Question Answering (KB-VQA) project.
│   ├── state_manager.py: Manages the user interface and session state to facilitate the Run Inference tool of the Streamlit demo app.
│   ├── LLAMA2
│   │   └── LLAMA2_model.py: Used for loading the LLaMA-2 model to be fine-tuned.
│   ├── captioner
│   │   └── image_captioning.py: Provides functionality for generating captions for images.
│   ├── detector
│   │   └── object_detection.py: Used to detect objects in images using object detection models.
│   ├── fine_tuner
│   │   ├── fine_tuner.py: Main fine-tuning script for LLaMA-2 Chat models.
│   │   ├── fine_tuning_data_handler.py: Handles and prepares the data for fine-tuning LLaMA-2 Chat models.
│   │   └── fine_tuning_data
│   │       ├── fine_tuning_data_detic.csv: Fine-tuning data prepared by the prompt engineering module using the DETIC detector.
│   │       └── fine_tuning_data_yolov5.csv: Fine-tuning data prepared by the prompt engineering module using the YOLOv5 detector.
│   ├── results
│   │   ├── Demo_Images: Contains a pool of images used for the demo app.
│   │   ├── evaluation.py: Provides a comprehensive framework for evaluating the KB-VQA model.
│   │   ├── demo.py: Provides a comprehensive framework for visualizing and demonstrating the results of the KB-VQA evaluation.
│   │   └── evaluation_results.xlsx: Contains all the evaluation results based on the evaluation data.
│   ├── tabs
│   │   ├── home.py: Displays an introduction to the application, with a brief background and a description of the demo tools.
│   │   ├── results.py: Manages the interactive Streamlit demo for visualizing model evaluation results and analysis.
│   │   ├── run_inference.py: Responsible for the 'Run Inference' tool to test and use the fine-tuned models.
│   │   ├── model_arch.py: Displays the model architecture and the accompanying abstract and design details.
│   │   └── dataset_analysis.py: Provides tools for visualizing dataset analyses.
│   ├── utilities
│   │   ├── ui_manager.py: Manages the user interface for the Streamlit application, handling the creation and navigation of the various tabs.
│   │   └── gen_utilities.py: Provides a collection of utility functions and classes commonly used across various parts of the project.
│   └── config (all configuration files are kept separate and stored as ".py" files for easy reading; this will change after the project submission)
│       ├── kbvqa_config.py: Configuration parameters for the main KB-VQA model.
│       ├── LLAMA2_config.py: Configuration parameters for the LLaMA-2 model.
│       ├── captioning_config.py: Configuration parameters for the captioning model (InstructBLIP).
│       ├── dataset_config.py: Configuration parameters for dataset processing.
│       ├── evaluation_config.py: Configuration parameters for the KB-VQA model evaluation.
│       ├── fine_tuning_config.py: Configurable parameters for the fine-tuning module.
│       └── inference_config.py: Configurable parameters for the Run Inference tool in the demo app.
├── app.py: Main entry point for Streamlit; the first page of the Streamlit app.
├── README.md: This readme file.
└── requirements.txt: Requirements file for the whole project, including all requirements for running the demo app on the Hugging Face Space environment.
</pre>