
Gradio App: Image to Story Generator

This Gradio app lets you upload an image and generates a short story from it. The image is first described by an image-captioning model, the caption is expanded into a story, and the story is converted to audio with text-to-speech. You can both read the generated story and listen to it.

Demo

  • Launching the application

  • Selecting an image and uploading it

  • Downloading the audio story

https://github.com/SartajBhuvaji/Image-to-Story-Generator/assets/31826483/1fe00f34-9716-4047-9b57-a7794524816a

Features

  • Upload an image.
  • Generate a story based on the content of the image.
  • Listen to the generated story as an audio file.

Usage

  1. Clone this repository to your local machine.
git clone https://github.com/SartajBhuvaji/Image-to-Story-Generator.git

  2. Install the dependencies from the repository root.
pip install -r requirements.txt

  3. Create a .env file and add your Hugging Face and OpenAI API keys (see the dummy_env file for the expected format).

  4. Start the app.
python app.py

  5. Open your web browser and navigate to http://localhost:7860 to access the app.

  6. Upload an image and click "Generate Story." The app displays the generated story and lets you listen to it as audio.
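
A missing or empty API key in the .env file is an easy mistake to make before the first launch. The following is a minimal sketch, not part of the repository, of how the file could be checked using only the standard library. The key names `HUGGINGFACEHUB_API_TOKEN` and `OPENAI_API_KEY` and the helper functions are assumptions for illustration; use the names listed in dummy_env.

```python
from typing import Dict, Iterable, List

# Hypothetical key names; check dummy_env for the names the app actually reads.
REQUIRED_KEYS = ("HUGGINGFACEHUB_API_TOKEN", "OPENAI_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional matching quotes.
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env


def missing_keys(
    env: Dict[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

Calling `missing_keys(parse_env(open(".env").read()))` before running `python app.py` returns the list of keys that still need a value; an empty list means every assumed key is set.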

Tech

  • Hugging Face
  • Image-to-caption model
  • ChatGPT (GPT-3.5) LLM
  • Text-to-speech
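
The components above form a three-stage chain: caption the image, expand the caption into a story, then synthesize speech. The sketch below shows one way such a chain could be wired together; it is not the code in app.py. The names `image_to_story`, `build_prompt`, and `StoryResult`, the prompt wording, and the 50-word limit are all assumptions for illustration, and each stage is passed in as a plain callable so the real model calls can be swapped in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoryResult:
    caption: str
    story: str
    audio_path: str


def build_prompt(caption: str, max_words: int = 50) -> str:
    """Turn an image caption into a story-writing prompt (wording is illustrative)."""
    return (
        f"Write a short story of at most {max_words} words "
        f"based on this scene: {caption.strip()}"
    )


def image_to_story(
    image_path: str,
    caption_image: Callable[[str], str],
    write_story: Callable[[str], str],
    synthesize_speech: Callable[[str], str],
) -> StoryResult:
    """Run the three stages in order: caption -> story -> audio."""
    caption = caption_image(image_path)
    story = write_story(build_prompt(caption))
    audio_path = synthesize_speech(story)
    return StoryResult(caption, story, audio_path)


# Stand-in stages so the sketch runs without any model or API key.
result = image_to_story(
    "beach.jpg",
    caption_image=lambda path: "a sandy beach at sunset",
    write_story=lambda prompt: f"Once upon a time... ({prompt})",
    synthesize_speech=lambda story: "story.wav",
)
print(result.caption)
```

Keeping the stages as injected callables means the captioning model, the LLM, and the text-to-speech backend can each be replaced or mocked independently, which also makes the chain easy to test offline.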