---
title: Multimodal Content Generation
emoji: 🚀
colorFrom: gray
colorTo: gray
sdk: streamlit
sdk_version: 1.32.0
app_file: multi-modal-content-generation.py
pinned: false
license: apache-2.0
short_description: A Conversational Chatbot, Image Summarizer & Text2Image.
---
## Multimodal Content Generation has the following capabilities:
## 1. A `Conversational chatbot` similar to `ChatGPT v3.5`, with `Image Summarization` capabilities, through the `GOOGLE GEMINI VISION PRO API`.
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63b42f2ab7fec0adf6514350/szXqpwRkzPm_e59dlf5_l.qt"></video>
<img width="1312" alt="Screenshot 2024-03-07 at 5 00 49 PM" src="https://github.com/jaiminjariwala/Multimodal-Content-Generation-using-LLMs/assets/157014747/ffa998b9-791d-446b-b951-2f36545ac014">
## 2. `Text to Image` (using Stability AI's Stable Diffusion) through the `REPLICATE API`.
<img width="673" alt="Screenshot 2024-03-07 at 10 58 41 AM" src="https://github.com/jaiminjariwala/Multimodal-Content-Generation-using-LLMs/assets/157014747/bbfd362e-5437-4807-b58a-09e6efde06f8">
## Setup steps:
1. Create a virtual environment
```
python -m venv <name of virtual environment>
```
2. Activate it
```
source <name of virtual environment>/bin/activate
```
3. Install the required libraries from the `requirements.txt` file
```
pip install -r requirements.txt
```
4. Create a `.env` file and add your API tokens
```
GOOGLE_API_KEY="Enter Your GOOGLE API TOKEN"
REPLICATE_API_KEY="Enter Your REPLICATE API TOKEN"
```
5. Run the app
```
streamlit run <name-of-app>.py
```
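
If the app fails at startup, a missing or blank API token is a likely cause. The sketch below is one way to check the two variables named in step 4 before launching. It is not part of the app: `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and it assumes the values from `.env` have already been loaded into the process environment (this README does not say how the app loads them).

```python
import os

# Variable names taken from the .env example in step 4.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "REPLICATE_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or blank in `env`.

    `env` defaults to the process environment; pass a dict to check other values.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API tokens:", ", ".join(missing))
    else:
        print("All API tokens are set.")
```

Passing a plain dict, for example `missing_keys({"GOOGLE_API_KEY": "abc"})`, makes the check easy to try without touching real credentials.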