---
title: Multimodal Content Generation
emoji: 🚀
colorFrom: gray
colorTo: gray
sdk: streamlit
sdk_version: 1.32.0
app_file: multi-modal-content-generation.py
pinned: false
license: apache-2.0
short_description: A Conversational Chatbot, Image Summarizer & Text2Image.
---
## Multimodal Content Generation has the following capabilities:

## 1. A `Conversational chatbot` similar to `ChatGPT v3.5`, with `Image Summarization` capabilities, through the `GOOGLE GEMINI VISION PRO API`.

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63b42f2ab7fec0adf6514350/szXqpwRkzPm_e59dlf5_l.qt"></video>

<img width="1312" alt="Screenshot 2024-03-07 at 5 00 49 PM" src="https://github.com/jaiminjariwala/Multimodal-Content-Generation-using-LLMs/assets/157014747/ffa998b9-791d-446b-b951-2f36545ac014">

## 2. `Text to Image` (using Stability AI's Stable Diffusion) through the `REPLICATE API`.
<img width="673" alt="Screenshot 2024-03-07 at 10 58 41 AM" src="https://github.com/jaiminjariwala/Multimodal-Content-Generation-using-LLMs/assets/157014747/bbfd362e-5437-4807-b58a-09e6efde06f8">


## Setup steps:
1. Create a virtual environment
    ```
    python -m venv <name of virtual environment>
    ```

2. Activate it
    ```
    source <name of virtual environment>/bin/activate
    ```

3. Install the required libraries from the requirements.txt file
    ```
    pip install -r requirements.txt
    ```
4. Create a .env file and add your API tokens
   ```
   GOOGLE_API_KEY="Enter Your GOOGLE API TOKEN"
   REPLICATE_API_KEY=""
   ```
5. Run the app
    ```
    streamlit run <name-of-app>.py
    ```
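
As a quick sanity check for step 4, the sketch below parses `.env`-style text and reports which of the two keys are still empty. It is only an illustration under stated assumptions: the helper names `parse_env` and `missing_keys` are hypothetical, it uses the Python standard library only, and it is not necessarily how the app itself loads the file.

```python
from pathlib import Path

# Key names taken from the .env example in step 4.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "REPLICATE_API_KEY")


def parse_env(text):
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double or single quotes, if any.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed to sit next to the app file
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    if missing:
        print("Missing or empty keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

Running this from the project folder before `streamlit run` can help catch an empty `REPLICATE_API_KEY` early, rather than at the first image request.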