RupamG committed on
Commit 73eadcb · verified · 1 Parent(s): 342dc5c

Update README.md

Files changed (1)
  1. README.md +10 -79
README.md CHANGED
@@ -1,80 +1,11 @@
- # 📸 Image Caption Generator
-
- ![Python](https://img.shields.io/badge/Python-3.11.9-blue)
- ![TensorFlow](https://img.shields.io/badge/TensorFlow-2.10-orange)
- ![Gradio](https://img.shields.io/badge/Gradio-Deployed-green)
-
- A Generative AI project that automatically describes the content of an image using Deep Learning. It combines **Computer Vision (InceptionV3)** and **Natural Language Processing (LSTM)** to generate accurate, human-like captions.
-
- ## 🚀 Live Demo
- The model is deployed and running live! You can test it with your own images here:
- **[👉 Click here to try the Live App on Hugging Face](https://huggingface.co/spaces/RupamG/Image_Captioning_System)**
-
  ---
-
- ## 🧠 Technical Architecture
- This project uses an **Encoder-Decoder** architecture:
-
- 1. **Image Encoder (InceptionV3):**
-    * We use a pre-trained InceptionV3 model (trained on ImageNet) to extract high-level visual features from images.
-    * The last classification layer is removed, leaving us with a feature vector of shape `(2048,)`.
- 2. **Sequence Decoder (LSTM):**
-    * The extracted image features are passed to an LSTM (Long Short-Term Memory) network.
-    * The LSTM learns to generate a sequence of words (the caption) conditioned on the image features and the previously generated words.
-
- **Model Pipeline** (see the sketch below):
- `Input Image` ➡️ `InceptionV3` ➡️ `Feature Vector` ➡️ `LSTM` ➡️ `Predicted Caption`
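A minimal TensorFlow/Keras sketch of this encoder-decoder, for orientation only. The vocabulary size, maximum caption length, and layer widths below are illustrative assumptions, not the trained model's actual values:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

# Encoder: InceptionV3 with its classification layer removed -> (2048,) features
base = InceptionV3(weights="imagenet")
encoder = Model(base.input, base.layers[-2].output)  # output of the global average pool

def extract_features(image_path):
    img = tf.keras.utils.load_img(image_path, target_size=(299, 299))
    x = preprocess_input(np.expand_dims(tf.keras.utils.img_to_array(img), axis=0))
    return encoder.predict(x, verbose=0)  # shape (1, 2048)

# Decoder: image features + caption-so-far -> probability of the next word
vocab_size, max_len, units = 8000, 34, 256  # assumed values, not the trained model's

img_in = Input(shape=(2048,))
img_feat = Dense(units, activation="relu")(Dropout(0.5)(img_in))

seq_in = Input(shape=(max_len,))
seq_emb = Embedding(vocab_size, units, mask_zero=True)(seq_in)
seq_feat = LSTM(units)(Dropout(0.5)(seq_emb))

merged = Dense(units, activation="relu")(add([img_feat, seq_feat]))
next_word = Dense(vocab_size, activation="softmax")(merged)

decoder = Model([img_in, seq_in], next_word)
decoder.compile(loss="categorical_crossentropy", optimizer="adam")
```

This mirrors the pipeline above: the frozen encoder produces the feature vector once per image, and the decoder is the part that is actually trained.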
-
- ---
-
- ## 📂 Dataset
- The model was trained on the **Flickr8k Dataset**, which consists of:
- * **8,000 images** (6,000 training, 1,000 validation, 1,000 test).
- * **5 captions per image** (40,000 captions in total).
-
- > **Note:** Due to size constraints, the raw dataset is not included in this repository. You can download it from [Kaggle](https://www.kaggle.com/adityajn105/flickr8k) and place it in the `src/` folder.
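For orientation, a small loading sketch for the caption file referenced in the setup steps below. It assumes the standard `Flickr8k.token.txt` layout of `<image>.jpg#<index><TAB><caption>` and the `src/` location from the note above:

```python
from collections import defaultdict

def load_captions(token_path="src/Flickr8k.token.txt"):
    """Map each image filename to its list of (typically five) captions."""
    captions = defaultdict(list)
    with open(token_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            image_id, caption = line.split("\t", 1)   # assumed tab-separated layout
            image_id = image_id.split("#")[0]         # drop the '#0'..'#4' caption index
            captions[image_id].append(caption.lower())
    return captions  # ~8,000 keys with 5 captions each
```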
-
- ---
-
- ## 🛠️ Installation & Setup
- To run this project locally on your machine:
-
- 1. **Clone the repository:**
- ```bash
- git clone https://github.com/Marshal-GG/Advanced-Image-Captioning-System.git
- cd Advanced-Image-Captioning-System
- ```
-
- 2. **Install dependencies:**
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. **Download the Data:**
- * Download the Flickr8k images and `Flickr8k.token.txt`.
- * Place them in the `src/` folder (or update the paths in the notebook).
-
- 4. **Run the Training Notebook:**
- * Open `main.ipynb` to see the data preprocessing, model training, and evaluation steps.
-
- 5. **Run the App** (a minimal `app.py` sketch follows this list):
- ```bash
- python app.py
- ```
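For reference, roughly what a minimal `app.py` for a Gradio Space like this looks like. This is a sketch: the real file presumably loads the trained encoder and decoder, which is stubbed out here so the snippet runs standalone:

```python
# app.py -- minimal Gradio wrapper (sketch; the captioning logic is stubbed)
import gradio as gr

def caption_image(image):
    # In the real app: run the image through the InceptionV3 encoder to get a
    # (2048,) feature vector, then decode it word by word with the trained LSTM.
    return "a placeholder caption"

demo = gr.Interface(
    fn=caption_image,
    inputs=gr.Image(type="pil"),
    outputs=gr.Textbox(label="Generated caption"),
    title="📸 Image Caption Generator",
)

if __name__ == "__main__":
    demo.launch()
```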
-
- ---
-
- ## 📊 Results
- * **Metric:** The model's effectiveness is evaluated qualitatively, by visual inspection of generated captions.
- * **Sample Output:**
-   * *Input:* an image of two dogs running on grass.
-   * ![App Screenshot](demo_screenshot.png)
-   * *Output:* "Two dogs are playing together in the grass"
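For context, a sketch of how such a caption is typically decoded greedily at inference time. The `startseq`/`endseq` markers and the `tokenizer` fitted during training are assumptions here, since the notebook itself is not shown:

```python
import numpy as np
import tensorflow as tf

def generate_caption(photo_features, decoder, tokenizer, max_len=34):
    """Greedy decoding: repeatedly feed the words generated so far back in."""
    text = "startseq"  # assumed start-of-caption marker
    for _ in range(max_len):
        seq = tokenizer.texts_to_sequences([text])[0]
        seq = tf.keras.preprocessing.sequence.pad_sequences([seq], maxlen=max_len)
        probs = decoder.predict([photo_features, seq], verbose=0)
        word = tokenizer.index_word.get(int(np.argmax(probs)))
        if word is None or word == "endseq":  # assumed end-of-caption marker
            break
        text += " " + word
    return text.replace("startseq", "").strip()
```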
-
- ---
-
- ## 🤝 Connect
- If you have any questions about this project or want to discuss Generative AI, feel free to connect!
- * **LinkedIn:** https://www.linkedin.com/in/rupam-g/
- * **Email:** marshalgcom@gmail.com
-
 
  ---
+ title: Image Captioning System
+ emoji: 😻
+ colorFrom: blue
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 6.2.0
+ app_file: app.py
+ pinned: false
+ short_description: Built a Deep Learning model to generate descriptive captions
+ ---