

license: apache-2.0
title: RAG Image
sdk: gradio
emoji: 📚
colorFrom: green
colorTo: blue

title: RAG Image Captioning
emoji: 📸
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 3.35.2
app_file: app.py
pinned: false

RAG Image Captioning Space

This Space hosts a RAG-based image captioning model that generates captions for images using CLIP (openai/clip-vit-base-patch32), T5 (t5-small), and SentenceTransformer (all-MiniLM-L6-v2). It retrieves similar captions from a FAISS index and generates a final caption using T5.

Usage

Upload an image via the Gradio interface to generate a caption. Use the API (/api/predict) to integrate with web or mobile apps.

Files

- app.py: Gradio interface for the Space.
- inference.py: Custom inference script with generate_rag_caption.
- requirements.txt: Dependencies.
- faiss_index.idx: FAISS index for retrieval.
- captions.json: Caption corpus.
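The retrieve-then-generate flow that these files support can be sketched roughly as follows. This is a toy illustration, not the code in inference.py: the brute-force cosine search stands in for the FAISS index, the hand-written vectors stand in for learned embeddings, and the function names and the "summarize:" prompt format are assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_captions(query_vec, caption_vecs, captions, k=2):
    """Return the k captions whose vectors are most similar to query_vec."""
    scored = sorted(
        zip(captions, caption_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [caption for caption, _ in scored[:k]]

def build_prompt(retrieved):
    """Join retrieved captions into a generator prompt (hypothetical format)."""
    return "summarize: " + " ".join(retrieved)

# Toy data standing in for captions.json and the vectors behind faiss_index.idx.
captions = ["a dog runs on grass", "a cat sleeps on a sofa", "a puppy plays in a park"]
caption_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.2, 0.0]  # stand-in for an image embedding

retrieved = retrieve_captions(query_vec, caption_vecs, captions)
prompt = build_prompt(retrieved)
print(prompt)
```

In the actual Space, the resulting prompt would be passed to T5 to produce the final caption; that step is omitted here because it requires the model weights.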

Setup

Dependencies are installed from requirements.txt. The en_core_web_sm spaCy model is downloaded automatically. The equivalent manual commands are:

    pip install -r requirements.txt
    python -m spacy download en_core_web_sm
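To confirm that an environment is ready before launching app.py, a quick import check along these lines may help. The module names listed are assumptions based on the libraries named in this README, not the actual contents of requirements.txt, so adjust them as needed.

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Assumed import names; adjust to match the actual requirements.txt.
expected = [
    "gradio",
    "faiss",
    "transformers",
    "sentence_transformers",
    "spacy",
    "en_core_web_sm",
]
missing = missing_modules(expected)
print("Missing modules:", missing or "none")
```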

API Integration

Send a POST request to /api/predict with a base64-encoded image:

    import base64

    import requests

    # Replace with the URL of your own Space.
    api_url = "https://your-username-rag-image-captioning.hf.space/api/predict"

    # Encode the image as a base64 data URI.
    with open("test_image.jpg", "rb") as f:
        image_bytes = f.read()
    base64_image = f"data:image/jpeg;base64,{base64.b64encode(image_bytes).decode()}"

    # The caption is the first element of the returned "data" list.
    response = requests.post(api_url, json={"data": [base64_image]})
    print(response.json()["data"][0])
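The request above hard-codes the image/jpeg MIME type. For clients that send other image formats, the payload construction and response handling could be factored out as in the sketch below. The helper names build_predict_payload and extract_caption are hypothetical, and the response shape is assumed to match the {"data": [...]} format used in the request above.

```python
import base64
import mimetypes

def build_predict_payload(filename, image_bytes):
    """Build the JSON body for /api/predict from raw image bytes."""
    # Guess the MIME type from the file extension instead of assuming JPEG.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"data": [f"data:{mime};base64,{encoded}"]}

def extract_caption(response_json):
    """Pull the caption out of a response shaped like {"data": [caption, ...]}."""
    data = response_json.get("data")
    if not data:
        raise ValueError("response has no 'data' entries")
    return data[0]

# Example with a few placeholder bytes instead of a real image file.
payload = build_predict_payload("photo.png", b"\x89PNG")
print(payload["data"][0])
```

The returned dictionary can be passed directly as the json argument of requests.post, and extract_caption can be applied to response.json().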