---
title: WikiArt CLIP Search Engine
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: streamlit
sdk_version: 1.50.0
app_file: app.py
app_port: 7860
pinned: true
license: mit
short_description: A search engine that utilises CLIP's zero-shot capabilities.
---
# 🎨 CLIP-Powered WikiArt Search Engine
This project was developed as part of the MSc Data Science and AI course at Queen Mary University of London (QMUL). It demonstrates a practical application of state-of-the-art vision-language models for semantic search over a large-scale image dataset.
## 🚀 Project Overview
The core goal of this application is to let users search the entire WikiArt collection (over 81,000 artworks) using natural-language descriptions such as "a surreal dreamlike painting" or "a vibrant cityscape".
### Key Technology: CLIP
- Model: We use CLIP (Contrastive Language–Image Pre-training) to generate high-dimensional vector embeddings for both text queries and images.
- Search Logic: The text query embedding is compared against a pre-computed index of 81,000+ image embeddings using cosine similarity (the dot product of L2-normalised embeddings) to find the artworks that are semantically closest to the description; a minimal sketch of this flow follows this list.
- Deployment: The application is built with Streamlit and deployed on Hugging Face Spaces, while the 31.4 GB image dataset is hosted securely on AWS S3 (see the second sketch below).
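
The end-to-end query flow can be summarised in a short sketch. This is a minimal illustration rather than the actual `app.py`: it assumes the pre-computed image embeddings are stored L2-normalised in a local `embeddings.npy` file, and it uses the `openai/clip-vit-base-patch32` checkpoint from the `transformers` library (the deployed app may use a different checkpoint or index format).

```python
import numpy as np
import torch
from transformers import CLIPModel, CLIPProcessor

# Illustrative checkpoint and file name; the real app may differ.
MODEL_ID = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(MODEL_ID)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

# Pre-computed, L2-normalised image embeddings: shape (num_images, embed_dim).
image_embeddings = np.load("embeddings.npy")

def search(query: str, top_k: int = 10) -> np.ndarray:
    """Return indices of the top_k artworks closest to the text query."""
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        text_emb = model.get_text_features(**inputs)
    # Normalise so the dot product equals cosine similarity.
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    scores = image_embeddings @ text_emb.squeeze(0).numpy()
    return np.argsort(scores)[::-1][:top_k]

print(search("a surreal dreamlike painting"))
```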
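
Because the images themselves live on S3, the app only needs to fetch the files for the results it displays. Below is a hedged sketch of how the top matches might be rendered in Streamlit; the bucket name and key layout are purely illustrative, and credentials are assumed to come from the environment.

```python
import boto3
import streamlit as st

# Hypothetical bucket name; the real deployment's layout is not shown here.
S3_BUCKET = "wikiart-images"
s3 = boto3.client("s3")  # reads AWS credentials from the environment

def show_results(keys: list[str]) -> None:
    """Fetch each result image from S3 and render it in the Streamlit page."""
    for key in keys:
        obj = s3.get_object(Bucket=S3_BUCKET, Key=key)
        st.image(obj["Body"].read(), caption=key)
```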
## 👥 Team & Acknowledgments
This project was a collaborative effort, and I would like to acknowledge the contributions of my team members: