---
title: WikiArt CLIP Search Engine
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: streamlit
sdk_version: 1.50.0
app_file: app.py
app_port: 7860
pinned: true
license: mit
short_description: A search engine that utilises CLIP's zero-shot capabilities.
---

# 🎨 CLIP-Powered WikiArt Search Engine

This project was developed as part of the MSc Data Science and AI course at Queen Mary University of London (QMUL). It demonstrates a practical application of state-of-the-art vision-language models for semantic search over a large-scale image dataset.

## 🚀 Project Overview

The core goal of this application is to let users search the entire WikiArt collection (over 81,000 artworks) using natural language descriptions such as "a surreal dreamlike painting" or "a vibrant cityscape".

### Key Technology: CLIP

- **Model:** We use the CLIP (Contrastive Language–Image Pre-training) model to generate high-dimensional vector embeddings for both text queries and images.
- **Search Logic:** The text query embedding is compared against a pre-computed index of over 81,000 image embeddings using cosine similarity (a dot product over L2-normalized embeddings) to find the artworks semantically closest to the description; a minimal sketch of this step follows the list.
- **Deployment:** The application is built with Streamlit and deployed on Hugging Face Spaces, with the large 31.4 GB image dataset hosted securely on AWS S3.
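The sketch below illustrates the search step under stated assumptions: it uses the Hugging Face `transformers` CLIP implementation with the `openai/clip-vit-base-patch32` checkpoint, and assumes the image embeddings were pre-computed, L2-normalized, and saved alongside a parallel list of image URLs. The file names `wikiart_embeddings.npy` and `wikiart_urls.json` and the `search` helper are illustrative, not the app's actual code.

```python
# Minimal sketch of the text-to-image search step (not the app's actual code).
# Assumes pre-computed, L2-normalized image embeddings saved as a NumPy array,
# with a parallel JSON list of image URLs; file names here are hypothetical.
import json

import numpy as np
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

image_embeddings = np.load("wikiart_embeddings.npy")  # shape: (81000+, 512)
image_urls = json.load(open("wikiart_urls.json"))     # parallel list of S3 URLs

def search(query: str, top_k: int = 10) -> list[tuple[str, float]]:
    """Return the top_k (url, score) pairs closest to the text query."""
    inputs = tokenizer([query], padding=True, return_tensors="pt")
    with torch.no_grad():
        text_emb = model.get_text_features(**inputs)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)  # L2-normalize
    # On normalized vectors, cosine similarity reduces to a dot product.
    scores = image_embeddings @ text_emb.squeeze(0).numpy()
    top = np.argsort(-scores)[:top_k]
    return [(image_urls[i], float(scores[i])) for i in top]

print(search("a surreal dreamlike painting", top_k=3))
```

In a Streamlit front end, the returned S3 URLs can be passed directly to `st.image` for display, so the images themselves never need to be bundled with the app.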

## 👥 Team & Acknowledgments

This project was a collaborative effort. I would like to acknowledge the contributions of the team members: