Vector Database Benchmarks: FAISS vs Chroma vs Weaviate

This repository contains experiments benchmarking popular vector databases on multimodal embeddings generated from the Flickr8k dataset.
We focused on four key evaluation dimensions:

Latency per query
Recall@5 relative to a FAISS Flat baseline (accuracy tradeoffs)
Queries per second (QPS throughput)
Ingestion scaling performance
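
As a rough illustration of how the first and third metrics can be measured, here is a minimal timing sketch. It assumes a hypothetical `search_fn` callable that wraps whichever database client is under test; it is not the notebook's actual harness. Per-query latency is averaged over individual calls, while QPS is taken from total wall-clock time, so the two numbers need not be exact reciprocals (they are not reciprocals in the Results Summary either).

```python
import time


def measure_latency_and_qps(search_fn, queries):
    """Run search_fn once per query and return (avg latency in ms, QPS).

    search_fn is a hypothetical callable wrapping one database's search call.
    """
    if not queries:
        raise ValueError("queries must be non-empty")

    per_query_seconds = []
    wall_start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        search_fn(query)
        per_query_seconds.append(time.perf_counter() - t0)
    wall_elapsed = time.perf_counter() - wall_start

    # Average of the individual call durations, converted to milliseconds.
    avg_latency_ms = sum(per_query_seconds) / len(per_query_seconds) * 1000.0
    # Throughput over the whole loop, including loop overhead.
    qps = len(queries) / wall_elapsed if wall_elapsed > 0 else float("inf")
    return avg_latency_ms, qps
```

Passing the same query list to each database's wrapper keeps the comparison like-for-like.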

All experiments were run on Google Colab (T4 GPU for embedding generation, CPU backend for databases).

Methodology

Dataset: 6k images and 30k captions from Flickr8k.
Embeddings: CLIP (OpenAI ViT-B/32).
Workload: Caption-to-image retrieval (cross-modal).
Baseline: FAISS Flat index, used as the ground truth for recall calculations.

Each vector database was tested under the same conditions for ingestion, search, and recall.
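
The recall calculation described above can be sketched in plain Python. This is an illustrative stand-in, not the notebook's code: `flat_search` performs an exhaustive search in place of a FAISS Flat index, and cosine similarity is an assumption (the similarity metric is not stated here). For unit-normalized vectors, cosine and L2 produce the same ranking.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def flat_search(query, vectors, k=5):
    """Exhaustive cosine-similarity search; returns ids of the k best matches."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(vectors):
        v = normalize(vec)
        scored.append((sum(a * b for a, b in zip(q, v)), idx))
    # Highest similarity first; ties broken by the lower id.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored[:k]]


def recall_at_k(candidate_ids, truth_ids, k=5):
    """Fraction of the exact top-k that also appears in the candidate top-k."""
    truth = set(truth_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(candidate_ids[:k])) / len(truth)
```

In this sketch, `truth_ids` would come from the exact search and `candidate_ids` from the database being evaluated; averaging `recall_at_k` over all caption queries gives a single Recall@5 figure.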

Results Summary

Metric	FAISS	Chroma	Weaviate
Avg Latency per Query	0.19 ms	0.76 ms	1.82 ms
Recall@5 (Flat Baseline)	1.00	0.002	0.918
QPS Throughput	1929.94	719.01	598.40
Ingestion Scaling (20k)	0.024s	2.806s	4.000s
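
To read the table as relative differences rather than absolute numbers, the values can be divided by the FAISS figures. The sketch below copies the numbers from the table; the dictionary layout and the `relative_to_baseline` helper are illustrative names, not part of the repository.

```python
# Values copied from the results table above.
results = {
    "FAISS": {"latency_ms": 0.19, "qps": 1929.94, "ingest_s": 0.024},
    "Chroma": {"latency_ms": 0.76, "qps": 719.01, "ingest_s": 2.806},
    "Weaviate": {"latency_ms": 1.82, "qps": 598.40, "ingest_s": 4.000},
}


def relative_to_baseline(results, baseline="FAISS"):
    """Express each system's numbers as multiples of the baseline's."""
    base = results[baseline]
    ratios = {}
    for name, row in results.items():
        ratios[name] = {
            # How many times slower per query than the baseline.
            "latency_x": row["latency_ms"] / base["latency_ms"],
            # How many times more queries per second the baseline serves.
            "baseline_qps_x": base["qps"] / row["qps"],
            # How many times longer ingestion takes than the baseline.
            "ingest_x": row["ingest_s"] / base["ingest_s"],
        }
    return ratios


for name, row in relative_to_baseline(results).items():
    print(name, {key: round(value, 2) for key, value in row.items()})
```

On these numbers the ingestion gap (over 100x for both Chroma and Weaviate) is far larger than the query-latency gap (roughly 4x and 10x).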

Key Takeaways

FAISS is the fastest across latency, throughput, and ingestion, leveraging in-memory array ingestion and customizable indexing strategies.
Chroma offers simplicity and ease of integration but struggles at scale due to batching and internal constraints.
Weaviate provides a more feature-rich ecosystem (schema, hybrid search, persistence) but at higher ingestion and query overhead.
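
Because batching is named above as one factor in ingestion cost, a batched ingestion timer might look like the following sketch. `add_batch` is a hypothetical callable standing in for a client's insert call, and the default batch size of 1000 is an arbitrary assumption, not a setting taken from the notebook.

```python
import time


def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def time_ingestion(add_batch, vectors, batch_size=1000):
    """Insert vectors batch by batch; return (elapsed seconds, batch count).

    add_batch is a hypothetical callable wrapping one database's insert call.
    """
    start = time.perf_counter()
    n_batches = 0
    for batch in batched(vectors, batch_size):
        add_batch(batch)
        n_batches += 1
    return time.perf_counter() - start, n_batches
```

Running this at several collection sizes (for example up to the 20k used in the table) gives an ingestion scaling curve for each database.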

At the million-vector scale, speed alone will not decide your choice; engineering tradeoffs, developer productivity, and system features will.
Benchmarks tell one part of the story; your use case tells the rest.

Usage

You can reproduce these experiments using the provided notebook and Hugging Face dataset.
Full code: rag-experiments/VectorDB-Benchmarks.
Dataset: Flickr8k (train split; 6k images and 30k captions; multimodal, images and text), with CLIP embeddings.
Dataset author: Johnathan Xie

Citation

If you find this useful, please cite this repository:
