---
title: SimianDB demo on wikipedia dataset 6M
emoji: 📚
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 3.23.0
app_file: app.py
pinned: false
license: mit
models:
- sentence-transformers/all-MiniLM-L6-v2
- cross-encoder/ms-marco-MiniLM-L-6-v2
datasets:
- wikipedia
---

This Space demonstrates the capabilities of SimianDB, my simple vector store database.
The demo indexes the first paragraph of each of the 6,458,670 entries (about 6 million) in the "20220301.en" Wikipedia dataset.
The vectors are compressed to 8 bits for efficient storage, and the similarity calculation converts them to 32 bits on the fly, with only a minor impact on speed.
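
To illustrate the idea of storing vectors in 8 bits and expanding them to 32 bits only at query time, here is a minimal NumPy sketch. It is not SimianDB's actual code: the per-vector min/max scaling, the cosine similarity, and the function names `quantize`, `dequantize`, and `cosine_scores` are all assumptions chosen for illustration.

```python
import numpy as np


def quantize(vectors: np.ndarray):
    """Compress float32 vectors to uint8 codes (hypothetical scheme).

    Each vector keeps its own offset (lo) and step size (scale) so that
    its values map onto the 256 levels of an unsigned byte.
    """
    lo = vectors.min(axis=1, keepdims=True)
    hi = vectors.max(axis=1, keepdims=True)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # constant vectors: avoid dividing by zero
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo.astype(np.float32), scale.astype(np.float32)


def dequantize(codes: np.ndarray, lo: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Expand uint8 codes back to approximate float32 vectors."""
    return codes.astype(np.float32) * scale + lo


def cosine_scores(query: np.ndarray, codes, lo, scale) -> np.ndarray:
    """Score a float32 query against the stored codes, converting on the fly."""
    vecs = dequantize(codes, lo, scale)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return vecs @ q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 384 dimensions matches the output size of all-MiniLM-L6-v2;
    # random data stands in for real embeddings here.
    vectors = rng.standard_normal((1000, 384)).astype(np.float32)
    codes, lo, scale = quantize(vectors)
    scores = cosine_scores(vectors[3], codes, lo, scale)
    print("best match:", int(np.argmax(scores)))
    print("bytes as float32:", vectors.nbytes, "bytes as uint8:", codes.nbytes)
```

In this sketch the stored codes take one quarter of the space of the float32 originals (plus two floats per vector for the offset and scale), and the conversion back to 32 bits happens only inside the scoring function.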

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference