# Cobalt: Hybrid Search

Cobalt is a demo app for hybrid search that combines vector search and surface (keyword) search, using [Ruri](https://huggingface.co/cl-nagoya/ruri-large), [BM25](https://github.com/dorianbrown/rank_bm25) and [Voyager](https://spotify.github.io/voyager/). The name Cobalt is derived from the word 瑠璃 (Ruri), which refers to cobalt glass.

## Demo
The demo app is built with Gradio.

```bash
docker compose up --build
```
Then open http://localhost:7860/ in your browser.

![](./materials/cobalt-gradio-demo.png)

## Usage

```python
import pandas as pd
from model.search.hybrid import HybridSearchClient

# Load documents from a CSV file.
df = pd.read_csv("corpus.csv")

# Initialize search client
# Specify column name to be searched. e.g. "content"
search_client = HybridSearchClient.from_dataframe(df, "content")

# Search documents from a query
results = search_client.search_top_n("Arashi's history")

```
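
To give a feel for how keyword and vector results can be merged into one list, here is a minimal sketch of reciprocal rank fusion (RRF), a common fusion technique. This is an illustration only: it is not necessarily how `HybridSearchClient` combines its results, and the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample rankings are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several best-first lists of document IDs into one fused ranking.

    Hypothetical helper for illustration; not part of Cobalt's API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up example rankings (document IDs, best match first).
bm25_ranking = [2, 0, 1]    # e.g. from a keyword (BM25) retriever
vector_ranking = [0, 3, 2]  # e.g. from a vector (embedding) retriever

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Documents that appear near the top of both lists rise in the fused ranking, while documents found by only one retriever are still kept, which is the usual motivation for hybrid search.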

## Requirements

- Python 3.10
- rank_bm25
- huggingface
- voyager
- For other Python packages, see [requirements.txt](./requirements.txt)