## Vectorstores
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/) and [PGVector](https://github.com/pgvector/pgvector) as vectorstore providers, with Qdrant being the default.

To select a provider, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma` or `postgres`.

```yaml
vectorstore:
  database: qdrant
```

### Qdrant configuration

To enable Qdrant, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`.

Qdrant settings can be configured by setting values under the `qdrant` property in the `settings.yaml` file.

The available configuration options are:
| Field        | Description |
|--------------|-------------|
| location     | If `:memory:` - use in-memory Qdrant instance. If `str` - use it as a `url` parameter.|
| url          | Either host or str of 'Optional[scheme], host, Optional[port], Optional[prefix]'. E.g. `http://localhost:6333` |
| port         | Port of the REST API interface. Default: `6333` |
| grpc_port    | Port of the gRPC interface. Default: `6334` |
| prefer_grpc  | If `true` - use gRPC interface whenever possible in custom methods. |
| https        | If `true` - use HTTPS(SSL) protocol.|
| api_key      | API key for authentication in Qdrant Cloud.|
| prefix       | If set, add `prefix` to the REST URL path. Example: `service/v1` will result in `http://localhost:6333/service/v1/{qdrant-endpoint}` for REST API.|
| timeout      | Timeout for REST and gRPC API requests. Default: 5.0 seconds for REST and unlimited for gRPC |
| host         | Host name of Qdrant service. If url and host are not set, defaults to 'localhost'.|
| path         | Persistence path for QdrantLocal. E.g. `local_data/private_gpt/qdrant`|
| force_disable_check_same_thread         | Force disable check_same_thread for QdrantLocal sqlite connection, defaults to True.|
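
As an illustrative sketch, the fields above could be combined to point PrivateGPT at a remote Qdrant server. The URL and API key below are placeholders rather than real values, and it is an assumption that `url`, `api_key` and `timeout` are set together like this in your deployment:

```yaml
vectorstore:
  database: qdrant

qdrant:
  # Placeholder address of a remote Qdrant server (assumed, not a real endpoint)
  url: https://qdrant.example.com:6333
  # Placeholder API key, only needed for authenticated instances such as Qdrant Cloud
  api_key: <API_KEY>
  # Timeout for API requests, in seconds
  timeout: 10
```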

By default, Qdrant tries to connect to an instance of the Qdrant server at `http://localhost:6333`, which matches the default REST port listed in the table above.

To obtain a local setup (disk-based database) without running a Qdrant server, configure the `qdrant.path` value in `settings.yaml`:

```yaml
qdrant:
  path: local_data/private_gpt/qdrant
```

### Chroma configuration

To enable Chroma, set the `vectorstore.database` property in the `settings.yaml` file to `chroma` and install the `chroma` extra.

```bash
poetry install --extras chroma
```

By default, `chroma` uses a disk-based database stored in `local_data_path / "chroma_db"`, where `local_data_path` is defined in `settings.yaml`.
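
For reference, a minimal `settings.yaml` fragment selecting Chroma might look like the following sketch. It assumes the `chroma` extra is already installed and leaves the storage location at its default:

```yaml
vectorstore:
  # Switch the vectorstore provider from the default (qdrant) to Chroma
  database: chroma
```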

### PGVector
The PGVector store requires a [PostgreSQL](https://www.postgresql.org/) database with the PGVector extension installed.

To enable PGVector, set the `vectorstore.database` property in the `settings.yaml` file to `postgres` and install the `vector-stores-postgres` extra.

```bash
poetry install --extras vector-stores-postgres
```

PGVector settings can be configured by setting values under the `postgres` property in the `settings.yaml` file.

The available configuration options are:
| Field         | Description                                               |
|---------------|-----------------------------------------------------------|
| **host**      | The server hosting the Postgres database. Default is `localhost` |
| **port**      | The port on which the Postgres database is accessible. Default is `5432` |
| **database**  | The specific database to connect to. Default is `postgres` |
| **user**      | The username for database access. Default is `postgres` |
| **password**  | The password for database access. (Required)            |
| **schema_name** | The database schema to use. Default is `private_gpt`       |

For example:
```yaml
vectorstore:
  database: postgres

postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: <PASSWORD>
  schema_name: private_gpt
```

The following table will be created in the database:
```
postgres=# \d private_gpt.data_embeddings
                                      Table "private_gpt.data_embeddings"
  Column   |       Type        | Collation | Nullable |                         Default
-----------+-------------------+-----------+----------+---------------------------------------------------------
 id        | bigint            |           | not null | nextval('private_gpt.data_embeddings_id_seq'::regclass)
 text      | character varying |           | not null |
 metadata_ | json              |           |          |
 node_id   | character varying |           |          |
 embedding | vector(768)       |           |          |
Indexes:
    "data_embeddings_pkey" PRIMARY KEY, btree (id)

postgres=# 
```
The dimension of the `embedding` column is set from the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch.
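
As a sketch, the `vector(768)` column shown in the table output would correspond to a setting like the one below. The value `768` is taken from that example and should match the output size of whichever embedding model is configured:

```yaml
embedding:
  # Must equal the length of the vectors produced by the embedding model
  embed_dim: 768
```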