File size: 5,101 Bytes
a85c9b8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
---
title: 🧩 Embedding models
---

## Overview

Embedchain supports several embedding models from the following providers:

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="GoogleAI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
</CardGroup>

## OpenAI

To use OpenAI embedding function, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")

app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

```yaml config.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-small'
```

</CodeGroup>

* OpenAI announced two new embedding models: `text-embedding-3-small` and `text-embedding-3-large`. Embedchain supports both these models. Below you can find YAML config for both:

<CodeGroup>

```yaml text-embedding-3-small.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-small'
```

```yaml text-embedding-3-large.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-large'
```

</CodeGroup>

## Google AI

To use Google AI embedding function, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from the [Google Maker Suite](https://makersuite.google.com/app/apikey)

<CodeGroup>
```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```
</CodeGroup>
<br/>
<Note>
For more details regarding the Google AI embedding model, please refer to the [Google AI documentation](https://ai.google.dev/tutorials/python_quickstart#use_embeddings).
</Note>

## Azure OpenAI

To use Azure OpenAI embedding model, you have to set some of the azure openai related environment variables as given in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com/"
os.environ["AZURE_OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-35-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: you_embedding_model_deployment_name
```
</CodeGroup>

You can find the list of models and deployment name on the [Azure OpenAI Platform](https://oai.azure.com/portal).

## GPT4ALL

GPT4All supports generating high quality embeddings of arbitrary length documents of text using a CPU optimized contrastively trained Sentence Transformer.

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```

</CodeGroup>

## Hugging Face

Hugging Face supports generating embeddings of arbitrary length documents of text using Sentence Transformer library. Example of how to generate embeddings using hugging face is given below:

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false

embedder:
  provider: huggingface
  config:
    model: 'sentence-transformers/all-mpnet-base-v2'
```

</CodeGroup>

## Vertex AI

Embedchain supports Google's VertexAI embeddings model through a simple interface. You just have to pass the `model_name` in the config yaml and it would work out of the box.

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5

embedder:
  provider: vertexai
  config:
    model: 'textembedding-gecko'
```

</CodeGroup>