---
title: 🧩 Embedding models
---
## Overview
Embedchain supports several embedding models from the following providers:
<CardGroup cols={4}>
<Card title="OpenAI" href="#openai"></Card>
<Card title="GoogleAI" href="#google-ai"></Card>
<Card title="Azure OpenAI" href="#azure-openai"></Card>
<Card title="GPT4All" href="#gpt4all"></Card>
<Card title="Hugging Face" href="#hugging-face"></Card>
<Card title="Vertex AI" href="#vertex-ai"></Card>
</CardGroup>
## OpenAI
To use the OpenAI embedding function, set the `OPENAI_API_KEY` environment variable. You can obtain an API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).
Once you have the key, you can use it like this:
<CodeGroup>
```python main.py
import os
from embedchain import App
os.environ['OPENAI_API_KEY'] = 'xxx'
# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```
```yaml config.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-small'
```
</CodeGroup>
* Embedchain supports both of OpenAI's newer embedding models, `text-embedding-3-small` and `text-embedding-3-large`. The YAML config for each is shown below:
<CodeGroup>
```yaml text-embedding-3-small.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-small'
```
```yaml text-embedding-3-large.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-3-large'
```
</CodeGroup>
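The same embedder settings can also be built in Python instead of a separate YAML file. The sketch below is a minimal illustration: it assumes that `App.from_config` in your installed Embedchain version accepts a `config` dictionary, and `build_openai_embedder_config` and `create_app` are hypothetical helper names, not part of Embedchain.

```python
import os


def build_openai_embedder_config(model="text-embedding-3-small"):
    """Return a dict that mirrors the YAML `embedder` block shown above."""
    return {"embedder": {"provider": "openai", "config": {"model": model}}}


def create_app(model="text-embedding-3-small"):
    """Create an app from the dict.

    Assumption: App.from_config accepts a `config` dictionary argument.
    """
    # Imported lazily so the config helper works without embedchain installed.
    from embedchain import App

    if "OPENAI_API_KEY" not in os.environ:
        raise RuntimeError("Set OPENAI_API_KEY before creating the app")
    return App.from_config(config=build_openai_embedder_config(model))
```

Switching between the two models is then a matter of calling `create_app("text-embedding-3-large")`.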
## Google AI
To use the Google AI embedding function, set the `GOOGLE_API_KEY` environment variable. You can obtain an API key from [Google Maker Suite](https://makersuite.google.com/app/apikey).
<CodeGroup>
```python main.py
import os
from embedchain import App
os.environ["GOOGLE_API_KEY"] = "xxx"
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```
</CodeGroup>
<br/>
<Note>
For more details regarding the Google AI embedding model, please refer to the [Google AI documentation](https://ai.google.dev/tutorials/python_quickstart#use_embeddings).
</Note>
## Azure OpenAI
To use the Azure OpenAI embedding model, set the Azure OpenAI environment variables shown in the code block below:
<CodeGroup>
```python main.py
import os
from embedchain import App
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com/"
os.environ["AZURE_OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-35-turbo
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```
</CodeGroup>
You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).
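Because this setup depends on four environment variables, it can help to check them before creating the app. The following is a minimal sketch using only the standard library; `AZURE_ENV_VARS` and `missing_azure_env` are hypothetical names introduced here for illustration, and the variable list is taken from the `main.py` example above.

```python
import os

# The four variables set in the main.py example above.
AZURE_ENV_VARS = (
    "OPENAI_API_TYPE",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)


def missing_azure_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in AZURE_ENV_VARS if not env.get(name)]
```

Calling `missing_azure_env()` before `App.from_config(...)` and raising an error when the returned list is non-empty gives a clearer message than a failure deep inside the first embedding request.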
## GPT4All
GPT4All supports generating high-quality embeddings of arbitrary-length text documents using a CPU-optimized, contrastively trained Sentence Transformer.
<CodeGroup>
```python main.py
from embedchain import App
# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```
</CodeGroup>
## Hugging Face
Hugging Face supports generating embeddings of arbitrary-length text documents using the Sentence Transformers library. An example of how to generate embeddings using Hugging Face is given below:
<CodeGroup>
```python main.py
from embedchain import App
# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
llm:
  provider: huggingface
  config:
    model: 'google/flan-t5-xxl'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false

embedder:
  provider: huggingface
  config:
    model: 'sentence-transformers/all-mpnet-base-v2'
```
</CodeGroup>
## Vertex AI
Embedchain supports Google's Vertex AI embedding models through a simple interface. Pass the model name as `model` in the config YAML and it works out of the box.
<CodeGroup>
```python main.py
from embedchain import App
# load embedding model configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5

embedder:
  provider: vertexai
  config:
    model: 'textembedding-gecko'
```
</CodeGroup>
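All six providers on this page share the same `embedder` layout and differ only in the provider name and model. As a rough summary, the sketch below collects the example values used above into one lookup table. `EXAMPLE_EMBEDDERS` and `embedder_section` are hypothetical names, not part of Embedchain, and the table lists only the models shown on this page rather than every model a provider offers.

```python
# Embedder provider -> example embedding model, as used on this page.
# The GPT4All example above sets no model, hence None.
EXAMPLE_EMBEDDERS = {
    "openai": "text-embedding-3-small",
    "google": "models/embedding-001",
    "azure_openai": "text-embedding-ada-002",
    "gpt4all": None,
    "huggingface": "sentence-transformers/all-mpnet-base-v2",
    "vertexai": "textembedding-gecko",
}


def embedder_section(provider, **extra):
    """Build the `embedder` part of a config for one of the listed providers.

    Extra keyword arguments (for example `deployment_name` for Azure OpenAI)
    are copied into the nested `config` mapping.
    """
    if provider not in EXAMPLE_EMBEDDERS:
        raise ValueError(f"unknown embedder provider: {provider!r}")
    section = {"provider": provider}
    config = dict(extra)
    model = EXAMPLE_EMBEDDERS[provider]
    if model is not None:
        # Keep a caller-supplied model; otherwise use the page's example.
        config.setdefault("model", model)
    if config:
        section["config"] = config
    return {"embedder": section}
```

For example, `embedder_section("azure_openai", deployment_name="your_embedding_model_deployment_name")` produces a dictionary with the same shape as the Azure OpenAI `embedder` block shown earlier.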