---
title: Mongodb Gemini Rag
emoji: ♊️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: false
license: apache-2.0
---
# Atlas Vector Search Chat with MongoDB and Google Gemini
Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks.
## Features
- **Interactive Chat**: Ask questions related to the embedded documents.
- **Vector Search**: Utilizes MongoDB Atlas Vector Search to find relevant documents based on similarity.
- **Google Gemini Integration**: Embeds text and generates responses.
## Requirements
- Python 3.8 or later (required by Gradio 4.x)
- MongoDB Atlas account
- Atlas cluster with network access allowed from `0.0.0.0/0`, plus its connection string
- Google Cloud account with access to Gemini
## Installation
1. **Clone the space**:
- Click [...] and clone the space to your own repo. When prompted, fill in the variables listed in the next step.
2. **Set up environment variables**:
- `GOOGLE_API_KEY`: Your Google API key for Gemini.
- `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string.
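
As a minimal sketch of how an app could read these two variables at startup and fail early when one is missing: the helper name `load_settings` is hypothetical and is not taken from `app.py`, which may load its configuration differently.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the two required settings, or raise if any is missing.

    `load_settings` is a hypothetical helper used for illustration only.
    """
    required = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")
    # Treat unset and empty values the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}
```

Calling `load_settings()` with no arguments reads from the process environment; passing a plain dictionary makes the check easy to exercise in isolation.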
## Running the Application
1. **Start the application**:
```bash
python app.py
```
2. **Access the interface**:
Open the `App` tab of the space in your browser.
## Vector Search Index Configuration
To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration:
```json
{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
```
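
Once the index exists, a query against it is an aggregation pipeline that starts with a `$vectorSearch` stage. The sketch below only builds that pipeline. The index name `vector_index`, the `limit` and `num_candidates` defaults, and the helper name are assumptions for illustration; the `embedding` path and the 768 dimensions come from the configuration above, and the `content` field is the one the trigger in this README embeds.

```python
from typing import Any, Dict, List

EMBEDDING_DIMENSIONS = 768  # matches "numDimensions" in the index definition


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "vector_index",  # assumption: use the name you gave your index
    limit: int = 3,                    # assumption: number of documents to return
    num_candidates: int = 50,          # assumption: nearest-neighbour candidates to scan
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that searches the `embedding` field."""
    if len(query_vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return only the text and the similarity score of each match.
        {"$project": {"_id": 0, "content": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]
```

With PyMongo, the result would typically be passed to `collection.aggregate(pipeline)` on the `google-ai.embedded_docs` collection, with `query_vector` being the Gemini embedding of the user's question.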
## MongoDB Trigger to Embed Results
This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/).
```javascript
exports = async function(changeEvent) {
  // Get the API key from Realm's Values & Secrets
  const apiKey = context.values.get('google-api-key');
  // Set up the URL for the Google Generative Language API - embedding endpoint
  const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`;
  // batch example
  // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`;
  // Get the full document from the change event
  const doc = changeEvent.fullDocument;
  try {
    console.log(`Processing document with id: ${doc._id}`);
    // Prepare the request body. JSON.stringify escapes quotes and newlines
    // in doc.content, which manual string interpolation would not.
    const requestBody = JSON.stringify({
      model: "models/embedding-001",
      content: { parts: [{ text: doc.content }] }
    });
    // Make the HTTP POST request
    const response = await context.http.post({
      url: url,
      headers: { 'Content-Type': ['application/json'] },
      body: requestBody
    });
    // Parse the JSON response
    const responseData = EJSON.parse(response.body.text());
    console.log(JSON.stringify(responseData));
    if (response.statusCode === 200) {
      console.log("Successfully received embedding response from the API.");
      // Extract the embedding from the response
      const embedding = responseData.embedding.values; // Adjust based on actual response structure
      // Use the name of your linked MongoDB Atlas data source (here "mongodb-atlas")
      const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs");
      // Update the document in MongoDB with the embedding
      const updateResult = await collection.updateOne(
        { _id: doc._id },
        { $set: { embedding: embedding } }
      );
      if (updateResult.modifiedCount === 1) {
        console.log("Successfully updated the document.");
      } else {
        console.log("Failed to update the document.");
      }
    } else {
      console.log(`Failed to receive embedding. Status code: ${response.statusCode} - ${JSON.stringify(response)}`);
    }
  } catch (err) {
    console.error(`Error making request to API: ${err}`);
  }
};
```
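
The request and response shapes the trigger relies on can also be sketched in Python, which is handy for checking them outside Atlas. The two helper names are hypothetical. The payload layout and the `embedding.values` path mirror the trigger code above rather than an independently verified API contract. Serializing with a JSON library keeps the body valid even when the text contains quotes or newlines.

```python
import json
from typing import List


def build_embed_request(text: str, model: str = "models/embedding-001") -> str:
    """Serialize the embedContent request body for one piece of text."""
    payload = {"model": model, "content": {"parts": [{"text": text}]}}
    # json.dumps escapes quotes, backslashes and newlines in `text`.
    return json.dumps(payload)


def extract_embedding(response_body: str) -> List[float]:
    """Pull the vector out of a response shaped like the one the trigger parses."""
    data = json.loads(response_body)
    return data["embedding"]["values"]
```

The string returned by `build_embed_request` would be sent as the POST body with a `Content-Type: application/json` header, and the response text handed to `extract_embedding`.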