---
title: Mongodb Gemini Rag
emoji: ♊️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.5
app_file: app.py
pinned: false
license: apache-2.0
---
# Atlas Vector Search Chat with MongoDB and Google Gemini
Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks.
## Features
- **Interactive Chat**: Ask questions about the embedded documents.
- **Vector Search**: Uses MongoDB Atlas Vector Search to find relevant documents based on similarity.
- **Google Gemini Integration**: Uses Gemini to embed text and generate responses.
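
To make "based on similarity" concrete, here is a minimal Python sketch of cosine similarity, the metric named in this README's index configuration. The helper name `cosine_similarity` and the three-dimensional toy vectors are illustrative assumptions; the embeddings used by this app have 768 dimensions, and Atlas computes its own relevance score server-side, so this only illustrates the underlying metric.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical, for illustration only).
query = [1.0, 0.0, 1.0]
close_doc = [2.0, 0.0, 2.0]  # same direction as the query, different length
far_doc = [0.0, 1.0, 0.0]    # orthogonal to the query

# Rank the documents by similarity to the query, most similar first.
ranked = sorted(
    [("close_doc", close_doc), ("far_doc", far_doc)],
    key=lambda item: cosine_similarity(query, item[1]),
    reverse=True,
)
```

Because cosine similarity ignores vector length, `close_doc` ranks first even though it is twice as long as the query.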
## Requirements
- Python 3.7 or later
- A MongoDB Atlas account
- An Atlas cluster that allows connections from `0.0.0.0/0`, plus its connection string
- A Google Cloud account with access to Gemini
## Installation
1. **Clone the space**:
   - Click [...] and clone the space to your repo. Make sure to set the variables listed in the next step.
2. **Set up environment variables**:
   - `GOOGLE_API_KEY`: Your Google API key for Gemini.
   - `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string.
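
As a rough illustration of how the two variables above might be read and validated at startup, consider the following Python sketch. How `app.py` actually loads them is not shown in this README; the `load_settings` helper and `REQUIRED_VARS` tuple are hypothetical names introduced here.

```python
import os
from typing import Dict, Mapping

# The two variables named in the installation steps.
REQUIRED_VARS = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the names of the missing variables tends to be easier to debug than a connection error surfacing later in the chat flow.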
## Running the Application
1. **Start the application**:
   ```bash
   python app.py
   ```
2. **Access the interface**:
   Open the `App` tab in your browser.
## Vector Search Index Configuration
To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration:
```json
{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
```
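
A query against this index could be expressed as a `$vectorSearch` aggregation stage. The Python sketch below only builds the pipeline; it is not taken from `app.py`. The index name `vector_index`, the helper name `build_vector_search_pipeline`, and the default `limit` and `num_candidates` values are assumptions, while the `embedding` path, the 768 dimensions, and the `content` field come from this README.

```python
from typing import Any, Dict, List, Sequence

NUM_DIMENSIONS = 768  # must match "numDimensions" in the index definition


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str = "vector_index",  # assumption: use your own index name
    limit: int = 5,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that runs $vectorSearch on `embedding`."""
    if len(query_vector) != NUM_DIMENSIONS:
        raise ValueError(
            f"expected {NUM_DIMENSIONS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return only the text and the relevance score of each match.
        {
            "$project": {
                "_id": 0,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the returned list can be passed to `collection.aggregate(...)` on the `google-ai.embedded_docs` collection. Checking the vector length up front catches a mismatch between the embedding model and the index before the query reaches Atlas.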
## MongoDB Trigger to Embed Results
This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/).
```javascript
// Atlas Database Trigger function; Atlas calls it with the change event of each captured insert.
exports = async function(changeEvent) {
  // Get the API key from Realm's Values & Secrets
  const apiKey = context.values.get('google-api-key');
  // Set up the URL for the Google Generative Language API - embedding endpoint
  const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`;
  // batch example
  // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`;
  // Get the full document from the change event
  const doc = changeEvent.fullDocument;
  try {
    console.log(`Processing document with id: ${doc._id}`);
    // Prepare the request body; JSON.stringify escapes quotes and newlines in doc.content
    const requestBody = JSON.stringify({
      model: "models/embedding-001",
      content: { parts: [{ text: doc.content }] }
    });
    // Make the HTTP POST request
    const response = await context.http.post({
      url: url,
      headers: { 'Content-Type': ['application/json'] },
      body: requestBody
    });
    // Parse the JSON response
    const responseData = EJSON.parse(response.body.text());
    console.log(JSON.stringify(responseData));
    if (response.statusCode === 200) {
      console.log("Successfully received embedding response from the API.");
      // Extract the embedding from the response
      const embedding = responseData.embedding.values; // Adjust based on actual response structure
      // Use the name of your MongoDB Atlas Cluster
      const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs");
      // Update the document in MongoDB with the embedding
      const updateResult = await collection.updateOne(
        { _id: doc._id },
        { $set: { embedding: embedding } }
      );
      if (updateResult.modifiedCount === 1) {
        console.log("Successfully updated the document.");
      } else {
        console.log("Failed to update the document.");
      }
    } else {
      console.log(`Failed to receive embedding. Status code: ${response.statusCode} - ${JSON.stringify(response)}`);
    }
  } catch (err) {
    console.error(`Error making request to API: ${err}`);
  }
};
```
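
The application side needs the same kind of embedding for each user question before it can search. The Python sketch below mirrors the request and response shapes used by the trigger above, using only the standard library. Whether `app.py` calls the REST endpoint directly or goes through a Google SDK is not stated in this README, so treat the helper names (`build_embed_request`, `extract_embedding`, `embed_text`) as hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Same endpoint the trigger calls.
EMBED_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/embedding-001:embedContent"
)


def build_embed_request(text: str) -> bytes:
    """Serialize the request body; json.dumps escapes quotes and newlines."""
    payload = {
        "model": "models/embedding-001",
        "content": {"parts": [{"text": text}]},
    }
    return json.dumps(payload).encode("utf-8")


def extract_embedding(response_json: Dict[str, Any]) -> List[float]:
    """Pull the vector out of a response shaped like the one the trigger reads."""
    return response_json["embedding"]["values"]


def embed_text(text: str, api_key: str) -> List[float]:
    """POST the text to the embedding endpoint and return its vector."""
    request = urllib.request.Request(
        f"{EMBED_URL}?key={api_key}",
        data=build_embed_request(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_embedding(json.load(response))
```

Calling `embed_text(question, api_key)` performs a network request, so it needs a valid key; the two pure helpers can be exercised offline.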