Elevate Responses: RAG with LlamaIndex & MongoDB

Community Article Published March 28, 2024


Introduction

Retrieval Augmented Generation (RAG) systems have revolutionized the way we interact with large language models (LLMs) by enhancing their capabilities to provide contextually relevant responses. These systems connect LLMs to databases, enabling them to retrieve semantically relevant information to augment their responses. In this comprehensive guide, we'll explore how to construct your own RAG system using LlamaIndex and MongoDB, empowering you to develop dynamic and context-aware applications.


Definitions

Retrieval Augmented Generation (RAG): A system design pattern that integrates information retrieval techniques with generative AI models, enhancing the relevance and accuracy of responses to user queries by supplementing them with additional context retrieved from external data sources.

LlamaIndex: An LLM/data framework that facilitates the connection of data sources to both proprietary and open-source LLMs, abstracting complexities associated with data ingestion and RAG pipeline implementation.

Benefits of Integration

Integrating LlamaIndex with MongoDB offers several advantages:

Efficient Data Retrieval: MongoDB serves as both an operational and vector database, efficiently storing and retrieving vector embeddings and operational data required for RAG systems.

Scalability: LlamaIndex abstracts complexities associated with data ingestion and RAG pipeline implementation, enabling developers to build scalable applications that adapt swiftly to various domains.

Contextual Relevance: By leveraging MongoDB's indexing capabilities and LlamaIndex's retrieval model, developers can ensure that LLM responses are contextually relevant and accurate.

Code Implementation

Let's delve into the practical implementation steps:


Step 1: Install Libraries

!pip install llama-index
!pip install llama-index-vector-stores-mongodb
!pip install llama-index-embeddings-openai
!pip install pymongo
!pip install datasets
!pip install pandas

Step 2: OpenAI Key Setup

import os
# Paste your OpenAI API key between the quotes
os.environ["OPENAI_API_KEY"] = ""
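Hardcoding an empty string is just a placeholder for your key. As an alternative sketch, the standard-library getpass module prompts for the key at runtime so it never appears in the notebook:

import os
from getpass import getpass

# Prompt for the key only if it is not already set in the environment
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")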

Step 3: Data Loading and Processing

from datasets import load_dataset
import pandas as pd

# https://huggingface.co/datasets/MongoDB/airbnb_embeddings
# Make sure you have a Hugging Face token (HF_TOKEN) in your development environment
dataset = load_dataset("MongoDB/airbnb_embeddings")

# Convert the dataset to a pandas dataframe
dataset_df = pd.DataFrame(dataset['train'])


## Processing
# Drop the precomputed text embeddings; fresh ones are generated below with text-embedding-3-small
dataset_df = dataset_df.drop(columns=['text_embeddings'])

dataset_df.head(5)

Step 4: LlamaIndex LLM Configuration

from llama_index.core.settings import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small", dimensions=256)
llm = OpenAI()

Settings.llm = llm
Settings.embed_model = embed_model
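The vector search index created in Step 7 must declare the same dimensionality as these embeddings, so a quick sanity check on the configured model can save debugging later (an optional snippet, not part of the original pipeline):

# Confirm the embedding dimensionality matches the vector index definition
sample_embedding = embed_model.get_text_embedding("dimension check")
print(len(sample_embedding))  # expected output: 256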

Step 5: Creating LlamaIndex Custom Documents

import json
from llama_index.core import Document
from llama_index.core.schema import MetadataMode

# Convert the DataFrame to a JSON string representation
documents_json = dataset_df.to_json(orient='records')

# Load the JSON string into a Python list of dictionaries
documents_list = json.loads(documents_json)

llama_documents = []

for document in documents_list:

  # Value for metadata must be one of (str, int, float, None)
  document["amenities"] = json.dumps(document["amenities"])
  document["images"] = json.dumps(document["images"])
  document["host"] = json.dumps(document["host"])
  document["address"] = json.dumps(document["address"])
  document["availability"] = json.dumps(document["availability"])
  document["review_scores"] = json.dumps(document["review_scores"])
  document["reviews"] = json.dumps(document["reviews"])
  document["image_embeddings"] = json.dumps(document["image_embeddings"])


  # Create a Document object with the text, excluding selected metadata keys from the LLM and embedding model inputs
  llama_document = Document(
      text=document["description"],
      metadata=document,
      excluded_llm_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],
      excluded_embed_metadata_keys=["_id", "transit", "minimum_nights", "maximum_nights", "cancellation_policy", "last_scraped", "calendar_last_scraped", "first_review", "last_review", "security_deposit", "cleaning_fee", "guests_included", "host", "availability", "reviews", "image_embeddings"],
      metadata_template="{key}=>{value}",
      text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
      )

  llama_documents.append(llama_document)

# Observing an example of what the LLM and Embedding model receive as input
print(
    "\nThe LLM sees this: \n",
    llama_documents[0].get_content(metadata_mode=MetadataMode.LLM),
)
print(
    "\nThe Embedding model sees this: \n",
    llama_documents[0].get_content(metadata_mode=MetadataMode.EMBED),
)

Step 6: Create Nodes and Embeddings

from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import MetadataMode

# A large chunk size keeps most listing descriptions within a single node
parser = SentenceSplitter(chunk_size=5000)
nodes = parser.get_nodes_from_documents(llama_documents)

for node in nodes:
    node_embedding = embed_model.get_text_embedding(
        node.get_content(metadata_mode=MetadataMode.EMBED)
    )
    node.embedding = node_embedding
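Before moving on to ingestion, it can help to confirm that every node received an embedding of the expected size; a small optional check:

# Optional sanity checks ahead of ingestion
print(f"Total nodes: {len(nodes)}")
print(f"Embedding dimensions: {len(nodes[0].embedding)}")  # expected: 256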

Step 7: MongoDB Vector Database Connection and Setup

Creating a database and collection within MongoDB is made simple with MongoDB Atlas.

1. First, register for a MongoDB Atlas account. Existing users can sign in to MongoDB Atlas.
2. Follow the instructions, selecting the Atlas UI as the procedure to deploy your first cluster.
3. Create the database `airbnb`.
4. Within the `airbnb` database, create the collection `listings_reviews`.
5. Create a vector search index named `vector_index` for the `listings_reviews` collection. This index enables the RAG application to retrieve records as additional context to supplement user queries via vector search. The JSON definition of the index follows.
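The definition below is a sketch consistent with this guide's setup; it assumes the default `embedding` field path used by LlamaIndex's MongoDBAtlasVectorSearch, the 256-dimensional embeddings configured in Step 4, and cosine similarity.

{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 256,
      "similarity": "cosine"
    }
  ]
}

With the database, collection, and index in place, connect to the cluster from Python: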

import pymongo
# userdata reads secrets stored in Google Colab; outside Colab, read from os.environ instead
from google.colab import userdata

def get_mongo_client(mongo_uri):
  """Establish connection to the MongoDB."""
  try:
    client = pymongo.MongoClient(mongo_uri)
    print("Connection to MongoDB successful")
    return client
  except pymongo.errors.ConnectionFailure as e:
    print(f"Connection failed: {e}")
    return None

mongo_uri = userdata.get('MONGO_URI')
if not mongo_uri:
  print("MONGO_URI not set as a Colab secret")

mongo_client = get_mongo_client(mongo_uri)

DB_NAME="movies"
COLLECTION_NAME="movies_records"

db = mongo_client[DB_NAME]
collection = db[COLLECTION_NAME]

The following line ensures the collection starts empty by executing a delete_many() operation, removing any documents left over from previous runs.

collection.delete_many({})

Step 8: Data Ingestion

from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name=DB_NAME,
    collection_name=COLLECTION_NAME,
    index_name="vector_index",
)
vector_store.add(nodes)
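To confirm the ingestion succeeded, you can count the documents now present in the collection with a plain PyMongo call (an optional check):

# Verify the nodes were written to MongoDB
doc_count = collection.count_documents({})
print(f"Documents in {DB_NAME}.{COLLECTION_NAME}: {doc_count}")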

Step 9: Querying the Index with User Queries

from llama_index.core import VectorStoreIndex
import pprint
from llama_index.core.response.notebook_utils import display_response

index = VectorStoreIndex.from_vector_store(vector_store)

query_engine = index.as_query_engine(similarity_top_k=3)

query = "I want to stay in a place that's warm and friendly, and not too far from resturants, can you recommend a place? Include a reason as to why you've chosen your selection"

response = query_engine.query(query)
display_response(response)
pprint.pprint(response.source_nodes)
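Each entry in response.source_nodes pairs a retrieved chunk with its similarity score, which is useful for judging retrieval quality; a short inspection sketch:

# Inspect what was retrieved and how strongly each chunk matched
for source_node in response.source_nodes:
    print(f"Score: {source_node.score:.4f}")
    print(source_node.node.get_content()[:200], "\n")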

Conclusion

By following the steps outlined in this guide, you can develop your own RAG system with LlamaIndex and MongoDB, enhancing the capabilities of your applications to provide contextually relevant and accurate responses. This integration not only streamlines the development process but also ensures scalability and efficiency in handling large datasets. Embrace the power of RAG systems to revolutionize your AI applications and deliver enhanced user experiences.

Stay connected and support my work through various platforms:

Medium: You can read my latest articles and insights on Medium at https://medium.com/@andysingal

Paypal: Enjoyed my article? Buy me a coffee! https://paypal.me/alphasingal?country.x=US&locale.x=en_US

Requests and questions: If you have a project in mind that you’d like me to work on or if you have any questions about the concepts I’ve explained, don’t hesitate to let me know. I’m always looking for new ideas for future Notebooks and I love helping to resolve any doubts you might have.
