Google Colab sample - RAG use
Hi,
This model should fit in a free Google Colab instance.
Is there a sample showing how to use it, especially combined with a vector store for RAG?
Hi,
I managed to create a sample with chromadb. Here are the most important parts:
# Fix for Hugging Face embeddings with the latest chromadb version
import chromadb
from chromadb import Documents, EmbeddingFunction, Embeddings
from chromadb.utils import embedding_functions

class MyEmbeddingFunction(EmbeddingFunction[Documents]):
    # chromadb invokes the embedding function through __call__
    def __call__(self, input: Documents) -> Embeddings:
        sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="intfloat/multilingual-e5-large")
        embeddings = sentence_transformer_ef(input)
        return embeddings

custom = MyEmbeddingFunction()
# Initialize the chromadb directory and client.
chroma_client = chromadb.PersistentClient(path="./chroma/RAG_DB")
chroma_collection = chroma_client.get_or_create_collection(name="ProductsRag", embedding_function=custom)
I can add products to the DB, and since this takes some time, it looks like it is using the correct sentence transformer:
chroma_collection.add(
    documents=products_list,
    ids=ids
)
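In case it helps, here is a minimal sketch of how `products_list` and `ids` could be prepared. It assumes the products are plain dicts with hypothetical `name` and `description` fields, which is not necessarily how my data looks. The helpers `build_documents` and `batched` are made-up names for illustration only. Batching is optional, but it keeps each `add()` call small when there are many products.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical product records; the real data source is not shown in this thread.
products = [
    {"name": "Kettle", "description": "1.7 l electric kettle"},
    {"name": "Toaster", "description": "two-slice toaster"},
    {"name": "Mug", "description": "ceramic mug"},
]


def build_documents(records: List[Dict[str, str]]) -> Tuple[List[str], List[str]]:
    """Turn records into parallel lists of document texts and unique string ids."""
    documents = [f"{r['name']}: {r['description']}" for r in records]
    ids = [f"product-{n}" for n in range(len(records))]
    return documents, ids


def batched(
    documents: List[str], ids: List[str], batch_size: int
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (documents, ids) slices of at most batch_size items each."""
    for start in range(0, len(documents), batch_size):
        end = start + batch_size
        yield documents[start:end], ids[start:end]


products_list, ids = build_documents(products)
batches = list(batched(products_list, ids, batch_size=2))
# Each (docs, doc_ids) pair in batches could then be passed to
# chroma_collection.add(documents=docs, ids=doc_ids).
```

The ids only need to be unique strings within the collection, so any stable scheme (for example a product SKU) would work just as well.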
Then I can successfully query chromadb:
queryStr = "Some question to ask"
results = chroma_collection.query(
    query_texts=queryStr,
    n_results=10,
    include=['documents', 'distances']
)
# results["documents"] holds one list of documents per query text
print(len(results["documents"][0]))
for i, value in enumerate(results["documents"][0], start=1):
    print(f"DOC {i}:")
    print(value)
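For the RAG part of the question, the retrieved documents still have to be put into a prompt for the model. Below is a rough sketch of that step, not a tested recipe for this particular model. The function `build_rag_prompt`, the prompt wording, and the sample data are all my own assumptions. The only thing taken from the code above is the shape of `results`, where `results["documents"][0]` is the list of documents for the first query.

```python
from typing import Dict, List


def build_rag_prompt(question: str, results: Dict[str, List[list]], max_docs: int = 3) -> str:
    """Format the top retrieved documents and the question into one prompt string."""
    documents = results["documents"][0][:max_docs]
    context = "\n".join(
        f"DOC {n}: {doc}" for n, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Stand-in for the dict returned by chroma_collection.query(); values are made up.
sample_results = {
    "documents": [[
        "Kettle: 1.7 l electric kettle",
        "Toaster: two-slice toaster",
        "Mug: ceramic mug",
    ]],
    "distances": [[0.21, 0.35, 0.48]],
}

prompt = build_rag_prompt("Which product boils water?", sample_results, max_docs=2)
print(prompt)
```

The resulting string can then be passed to whatever generation call the model is loaded with. That call is left out here because it depends on how the model is set up in the notebook.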