---
license: apache-2.0
title: ChanceRAG
sdk: gradio
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk_version: 4.44.0
---

# **Chance RAG Documentation**

## **Overview:**

ChanceRAG is a Retrieval-Augmented Generation (RAG) application that processes documents (such as PDF and Word files) and retrieves the information needed to answer user queries with detailed, accurate responses. The system combines several retrieval techniques, including vector embeddings, Annoy, BM25, and Word2Vec, and applies an advanced fusion method for reranking. It integrates with Mistral's embedding model to generate embeddings and uses Annoy for efficient approximate nearest-neighbor retrieval based on angular distance.

## **Data Flow:**

![Data Flow](images/Data_Flow.png)

### **1. Document Processing and Embedding Storage:**

* The user uploads a document (PDF), which is split into smaller chunks to stay within token limits.
* Text from each page is extracted, chunked, and transformed into embeddings using the Mistral embedding model.
* These embeddings are then stored in a vector database.

### **2. Query Handling and Retrieval:**

* Upon receiving a query, the system embeds the query and applies several retrieval methods, including Annoy, BM25, and Word2Vec, to fetch the most relevant document chunks.

### **3. Re-ranking and Fusion:**

* Retrieved document chunks are re-ranked using advanced fusion retrieval.
* The highest-ranked results are used to generate the final response.

### **4. Response Generation:**

* Based on the retrieved context, ChanceRAG generates detailed, tailored responses using the Mistral AI API.
* Users can customize the response style (e.g., Detailed, Concise, Creative, or Technical).

## **Components of the ChanceRAG System:**

### **1. Document Processor:**

* **store_embeddings_in_vector_db**: Processes a PDF and stores its chunk embeddings in the vector database.
* **split_text_into_chunks**: Splits extracted text into manageable chunks.

### **2. MistralRAGChatbot:**

* **generate_response_with_rag**: Manages the entire RAG pipeline, including retrieval, reranking, and response generation.
* **retrieve_documents**: Fetches relevant document chunks using the different retrieval methods.
* **rerank_documents**: Improves retrieval relevance with reranking algorithms.

### **3. Retrieval Engine:**

* Uses Annoy, BM25, and Word2Vec to identify relevant content.

**Retrieval Methods Comparison:**

| Method | Speed | Accuracy | Memory Usage |
| :---: | :---: | :---: | :---: |
| Annoy | Fast | Good | Low |
| BM25 | Fast | Good | Low |
| Word2Vec | Slow | Good | High |

### **4. Reranking Engine:**

* Applies the **advanced fusion** reranking method to ensure the most relevant documents are prioritized.
* The advanced_fusion mechanism combines multiple retrieval methods (Annoy for approximate nearest neighbors and BM25 for traditional lexical ranking) to rank documents more effectively.
* A similarity graph (sim_graph) is built from the cosine similarities between document embeddings: for any document pair whose cosine similarity exceeds 0.5, an edge is created with a weight equal to the similarity score. The PageRank algorithm is then applied to this graph to score documents by their relative importance in the network.
* The fusion step combines the vector scores (Annoy), BM25 scores, and PageRank scores into a single score, weighting them 0.5, 0.3, and 0.2 respectively.
* The combined scores are min-max normalized to the range 0 to 1 (the minimum score maps to 0 and the maximum to 1) to make them interpretable and comparable.
* The documents are sorted by combined score in descending order, and the top 5 are returned with their text.
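The fusion steps above can be sketched in pure Python. The function and helper names (`advanced_fusion`, `pagerank`, `cosine`) and the return of indices rather than document text are illustrative assumptions; the 0.5/0.3/0.2 weights, the 0.5 similarity threshold, the PageRank scoring, min-max normalization, and top-5 cutoff follow the description above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def pagerank(adj, damping=0.85, iters=50):
    """Power-iteration PageRank over a weighted adjacency matrix.

    Dangling nodes (no edges) keep the baseline (1 - damping) / n score.
    """
    n = len(adj)
    out = [sum(row) for row in adj]
    ranks = [1.0 / n] * n
    for _ in range(iters):
        ranks = [
            (1 - damping) / n
            + damping * sum(ranks[i] * adj[i][j] / out[i] for i in range(n) if out[i])
            for j in range(n)
        ]
    return ranks

def advanced_fusion(vector_scores, bm25_scores, embeddings, top_k=5,
                    w_vec=0.5, w_bm25=0.3, w_pr=0.2, sim_threshold=0.5):
    """Fuse Annoy (vector), BM25, and PageRank scores; return top-k doc indices."""
    n = len(embeddings)
    # Build the similarity graph: edge weight = cosine similarity when > 0.5.
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(embeddings[i], embeddings[j])
            if sim > sim_threshold:
                adj[i][j] = adj[j][i] = sim
    pr = pagerank(adj)
    # Weighted combination: 0.5 vector + 0.3 BM25 + 0.2 PageRank.
    combined = [w_vec * vector_scores[i] + w_bm25 * bm25_scores[i] + w_pr * pr[i]
                for i in range(n)]
    # Min-max normalize to [0, 1] so scores are comparable across queries.
    lo, hi = min(combined), max(combined)
    norm = [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in combined]
    return sorted(range(n), key=lambda i: norm[i], reverse=True)[:top_k]
```

In this toy setup, documents 0 and 1 have similar embeddings (so they reinforce each other via PageRank), while document 2 is an isolated node that falls to the bottom of the ranking.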
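Returning to the document-processing stage, a minimal sketch of what the chunking step (**split_text_into_chunks**) might look like is shown below. The word-based splitting, chunk size, and overlap are assumptions for illustration, not the actual implementation; real token limits depend on the Mistral embedding model's tokenizer.

```python
def split_text_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Hypothetical sketch: split text into word-based chunks with overlap.

    Overlapping windows help preserve context that would otherwise be cut
    at chunk boundaries before embedding.
    """
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(words):
            break  # the last window already covers the end of the text
    return chunks
```

Each resulting chunk would then be embedded (e.g., with the Mistral embedding model) and written to the vector database.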
**Advanced Fusion Flow:**

![Advanced Fusion Flow](images/Advanced_Fusion_Retrieval_Flowchart.png)

### **5. Response Generator:**

* The **build_prompt** function constructs prompts for the Mistral AI model to generate human-like responses based on the retrieved and reranked content.

## **User Interface:**

Built with Gradio, the interface offers easy interaction with the system. Components include:

* File Upload
* User Query Input
* Response Style Selection
* Response Display

![Landing Page](images/Landing_Page.png)
![Response_Style](images/Response_Style.png)
![RAG Response](images/RAG_Response.png)

## **Performance Optimization:**

To enhance performance:

* Implement caching for embeddings and frequently retrieved documents.
* Use parallel processing for retrieval and reranking.

## **Metrics:**

* **Precision Score:** The system achieved an average precision score of 80%, reflecting its accuracy in identifying relevant results.
* **Hit Rate:** The hit-rate check evaluated to true, indicating that correct answers were found within the system's outputs.
* **NDCG (Normalized Discounted Cumulative Gain):** Ranking quality was evaluated at a cutoff of 5 (NDCG@5), measuring how well the system ranked relevant information.
* **Hallucination:** No hallucinations were detected, confirming that the system provided factual and relevant responses.
* **Correctness:** The correctness of the system's outputs was validated as true, ensuring reliability in the information provided.

## **Error Handling and Logging:**

The system includes error handling and logging mechanisms to ensure robustness and facilitate debugging. Key aspects:

* Exception handling in critical functions
* Logging of important events and errors
* User-friendly error messages in the interface

## **Future Enhancements:**

* Support for multi-document processing.
* Dynamic model selection based on query complexity.
* User feedback integration to improve retrieval and reranking.
* Multilingual support for handling queries in different languages.
* Advanced analytics to track performance and usage.

## **Conclusion:**

The Chance RAG system provides a powerful and flexible solution for context-aware question answering over documents. By combining multiple retrieval and reranking methods with the language capabilities of the Mistral AI API, it delivers relevant, coherent responses to user queries.

This documentation gives an overview of the system architecture, components, and processes. For implementation details, refer to the inline comments and docstrings in the code.
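As a closing illustration, the prompt-construction step (**build_prompt**) described in the Response Generator section might look like the following sketch. The style wording, the `STYLE_INSTRUCTIONS` table, and the exact function signature are assumptions for illustration; only the function name and the four response styles come from the documentation above.

```python
# Illustrative style hints; the actual instructions used by ChanceRAG may differ.
STYLE_INSTRUCTIONS = {
    "Detailed": "Answer thoroughly, covering all relevant points.",
    "Concise": "Answer in a few short sentences.",
    "Creative": "Answer in an engaging, imaginative tone.",
    "Technical": "Answer precisely, using domain terminology.",
}

def build_prompt(query: str, contexts: list[str], style: str = "Detailed") -> str:
    """Assemble retrieved chunks, the user query, and a style hint into one prompt."""
    # Number the reranked chunks so the model can ground its answer in them.
    context_block = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    instruction = STYLE_INSTRUCTIONS.get(style, STYLE_INSTRUCTIONS["Detailed"])
    return (
        "Answer the question using only the context below.\n"
        f"Style: {instruction}\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string would be sent to the Mistral AI chat API, with the selected response style coming from the Gradio "Response Style Selection" component.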