---
title: Lithuanian Law RAG QA ChatBot Streamlit
emoji: 📊
colorFrom: green
colorTo: yellow
sdk: streamlit
sdk_version: 1.36.0
app_file: app.py
pinned: false
license: apache-2.0
---

# Chat with Lithuanian Law Documents

This Streamlit application lets users chat with a virtual assistant grounded in Lithuanian law documents, relying on local processing power and a compact language model.

## Important Disclaimer

This application uses a lightweight large language model (LLM), Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, to ensure smooth local processing on your device. While this model offers efficiency benefits, it comes with some limitations:

#### Potential for Hallucination:
Due to its size and training data, the model might occasionally generate responses that are not entirely consistent with the provided documents or with factual accuracy.

#### Character Misinterpretations:
In rare instances, the model may introduce nonsensical characters, including Chinese characters, during response generation.

We recommend keeping these limitations in mind when using the application and interpreting its responses.

## Features

- Users can choose the information retrieval type (similarity or maximum marginal relevance search).
- Users can specify the number of documents to retrieve.
- Users can ask questions about the provided documents.
- The virtual assistant answers based on the retrieved documents and a compact, locally run large language model (LLM).

## Technical Details

#### Sentence Similarity:
The application uses the Alibaba-NLP/gte-base-en-v1.5 model for sentence embedding, allowing semantic similarity comparisons between user queries and the legal documents.

#### Local Vector Store:
Chroma acts as a local vector store, efficiently storing and managing the document embeddings for fast retrieval.

#### RAG Chain with Quantized LLM:
A Retrieval-Augmented Generation (RAG) chain processes user queries. This chain integrates two key components:

#### Lightweight LLM:
To ensure local operation, the application employs a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters. This LLM is tuned for question-answering tasks.

#### Quantization:
This Qwen2 model is quantized, a technique that reduces model size without sacrificing significant accuracy. Quantization makes the model more efficient to run on local hardware, contributing to a more environmentally friendly solution.

#### CPU-based Processing:
The entire application currently runs on the CPU. While a GPU could significantly improve processing speed, the CPU-based approach lets the application run effectively on a wider range of devices.

## Benefits of Compact Design

#### Local Processing:
The compact size of the LLM and of the application itself enables local processing on your device, reducing reliance on cloud-based resources and the associated environmental impact.

#### Mobile Potential:
Due to its small footprint, this application could be adapted for mobile devices, bringing legal information access to a wider audience.

## Adaptability of Qwen2 0.5B

#### Fine-tuning:
While the Qwen2 0.5B model is powerful for its size, it can be further enhanced through fine-tuning on specific legal datasets or domains, potentially improving its understanding of Lithuanian legal terminology and nuances; a minimal fine-tuning sketch follows below.
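As a rough illustration of what such fine-tuning could look like, the sketch below runs a single supervised epoch with the Hugging Face `Trainer`. The base checkpoint `Qwen/Qwen2-0.5B-Instruct`, the file `lt_law_qa.jsonl`, and all hyperparameters are assumptions for the example, not part of this application, and the `datasets` library is required in addition to `transformers`. A fine-tuned model would also still need to be converted and re-quantized to GGUF (for example with llama.cpp's conversion scripts) before this app could load it.

```python
# Hypothetical fine-tuning sketch -- dataset path, base checkpoint, and
# hyperparameters are illustrative, not part of this application.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "Qwen/Qwen2-0.5B-Instruct"  # assumed upstream checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Assumes a local JSONL file with "question" and "answer" fields.
dataset = load_dataset("json", data_files="lt_law_qa.jsonl", split="train")

def render(example):
    # Use the model's chat template so the fine-tuned model keeps the
    # prompt format expected at inference time.
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(render, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen2-0.5b-lt-law",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
    # Causal-LM collator: pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```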
#### Conversation Style:
Depending on user needs and the desired conversation style, alternative pre-trained models could be explored, potentially offering a trade-off between model size and specific capabilities.

## Requirements

- Streamlit
- langchain
- langchain-community
- chromadb
- transformers

## Running the Application

1. Install the required libraries.
2. Set the environment variable `lang_api_key` to your LangChain API key (if applicable).
3. Run `streamlit run main.py`.

## Code Structure

- `create_retriever_from_chroma`: Creates a document retriever using Chroma and the Alibaba-NLP/gte-base-en-v1.5 model for sentence similarity.
- `main`: Defines the Streamlit application layout and functionality.
- `handle_userinput`: Processes user input, retrieves relevant documents, and generates a response with the compressed LLM and the retriever inside the RAG chain.
- `create_conversational_rag_chain`: Creates the RAG chain that answers user questions with the compressed LLM and the retriever.

A minimal sketch of how these pieces could fit together appears in the appendix at the end of this README.

## Additional Notes: Potential Improvements

While the application functions effectively, there is room for future enhancement:

#### Advanced Retrieval Techniques:
Explore more sophisticated retrieval methods beyond the current approach, such as self-corrective Retrieval-Augmented Generation (RAG), multi-vector RAG, or graph RAG, potentially leading to improved accuracy and more relevant results.

#### Expanded Data Sources:
Consider incorporating a wider range of data sources beyond the initial Lithuanian law documents, such as legal databases, relevant news articles, or judicial opinions. If no pertinent information is found within these sources, the application could fall back on web searches for a broader perspective.

#### GPU Acceleration:
Run the application on a GPU to leverage its processing power. This could significantly reduce response times and enhance the user experience.

#### Model Fine-tuning:
Explore fine-tuning the Qwen2-0.5B model on specific legal datasets or domains. This could significantly bolster its understanding of Lithuanian legal terminology and nuances, leading to more accurate and insightful responses.

#### Multi-agent Approach:
Consider adopting a multi-agent approach, integrating additional tools and functionality such as data visualization or legal document summarization to further enrich the user experience.

## Benefits of Future Advancements

#### Enhanced Accuracy:
Advanced retrieval techniques could provide more precise and relevant results to user queries.

#### Comprehensive Information Access:
Integrating additional data sources would broaden the information scope, offering users a more comprehensive picture.

#### Faster Response Times:
GPU support could significantly reduce processing times, leading to a more responsive application.

#### Improved Legal Understanding:
Fine-tuning the model would enhance its comprehension of Lithuanian legal concepts, leading to more accurate and informative responses.

#### Richer User Experience:
A multi-agent approach could introduce new functionality and data visualization tools, fostering a more interactive and informative experience.

## Conclusion

This application provides a solid foundation for legal information access. By exploring the improvements outlined above, its capabilities and user experience can be continuously enhanced.
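## Appendix: Minimal RAG Pipeline Sketch

The sketch below shows one way the components described under Technical Details and Code Structure could be wired together using langchain-community building blocks. It is an illustrative sketch, not the application's actual code: the `data/` folder, chunk sizes, prompt wording, GGUF file path, and example question are assumptions, and running it requires the `sentence-transformers` and `llama-cpp-python` packages in addition to the libraries listed under Requirements.

```python
# Illustrative sketch of the retrieval + generation pipeline described above.
# Paths, chunk sizes, prompt text, and the example question are assumptions.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import LlamaCpp
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

# 1. Load and chunk the law documents (assumes plain-text files in ./data).
docs = DirectoryLoader("data", glob="**/*.txt", loader_cls=TextLoader).load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# 2. Embed the chunks with gte-base-en-v1.5 and store them in a local Chroma DB.
embeddings = HuggingFaceEmbeddings(
    model_name="Alibaba-NLP/gte-base-en-v1.5",
    model_kwargs={"trust_remote_code": True},  # this embedding model ships custom code
)
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")

# 3. Build a retriever; "similarity" or "mmr" mirrors the search-type choice in the UI.
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 4})

# 4. Load the quantized GGUF model on CPU (requires llama-cpp-python).
llm = LlamaCpp(
    model_path="JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_0.gguf",  # adjust to your local file
    n_ctx=2048,
    temperature=0.1,
)

# 5. Wire everything into a simple RAG chain.
prompt = PromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def format_docs(retrieved_docs):
    return "\n\n".join(d.page_content for d in retrieved_docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("What are the grounds for terminating an employment contract?"))
```

Switching `search_type` between `"similarity"` and `"mmr"`, or changing `k`, mirrors the retrieval options the application exposes in its interface.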