Financial RAG System
This repository documents a financial Retrieval-Augmented Generation (RAG) system for answering questions over SEC 10-K and 10-Q filings.
Live demo: https://huggingface.co/spaces/anasxs/financial-rag
GitHub: https://github.com/Anassbzdd/financial-RAG
System Overview
This is not a fine-tuned language model. It is a RAG system that combines document parsing, chunking, vector retrieval, keyword retrieval, reranking, and grounded answer generation.
The system answers questions using indexed SEC filings from:
- Apple
- Amazon
- Alphabet
- Berkshire Hathaway
- Johnson & Johnson
Architecture
SEC filings
-> LlamaParse markdown extraction
-> Recursive text chunking
-> BGE embeddings
-> ChromaDB vector search
-> BM25 keyword search
-> Reciprocal Rank Fusion
-> Cross-encoder reranking
-> Groq LLM generation
-> Gradio UI
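
The Reciprocal Rank Fusion step merges the ranked results from vector search and BM25 keyword search into a single candidate list before reranking. The sketch below is a minimal illustration, not the repository's actual code: the function name `reciprocal_rank_fusion`, the chunk IDs, and the constant `k = 60` (a commonly used default) are assumptions. Each chunk's score is the sum of `1 / (k + rank)` over every list in which it appears.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of chunk IDs into one ranked list.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk IDs returned by the two retrievers for one query.
vector_hits = ["aapl-10k-12", "amzn-10q-03", "googl-10k-07"]
bm25_hits = ["amzn-10q-03", "jnj-10k-21", "aapl-10k-12"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Chunks that both retrievers return rise to the top even when neither ranks them first. Because fusion uses only rank positions, it needs no normalization between embedding similarity and BM25 scores, which lie on different scales.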