---
title: FinAgent - Autonomous Financial AI
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: true
license: mit
---
# FinAgent: Autonomous Financial AI

An asynchronous, multi-agent LLM pipeline that automates quantitative financial research, fundamental document synthesis, earnings-call analysis, and real-time news sentiment scoring, built entirely with open-source models.
## Architecture
This system uses a deterministic state-machine architecture powered by LangGraph:
- Planner Agent – Parses the user query and generates a strict JSON task queue.
- Supervisor – A Python-controlled router that dispatches tasks to specialist agents.
- Specialist Agents:
  - Quant Agent – Live pricing, volume, and volatility metrics via `yfinance`.
  - Fundamental Agent – SEC XBRL accounting data + RAG on 10-K filings.
  - Sentiment Agent – Real-time news headline analysis and scoring.
  - Earnings Agent – Sentiment divergence (Prepared Remarks vs Q&A) and keyword trend tracking from earnings-call transcripts.
- Summarizer – Compiles all agent outputs into a unified Investment Memo.
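The real graph is driven by LangGraph, but the planner-to-supervisor flow can be sketched in plain Python. Everything below — the agent names, the task-queue shape, and the handler signatures — is illustrative, not the project's actual code:

```python
import json

# Hypothetical stand-ins for the specialist agents; the real ones call
# LLMs and live market-data APIs.
def quant_agent(task: dict) -> dict:
    return {"agent": "quant", "result": f"price metrics for {task['ticker']}"}

def sentiment_agent(task: dict) -> dict:
    return {"agent": "sentiment", "result": f"news sentiment for {task['ticker']}"}

AGENTS = {"quant": quant_agent, "sentiment": sentiment_agent}

def supervisor(plan_json: str) -> list[dict]:
    """Dispatch each task in the planner's strict JSON queue to its specialist."""
    tasks = json.loads(plan_json)["tasks"]
    outputs = [AGENTS[t["agent"]](t) for t in tasks]
    # A summarizer step would compile `outputs` into the final Investment Memo.
    return outputs

plan = '{"tasks": [{"agent": "quant", "ticker": "AAPL"},' \
       ' {"agent": "sentiment", "ticker": "AAPL"}]}'
memo_inputs = supervisor(plan)
```

Because the planner emits structured JSON rather than free text, routing stays deterministic: an unknown agent name fails loudly at the dictionary lookup instead of silently drifting.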
## Try It

Type a query in the chat box; here are some examples:
| Query | What It Does |
|---|---|
| "How is Apple's stock doing?" | Quant analysis (price, volume, RSI) |
| "What are the manufacturing risks in Tesla's latest 10-K?" | RAG retrieval on SEC filings |
| "What is the market sentiment on Microsoft?" | Real-time news sentiment scoring |
| "Analyze the latest earnings call for AAPL β compare management tone in prepared remarks vs Q&A" | Earnings-call divergence analysis |
| "Compare the current stock performance of Microsoft and Google" | Multi-ticker parallel analysis |
## Pre-Loaded Data
This demo comes with pre-ingested data for immediate use:
- SEC 10-K Filings: AAPL, MSFT, TSLA, GOOGL, NVDA
- Earnings Call Transcripts: AAPL, MSFT (Q4-2024, Q1-2025)
Quantitative data (prices, volume) and sentiment (news) are fetched live; no pre-loading is needed.
## Tech Stack
| Component | Technology |
|---|---|
| Orchestration | LangGraph / LangChain |
| LLM Inference | Groq API (Llama-3.1-8B-Instruct) |
| Frontend | Streamlit |
| Backend API | FastAPI + Uvicorn |
| Vector DB | ChromaDB |
| Embeddings | HuggingFace all-MiniLM-L6-v2 |
| Market Data | yfinance, SEC EDGAR API |
## Performance Optimizations
This system was deliberately engineered for low-latency response times:
- Parallel Agent Dispatch – The Supervisor routes independent tasks to multiple specialist agents simultaneously (e.g., Quant + Sentiment + Fundamental in one batch) rather than sequentially, cutting multi-agent latency by up to 3×.
- Server-Sent Events (SSE) Streaming – Results stream live to the UI as each agent completes, so users see intermediate progress immediately instead of waiting for the full pipeline.
- Groq Cloud Inference – LLM calls use the Groq API (~200 tok/s on Llama-3.1-8B), eliminating local GPU bottlenecks and delivering sub-second per-agent response times.
- Singleton Embedding Cache – The HuggingFace embedding model is loaded once via `@lru_cache` and shared across all RAG queries (10-K, earnings, etc.), avoiding repeated 500 MB+ model re-initialization.
- Token Budget Tuning – `max_tokens` is capped at 800 per LLM call to prevent Groq from reserving excessive context window, reducing queue wait times by ~40%.
- Pre-Seeded Vector DB – ChromaDB collections are embedded at Docker build time, so the app starts with zero cold-start ingestion delay.
- Per-Step Latency Tracking – Every agent step reports wall-clock latency in the UI, making performance bottlenecks immediately visible.
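Parallel dispatch and per-step latency tracking combine naturally with `asyncio`. The sketch below uses stand-in agents (the sleeps simulate network/LLM calls; the real agents hit Groq and `yfinance`) to show why concurrent dispatch makes total wait roughly the slowest agent rather than the sum of all agents:

```python
import asyncio
import time

# Illustrative stand-ins for specialist agents; delays simulate I/O latency.
async def quant(ticker: str) -> dict:
    await asyncio.sleep(0.05)
    return {"agent": "quant", "ticker": ticker}

async def sentiment(ticker: str) -> dict:
    await asyncio.sleep(0.05)
    return {"agent": "sentiment", "ticker": ticker}

async def timed(name: str, coro) -> dict:
    """Wrap an agent call with wall-clock latency, as the UI reports per step."""
    start = time.perf_counter()
    result = await coro
    return {"step": name,
            "latency_s": round(time.perf_counter() - start, 3),
            "result": result}

async def dispatch(ticker: str) -> list[dict]:
    # Independent tasks run concurrently in one batch.
    return await asyncio.gather(
        timed("quant", quant(ticker)),
        timed("sentiment", sentiment(ticker)),
    )

results = asyncio.run(dispatch("MSFT"))
```

`asyncio.gather` preserves the order of its arguments in the returned list, which keeps the streamed per-step results stable for the UI.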
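The singleton embedding cache is a one-decorator pattern. A minimal sketch, with a sentinel object standing in for the real `all-MiniLM-L6-v2` model (loading `sentence-transformers` here would defeat the point of a lightweight example):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_embedder():
    """Load the heavy embedding model exactly once per process.

    In the real app this would construct the HuggingFace
    all-MiniLM-L6-v2 model; a plain object stands in here.
    """
    return object()  # imagine: SentenceTransformer("all-MiniLM-L6-v2")

# Every RAG code path (10-K retrieval, earnings transcripts, ...) calls
# get_embedder(); after the first call, all of them get the same instance.
embedder = get_embedder()
```

Because `lru_cache(maxsize=1)` memoizes on the (empty) argument tuple, repeated calls are cheap dictionary hits, so the 500 MB+ initialization cost is paid once at first use instead of once per query.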