---
title: 🧠 Personal AI Second Brain
emoji: 🤗
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: mit
---
# 🧠 Personal AI Second Brain

A personalized AI assistant that serves as your second brain, built with Hugging Face, Streamlit, and Telegram integration. It stores and retrieves information from your documents, conversations, and notes through a Retrieval-Augmented Generation (RAG) pipeline.
## Features
- Chat Interface: Ask questions and get answers based on your personal knowledge base
- Document Management: Upload and process documents (PDF, TXT, DOC, etc.)
- RAG System: Retrieve relevant information from your knowledge base
- Telegram Integration: Access your second brain through Telegram
- Persistent Chat History: Store conversations in Hugging Face Datasets
- Expandable: Easy to add new data sources and functionalities
## Architecture
The system is built with the following components:
- LLM Layer: Uses Hugging Face models for text generation and embeddings
- Memory Layer: Vector database (Qdrant) for storing and retrieving information
- RAG System: Retrieval-Augmented Generation to ground answers in your data
- Ingestion Pipeline: Process documents and chat history
- Telegram Bot: Integration with Telegram for chat-based access
- Hugging Face Dataset: Persistent storage for chat history
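As a rough illustration of how the memory and RAG layers fit together, the sketch below embeds text chunks with sentence-transformers and stores and searches them in a local Qdrant collection. The model, path, and collection names mirror the defaults in the `.env` example further down; the actual modules under `app/` may organize this differently, and `ingest`/`retrieve` are illustrative names.

```python
# Sketch of the memory + RAG layers: embed chunks, store and search them in Qdrant.
# Names mirror the .env defaults below; the real app/ modules may differ.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
client = QdrantClient(path="./data/vector_db")  # file-backed local storage

# Create (or reset) the collection; MiniLM-L6 embeddings are 384-dimensional.
client.recreate_collection(
    collection_name="personal_assistant",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def ingest(chunks: list[str]) -> None:
    """Embed text chunks and upsert them into the vector database."""
    vectors = embedder.encode(chunks).tolist()
    points = [
        PointStruct(id=i, vector=vec, payload={"text": text})
        for i, (vec, text) in enumerate(zip(vectors, chunks))
    ]
    client.upsert(collection_name="personal_assistant", points=points)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the query."""
    hits = client.search(
        collection_name="personal_assistant",
        query_vector=embedder.encode(query).tolist(),
        limit=k,
    )
    return [hit.payload["text"] for hit in hits]
```

Generation then amounts to passing the retrieved chunks to the LLM as context alongside the user's question.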
## Setup

### Requirements
- Python 3.8+
- Hugging Face account (for model access and hosting)
- Telegram account (for bot integration, optional)
### Installation

Clone the repository:

```bash
git clone <repository-url>
cd personal-ai-second-brain
```

Install dependencies:

```bash
pip install -r requirements.txt
```
Create a `.env` file with your configuration (a sketch of loading these settings appears after the installation steps):

```env
# API Keys
HF_API_KEY=your_huggingface_api_key_here
TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here

# LLM Configuration
LLM_MODEL=gpt2  # Use small model for Hugging Face Spaces
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Vector Database
VECTOR_DB_PATH=./data/vector_db
COLLECTION_NAME=personal_assistant

# Application Settings
DEFAULT_TEMPERATURE=0.7
CHUNK_SIZE=512
CHUNK_OVERLAP=128
MAX_TOKENS=256

# Telegram Bot Settings
TELEGRAM_ENABLED=false
TELEGRAM_ALLOWED_USERS=  # Comma-separated list of Telegram user IDs

# Hugging Face Dataset Settings
HF_DATASET_NAME=username/second-brain-history  # Your username/dataset-name
CHAT_HISTORY_DIR=./data/chat_history
SYNC_INTERVAL=60  # How often to sync history to HF (minutes)
```
Create necessary directories:

```bash
mkdir -p data/documents data/vector_db data/chat_history
```
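For reference, here is a minimal sketch of loading the `.env` settings above at startup; `python-dotenv` is an assumption, and the repo may use a different configuration mechanism.

```python
# Minimal sketch of reading the .env settings at startup (python-dotenv assumed).
import os
from dotenv import load_dotenv

load_dotenv()  # read .env into os.environ

LLM_MODEL = os.getenv("LLM_MODEL", "gpt2")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "512"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "128"))
TELEGRAM_ENABLED = os.getenv("TELEGRAM_ENABLED", "false").lower() == "true"
ALLOWED_USERS = [u.strip() for u in os.getenv("TELEGRAM_ALLOWED_USERS", "").split(",") if u.strip()]
```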
### Running Locally
Start the application:

```bash
streamlit run app/ui/streamlit_app.py
```
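To get a feel for what the entry point does, here is a self-contained sketch of a chat page in the same spirit; the real `app/ui/streamlit_app.py` is more complete, and `retrieve()` below is only a stand-in for the Qdrant lookup sketched under Architecture.

```python
# Self-contained sketch of a chat page in the spirit of app/ui/streamlit_app.py.
import streamlit as st

def retrieve(query: str) -> list[str]:
    # Placeholder: the real app queries the vector database here.
    return [f"(context retrieved for: {query})"]

st.title("🧠 Personal AI Second Brain")

if "history" not in st.session_state:
    st.session_state.history = []

# Replay previous turns so the conversation persists across reruns.
for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if prompt := st.chat_input("Ask your second brain..."):
    st.chat_message("user").write(prompt)
    answer = "Here is what I found:\n\n" + "\n".join(retrieve(prompt))
    st.chat_message("assistant").write(answer)
    st.session_state.history += [("user", prompt), ("assistant", answer)]
```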
## Deploying to Hugging Face Spaces
- Create a new Space on Hugging Face
- Upload the code to the Space
- Set the environment variables in the Space settings
- The application will automatically start
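If you prefer not to upload the files by hand, the `huggingface_hub` client can create the Space and push the project folder programmatically; the Space id below is a placeholder.

```python
# Programmatic alternative to uploading through the web UI.
from huggingface_hub import HfApi

api = HfApi(token="your_huggingface_api_key_here")

# Create the Space if it does not exist yet, then push the project folder.
api.create_repo("username/personal-ai-second-brain", repo_type="space",
                space_sdk="docker", exist_ok=True)
api.upload_folder(folder_path=".", repo_id="username/personal-ai-second-brain",
                  repo_type="space")
```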
## Telegram Bot Setup
- Talk to @BotFather on Telegram
- Use the `/newbot` command to create a new bot
- Get your bot token and add it to your `.env` file
- Set `TELEGRAM_ENABLED=true` in your `.env` file
- To find your Telegram user ID (for restricting access), talk to @userinfobot
### Telegram Commands
- `/start`: Start a conversation with the bot
- `/help`: Show available commands
- `/search`: Use `/search your query` to search your knowledge base
- Direct messages: Send any message to chat with your second brain
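The handlers behind these commands could be wired roughly as follows with `python-telegram-bot` (v20+ API). This is a sketch rather than the bot module shipped in this repo; the knowledge-base call is left as a comment.

```python
# Sketch of command wiring with python-telegram-bot (v20+).
import os
from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, ContextTypes

ALLOWED = {u.strip() for u in os.getenv("TELEGRAM_ALLOWED_USERS", "").split(",") if u.strip()}

def authorized(update: Update) -> bool:
    """Allow everyone when the list is empty, otherwise only listed user IDs."""
    return not ALLOWED or str(update.effective_user.id) in ALLOWED

async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    if authorized(update):
        await update.message.reply_text("Hi! Ask me anything, or use /search <query>.")

async def search(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    if not authorized(update):
        return
    query = " ".join(context.args)
    # ...call the same retrieval code the Streamlit app uses and format the hits...
    await update.message.reply_text(f"Searching your knowledge base for: {query}")

app = ApplicationBuilder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
app.add_handler(CommandHandler("start", start))
app.add_handler(CommandHandler("search", search))
app.run_polling()
```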
## Hugging Face Dataset Integration
To enable persistent chat history across deployments:
- Create a private dataset repository on Hugging Face Hub
- Set your API token in the `.env` file as `HF_API_KEY`
- Set your dataset name as `HF_DATASET_NAME` (format: `username/repo-name`)
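One way the periodic sync might push chat history to that dataset repository is with the `datasets` library, as sketched below; the field names and exact sync logic are assumptions, not the repo's actual implementation.

```python
# Sketch of a sync step that pushes chat history to the repo named in HF_DATASET_NAME.
import os
from datasets import Dataset

history = [
    {"role": "user", "content": "What did I save about Qdrant?", "timestamp": "2024-01-01T12:00:00"},
    {"role": "assistant", "content": "Your notes say embeddings are stored locally.", "timestamp": "2024-01-01T12:00:05"},
]

Dataset.from_list(history).push_to_hub(
    os.environ["HF_DATASET_NAME"],  # e.g. username/second-brain-history
    private=True,
    token=os.environ["HF_API_KEY"],
)
```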
## Customization

### Using Different Models
You can change the models by updating the `.env` file:

```env
LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2
```
### Adding Custom Tools
To add custom tools to your agent, modify the `app/core/agent.py` file to include additional functionality.
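Because the agent's exact tool interface isn't documented here, the snippet below is purely illustrative: a self-contained tool function and a registry dictionary of the kind an agent loop could dispatch to.

```python
# Purely illustrative: the actual interface in app/core/agent.py may look different.
from datetime import datetime

def current_date_tool(_query: str) -> str:
    """Example tool: return today's date so the agent can answer time-related questions."""
    return datetime.now().strftime("%Y-%m-%d")

TOOLS = {"current_date": current_date_tool}  # hypothetical name -> callable registry
```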
## Roadmap
- Web search tool integration
- Calendar and email integration
- Voice interface
- Mobile app integration
- Fine-tuning for personalized responses
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Created by p3rc03