---
title: 🧠 Personal AI Second Brain
emoji: 🤗
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: mit
---
# 🧠 Personal AI Second Brain

A personalized AI assistant that serves as your second brain, built with Hugging Face, Streamlit, and Telegram integration. It stores and retrieves information from your documents, conversations, and notes through a Retrieval-Augmented Generation (RAG) pipeline.
## Features
- Chat Interface: Ask questions and get answers based on your personal knowledge base
- Document Management: Upload and process documents (PDF, TXT, DOC, etc.)
- RAG System: Retrieve relevant information from your knowledge base
- Telegram Integration: Access your second brain through Telegram
- Persistent Chat History: Store conversations in Hugging Face Datasets
- Expandable: Easy to add new data sources and functionalities
## Architecture
The system is built with the following components:
- LLM Layer: Uses Hugging Face models for text generation and embeddings
- Memory Layer: Vector database (Qdrant) for storing and retrieving information
- RAG System: Retrieval-Augmented Generation to ground answers in your data
- Ingestion Pipeline: Process documents and chat history
- Telegram Bot: Integration with Telegram for chat-based access
- Hugging Face Dataset: Persistent storage for chat history
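As a rough illustration of how the memory and RAG layers fit together, the sketch below embeds text chunks with sentence-transformers and stores and searches them in a local Qdrant collection. The model, path, and collection names mirror the defaults in the `.env` example further down; the actual modules under `app/` may organize this differently, and `ingest`/`retrieve` are illustrative names.

```python
# Sketch of the memory + RAG layers: embed chunks, store and search them in Qdrant.
# Names mirror the .env defaults below; the real app/ modules may differ.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
client = QdrantClient(path="./data/vector_db")  # file-backed local storage

# Create (or reset) the collection; MiniLM-L6 embeddings are 384-dimensional.
client.recreate_collection(
    collection_name="personal_assistant",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def ingest(chunks: list[str]) -> None:
    """Embed text chunks and upsert them into the vector database."""
    vectors = embedder.encode(chunks).tolist()
    points = [
        PointStruct(id=i, vector=vec, payload={"text": text})
        for i, (vec, text) in enumerate(zip(vectors, chunks))
    ]
    client.upsert(collection_name="personal_assistant", points=points)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the query."""
    hits = client.search(
        collection_name="personal_assistant",
        query_vector=embedder.encode(query).tolist(),
        limit=k,
    )
    return [hit.payload["text"] for hit in hits]
```

Generation then amounts to passing the retrieved chunks to the LLM as context alongside the user's question.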
## Setup

### Requirements
- Python 3.8+
- Hugging Face account (for model access and hosting)
- Telegram account (for bot integration, optional)
### Installation

Clone the repository:

```bash
git clone <repository-url>
cd personal-ai-second-brain
```

Install dependencies:

```bash
pip install -r requirements.txt
```
Create a `.env` file with your configuration (a sketch of loading these settings appears after the installation steps):

```env
# API Keys
HF_API_KEY=your_huggingface_api_key_here
TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here

# LLM Configuration
LLM_MODEL=gpt2  # Use small model for Hugging Face Spaces
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Vector Database
VECTOR_DB_PATH=./data/vector_db
COLLECTION_NAME=personal_assistant

# Application Settings
DEFAULT_TEMPERATURE=0.7
CHUNK_SIZE=512
CHUNK_OVERLAP=128
MAX_TOKENS=256

# Telegram Bot Settings
TELEGRAM_ENABLED=false
TELEGRAM_ALLOWED_USERS=  # Comma-separated list of Telegram user IDs

# Hugging Face Dataset Settings
HF_DATASET_NAME=username/second-brain-history  # Your username/dataset-name
CHAT_HISTORY_DIR=./data/chat_history
SYNC_INTERVAL=60  # How often to sync history to HF (minutes)
```
Create necessary directories:

```bash
mkdir -p data/documents data/vector_db data/chat_history
```
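For reference, here is a minimal sketch of loading the `.env` settings above at startup; `python-dotenv` is an assumption, and the repo may use a different configuration mechanism.

```python
# Minimal sketch of reading the .env settings at startup (python-dotenv assumed).
import os
from dotenv import load_dotenv

load_dotenv()  # read .env into os.environ

LLM_MODEL = os.getenv("LLM_MODEL", "gpt2")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "512"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "128"))
TELEGRAM_ENABLED = os.getenv("TELEGRAM_ENABLED", "false").lower() == "true"
ALLOWED_USERS = [u.strip() for u in os.getenv("TELEGRAM_ALLOWED_USERS", "").split(",") if u.strip()]
```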
### Running Locally
Start the application:

```bash
streamlit run app/ui/streamlit_app.py
```
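To get a feel for what the entry point does, here is a self-contained sketch of a chat page in the same spirit; the real `app/ui/streamlit_app.py` is more complete, and `retrieve()` below is only a stand-in for the Qdrant lookup sketched under Architecture.

```python
# Self-contained sketch of a chat page in the spirit of app/ui/streamlit_app.py.
import streamlit as st

def retrieve(query: str) -> list[str]:
    # Placeholder: the real app queries the vector database here.
    return [f"(context retrieved for: {query})"]

st.title("🧠 Personal AI Second Brain")

if "history" not in st.session_state:
    st.session_state.history = []

# Replay previous turns so the conversation persists across reruns.
for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if prompt := st.chat_input("Ask your second brain..."):
    st.chat_message("user").write(prompt)
    answer = "Here is what I found:\n\n" + "\n".join(retrieve(prompt))
    st.chat_message("assistant").write(answer)
    st.session_state.history += [("user", prompt), ("assistant", answer)]
```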
## Deploying to Hugging Face Spaces
- Create a new Space on Hugging Face
- Upload the code to the Space
- Set the environment variables in the Space settings
- The application will automatically start
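If you prefer not to upload the files by hand, the `huggingface_hub` client can create the Space and push the project folder programmatically; the Space id below is a placeholder.

```python
# Programmatic alternative to uploading through the web UI.
from huggingface_hub import HfApi

api = HfApi(token="your_huggingface_api_key_here")

# Create the Space if it does not exist yet, then push the project folder.
api.create_repo("username/personal-ai-second-brain", repo_type="space",
                space_sdk="docker", exist_ok=True)
api.upload_folder(folder_path=".", repo_id="username/personal-ai-second-brain",
                  repo_type="space")
```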
## Telegram Bot Setup
- Talk to @BotFather on Telegram
- Use the `/newbot` command to create a new bot
- Get your bot token and add it to your `.env` file
- Set `TELEGRAM_ENABLED=true` in your `.env` file
- To find your Telegram user ID (for restricting access), talk to @userinfobot
### Telegram Commands
- `/start`: Start a conversation with the bot
- `/help`: Show available commands
- `/search`: Use `/search your query` to search your knowledge base
- Direct messages: Send any message to chat with your second brain
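The handlers behind these commands could be wired roughly as follows with `python-telegram-bot` (v20+ API). This is a sketch rather than the bot module shipped in this repo; the knowledge-base call is left as a comment.

```python
# Sketch of command wiring with python-telegram-bot (v20+).
import os
from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, ContextTypes

ALLOWED = {u.strip() for u in os.getenv("TELEGRAM_ALLOWED_USERS", "").split(",") if u.strip()}

def authorized(update: Update) -> bool:
    """Allow everyone when the list is empty, otherwise only listed user IDs."""
    return not ALLOWED or str(update.effective_user.id) in ALLOWED

async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    if authorized(update):
        await update.message.reply_text("Hi! Ask me anything, or use /search <query>.")

async def search(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    if not authorized(update):
        return
    query = " ".join(context.args)
    # ...call the same retrieval code the Streamlit app uses and format the hits...
    await update.message.reply_text(f"Searching your knowledge base for: {query}")

app = ApplicationBuilder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
app.add_handler(CommandHandler("start", start))
app.add_handler(CommandHandler("search", search))
app.run_polling()
```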
## Hugging Face Dataset Integration
To enable persistent chat history across deployments:
- Create a private dataset repository on Hugging Face Hub
- Set your API token in the `.env` file as `HF_API_KEY`
- Set your dataset name as `HF_DATASET_NAME` (format: `username/repo-name`)
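One way the periodic sync might push chat history to that dataset repository is with the `datasets` library, as sketched below; the field names and exact sync logic are assumptions, not the repo's actual implementation.

```python
# Sketch of a sync step that pushes chat history to the repo named in HF_DATASET_NAME.
import os
from datasets import Dataset

history = [
    {"role": "user", "content": "What did I save about Qdrant?", "timestamp": "2024-01-01T12:00:00"},
    {"role": "assistant", "content": "Your notes say embeddings are stored locally.", "timestamp": "2024-01-01T12:00:05"},
]

Dataset.from_list(history).push_to_hub(
    os.environ["HF_DATASET_NAME"],  # e.g. username/second-brain-history
    private=True,
    token=os.environ["HF_API_KEY"],
)
```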
## Customization

### Using Different Models
You can change the models by updating the `.env` file:

```env
LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2
```
### Adding Custom Tools
To add custom tools to your agent, modify the `app/core/agent.py` file to include additional functionality.
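Because the agent's exact tool interface isn't documented here, the snippet below is purely illustrative: a self-contained tool function and a registry dictionary of the kind an agent loop could dispatch to.

```python
# Purely illustrative: the actual interface in app/core/agent.py may look different.
from datetime import datetime

def current_date_tool(_query: str) -> str:
    """Example tool: return today's date so the agent can answer time-related questions."""
    return datetime.now().strftime("%Y-%m-%d")

TOOLS = {"current_date": current_date_tool}  # hypothetical name -> callable registry
```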
## Roadmap
- Web search tool integration
- Calendar and email integration
- Voice interface
- Mobile app integration
- Fine-tuning for personalized responses
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Created by p3rc03