metadata
title: AI Research Paper Chatbot
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
app_port: 7860

📚 AI Research Paper Chatbot

A conversational AI chatbot for exploring and analyzing AI research papers. It offers full paper text access, conversation memory, real-time streaming, and intelligent paper search.

✨ Latest Features

  • 📖 Smart Function Calling: Intelligent paper retrieval using OpenAI's function calling API
  • 🔍 Dynamic Paper Fetching: Automatically fetches full paper texts when needed
  • 🧠 Contextual Conversation Memory: Maintains chat history with intelligent truncation
  • 🚀 Real-time Streaming: Instant response streaming for better UX
  • 🎛️ Multiple Model Selection: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
  • ⚙️ Advanced Parameters: Fine-tune temperature, max tokens, and top-p
  • 🎨 Modern UI: Responsive design with intuitive controls
  • 🛡️ Robust Error Handling: Clear error messages for common issues
  • 📱 Mobile Responsive: Works on desktop and mobile devices

🚀 Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Get OpenAI API Key

  1. Visit OpenAI Platform
  2. Create an account or sign in
  3. Generate a new API key
  4. Copy the API key

3. Configure Environment

For Local Development

Set your OpenAI API key as an environment variable:

Windows (PowerShell):

$env:OPENAI_API_KEY="your_openai_api_key_here"

Windows (Command Prompt):

set OPENAI_API_KEY=your_openai_api_key_here

Linux/macOS:

export OPENAI_API_KEY="your_openai_api_key_here"

For Hugging Face Spaces Deployment

  1. Open your Space on Hugging Face
  2. Click the "Settings" tab
  3. Scroll down to "Repository secrets"
  4. Click "New secret"
  5. Name: OPENAI_API_KEY
  6. Value: Your actual OpenAI API key
  7. Click "Add secret"

Important: Replace your_openai_api_key_here with your actual OpenAI API key.
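
As a minimal sketch of how the key might be read at startup (the helper name `get_api_key` and the error wording are illustrative assumptions, not taken from app.py), the application could fail early with a clear message when the variable is missing or still holds the placeholder:

```python
import os


def get_api_key(env=None):
    """Return the OpenAI API key from the environment or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat the README placeholder as "not configured" as well.
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it locally or add it as a "
            "secret in your Hugging Face Space settings."
        )
    return key
```

Because Hugging Face Spaces exposes secrets as environment variables, the same lookup should work both locally and in a deployed Space.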

4. Add Your Papers

Place your research paper text files in the Papers/ directory. The system will automatically load all .txt files.

5. Run the Application

python app.py

The chatbot will be available at http://localhost:7860

🎯 Usage Guide

Basic Paper Exploration

  1. Ask about specific topics: "What papers discuss AI's impact on employment?"
  2. Request full papers: "Show me the full paper about AI companions"
  3. Get detailed information: "What's the conclusion of the pig disease detection paper?"
  4. Compare findings: "Compare findings on AI in education"
  5. Ask for specific details: "What methodology did they use in the pig disease paper?"

Advanced Controls

Model Selection

  • GPT-4o-mini: Fast, cost-effective (default)
  • GPT-4o: Most capable, higher cost
  • GPT-3.5 Turbo: Fastest, most affordable

Parameter Tuning

  • System Message: Define AI personality and behavior
  • Max Tokens: Control response length (1-4096)
  • Temperature: Adjust creativity (0.0 = focused, 2.0 = creative)
  • Top-p: Control response diversity (0.0-1.0)
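
The ranges above can be enforced before a request is sent. The following is a rough sketch only; the function name `build_request_params` and the default values are assumptions for illustration and may differ from what app.py does:

```python
def build_request_params(max_tokens=512, temperature=0.7, top_p=1.0):
    """Clamp user-supplied settings to the ranges documented in this README."""

    def clamp(value, low, high):
        return max(low, min(high, value))

    return {
        "max_tokens": int(clamp(max_tokens, 1, 4096)),       # response length
        "temperature": float(clamp(temperature, 0.0, 2.0)),  # 0.0 focused, 2.0 creative
        "top_p": float(clamp(top_p, 0.0, 1.0)),              # response diversity
    }
```

The resulting dictionary uses the same keyword names that the Chat Completions API accepts, so it could be unpacked into the request call.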

Conversation Management

  • Clear Button: Reset conversation history
  • Example Buttons: Quick-start with sample messages

📚 Paper Database Features

Automatic Paper Loading

  • All .txt files in the Papers/ directory are automatically loaded
  • Paper titles are extracted from filenames
  • Full text content is available for detailed analysis
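
The loading behavior described above could look roughly like the sketch below. The function name `load_papers` and the choice to replace undecodable bytes are assumptions, not the actual implementation:

```python
from pathlib import Path


def load_papers(directory="Papers"):
    """Map each paper title (the filename without .txt) to its full text."""
    papers = {}
    folder = Path(directory)
    if not folder.is_dir():
        return papers  # a missing directory simply yields no papers
    for path in sorted(folder.glob("*.txt")):
        # errors="replace" keeps loading even if a file is not valid UTF-8
        papers[path.stem] = path.read_text(encoding="utf-8", errors="replace")
    return papers
```

With this layout, a file named `Papers/Paper Title 1.txt` would be available under the title `Paper Title 1`.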

Intelligent Search

  • Keyword Matching: Finds papers based on user query terms
  • Relevance Scoring: Ranks papers by relevance to the query
  • Context-Aware: Provides relevant paper excerpts for detailed responses
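
One simple way to implement keyword matching with relevance scoring is sketched below. The scoring rule (title hits weighted three times as much as body hits) and the names `tokenize` and `search_papers` are illustrative assumptions; the application's real ranking may differ:

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_papers(query, papers, top_k=3):
    """Rank papers by how often the query terms occur in title and body."""
    terms = set(tokenize(query))
    scored = []
    for title, text in papers.items():
        title_words = tokenize(title)
        body_words = tokenize(text)
        # Title hits are weighted more heavily than body hits (assumed weight).
        score = sum(3 * title_words.count(t) + body_words.count(t) for t in terms)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]
```

Here `papers` is expected to be a dictionary mapping titles to full text. Papers that match none of the query terms are left out of the result.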

Full Paper Access

  • Complete Text: Access entire paper content when requested
  • Direct Quotes: Get exact quotes from papers
  • Detailed Analysis: Comprehensive answers including conclusions and methodology
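
Full paper retrieval through function calling might be wired up along these lines. The tool name `get_full_paper`, its single `title` parameter, and the handler are hypothetical; only the overall shape of the tool definition follows the Chat Completions `tools` format:

```python
import json

# Hypothetical tool definition, passed to the model as tools=[GET_PAPER_TOOL].
GET_PAPER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_full_paper",
        "description": "Return the full text of a paper given its exact title.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Paper title"},
            },
            "required": ["title"],
        },
    },
}


def handle_tool_call(name, arguments_json, papers):
    """Run a tool call requested by the model and return a JSON string result."""
    if name != "get_full_paper":
        return json.dumps({"error": f"Unknown tool: {name}"})
    title = json.loads(arguments_json).get("title", "")
    text = papers.get(title)
    if text is None:
        return json.dumps({"error": f"Paper not found: {title}"})
    return json.dumps({"title": title, "text": text})
```

The model supplies the tool name and a JSON string of arguments. The handler's return value would then be sent back in a message with role `tool` so the model can answer from the paper text.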

🔧 Technical Details

Latest OpenAI API Features

  • OpenAI SDK v1.98.0+: Latest API patterns and features
  • Streaming Responses: Real-time token streaming
  • Smart Retry Logic: Automatic retry on failures
  • Timeout Handling: 60-second request timeout
  • Error Classification: Specific error messages for different issues
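
The retry behavior can be pictured with a small generic helper. This is a sketch under assumptions: the name `call_with_retry`, the attempt count, and the exponential backoff schedule are illustrative, and the real application may instead rely on the OpenAI client's own `timeout` and `max_retries` settings:

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on an exception, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. In practice one would usually retry only on transient errors such as timeouts or rate limits rather than on every exception.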

Paper Processing

  • Automatic Loading: Papers loaded at startup for fast access
  • Smart Search: Keyword-based relevance scoring
  • Content Truncation: Intelligent content selection for context
  • Full Text Access: Complete paper retrieval when needed

Conversation Memory

  • Intelligent Truncation: Keeps recent messages while staying within limits
  • System Message Preservation: Always maintains AI personality
  • Context Awareness: Full conversation history for contextual responses
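
A minimal sketch of truncation that preserves the system message is shown below. It assumes the limit is a message count (`max_messages`, a hypothetical parameter); the application might instead budget by tokens:

```python
def truncate_history(messages, max_messages=20):
    """Keep all system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # Slots left for user/assistant turns after reserving the system messages.
    keep = max(0, max_messages - len(system))
    recent = others[-keep:] if keep else []
    return system + recent
```

Messages are expected in the usual chat format, a list of dictionaries with `role` and `content` keys, ordered from oldest to newest.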

Performance Optimizations

  • Async Processing: Non-blocking UI during API calls
  • Memory Management: Efficient conversation history handling
  • Error Recovery: Graceful handling of API failures

🛠️ Configuration

Environment Variables

OPENAI_API_KEY=your_api_key_here

Model Parameters

# Available models
AVAILABLE_MODELS = {
    "GPT-4o-mini": "gpt-4o-mini",
    "GPT-4o": "gpt-4o", 
    "GPT-3.5 Turbo": "gpt-3.5-turbo"
}

Paper Directory Structure

Papers/
├── Paper Title 1.txt
├── Paper Title 2.txt
└── ...

🐛 Troubleshooting

Common Issues

API Key Errors

  • Ensure your OPENAI_API_KEY environment variable is set correctly
  • Check that the API key has sufficient credits
  • For Hugging Face Spaces: Verify the secret is named OPENAI_API_KEY

Paper Loading Issues

  • Ensure papers are in .txt format
  • Check that the Papers/ directory exists
  • Verify file encoding (UTF-8 recommended)

Rate Limiting

  • Wait a moment and try again
  • Consider using a different model

Connection Issues

Memory Issues

  • Conversation history is maintained in memory during the session
  • Long conversations are automatically truncated

Error Messages

  • "Invalid API key": Check your environment variable or Hugging Face Spaces secrets
  • "Quota exceeded": Add credits to your OpenAI account
  • "Rate limit": Wait and retry
  • "Paper not found": Check that the paper file exists in the Papers directory
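
The mapping from API failures to these messages could be sketched as follows. The function `friendly_error` is hypothetical, and the sketch assumes the API signals an invalid key with HTTP 401 and both rate limiting and exhausted quota with HTTP 429, the latter carrying the error code `insufficient_quota`:

```python
def friendly_error(status_code, error_code=None):
    """Translate an API failure into one of the messages listed above."""
    if status_code == 401:
        return "Invalid API key: check your environment variable or Space secrets"
    if status_code == 429 and error_code == "insufficient_quota":
        return "Quota exceeded: add credits to your OpenAI account"
    if status_code == 429:
        return "Rate limit: wait and retry"
    # Anything else falls through to a generic message.
    return f"Unexpected error (HTTP {status_code})"
```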

📊 Model Comparison

| Model | Speed | Cost | Capability | Best For |
| --- | --- | --- | --- | --- |
| GPT-4o-mini | Fast | Low | Good | General chat, quick responses |
| GPT-4o | Medium | High | Excellent | Complex tasks, detailed analysis |
| GPT-3.5 Turbo | Fastest | Lowest | Good | Simple queries, high volume |

🔄 Recent Updates

  • ✅ Added full paper text access functionality
  • ✅ Implemented intelligent paper search
  • ✅ Added automatic paper loading from Papers directory
  • ✅ Enhanced system prompt with paper content
  • ✅ Added example buttons for paper exploration
  • ✅ Updated to OpenAI SDK v1.98.0+
  • ✅ Added multiple model selection
  • ✅ Improved error handling and messages
  • ✅ Enhanced conversation memory management
  • ✅ Added smart conversation truncation
  • ✅ Modernized UI with better responsive design
  • ✅ Fixed Pydantic compatibility issues
  • ✅ Improved Hugging Face Spaces deployment

📝 License

This project is open source and available under the MIT License.