metadata

title: AI Research Paper Chatbot
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
app_port: 7860

📚 AI Research Paper Chatbot

A conversational AI chatbot for exploring and analyzing AI research papers. It provides full paper text access, conversation memory, real-time streaming, and keyword-based paper search.

✨ Latest Features

📖 Smart Function Calling: Intelligent paper retrieval using OpenAI's function calling API
🔍 Dynamic Paper Fetching: Automatically fetches full paper texts when needed
🧠 Contextual Conversation Memory: Maintains chat history with intelligent truncation
🚀 Real-time Streaming: Instant response streaming for better UX
🎛️ Multiple Model Selection: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
⚙️ Advanced Parameters: Fine-tune temperature, max tokens, and top-p
🎨 Modern UI: Responsive design with intuitive controls
🛡️ Robust Error Handling: Clear error messages for common issues
📱 Mobile Responsive: Works great on all devices

🚀 Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Get OpenAI API Key

Visit the OpenAI Platform (https://platform.openai.com)
Create an account or sign in
Generate a new API key
Copy the API key

3. Configure Environment

For Local Development

Set your OpenAI API key as an environment variable:

Windows (PowerShell):

$env:OPENAI_API_KEY="your_openai_api_key_here"

Windows (Command Prompt):

set OPENAI_API_KEY=your_openai_api_key_here

Linux/macOS:

export OPENAI_API_KEY="your_openai_api_key_here"

For Hugging Face Spaces Deployment

Open your Space on Hugging Face
Click the "Settings" tab
Scroll down to "Repository secrets"
Click "New secret"
Name: OPENAI_API_KEY
Value: Your actual OpenAI API key
Click "Add secret"

Important: Replace your_openai_api_key_here with your actual OpenAI API key.

4. Add Your Papers

Place your research paper text files in the Papers/ directory. The system will automatically load all .txt files.

5. Run the Application

python app.py

The chatbot will be available at http://localhost:7860

🎯 Usage Guide

Basic Paper Exploration

Ask about specific topics: "What papers discuss AI's impact on employment?"
Request full papers: "Show me the full paper about AI companions"
Get detailed information: "What's the conclusion of the pig disease detection paper?"
Compare findings: "Compare findings on AI in education"
Ask for specific details: "What methodology did they use in the pig disease paper?"

Advanced Controls

Model Selection

GPT-4o-mini: Fast, cost-effective (default)
GPT-4o: Most capable, higher cost
GPT-3.5 Turbo: Fastest, most affordable

Parameter Tuning

System Message: Define AI personality and behavior
Max Tokens: Control response length (1-4096)
Temperature: Adjust creativity (0.0 = focused, 2.0 = creative)
Top-p: Control response diversity (0.0-1.0)
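
The ranges above can be enforced before a request is sent. The following is a minimal sketch; `clamp_params` is a hypothetical helper name and is not taken from app.py, which may validate its inputs differently.

```python
def clamp_params(max_tokens, temperature, top_p):
    """Clamp user-supplied settings to the ranges listed in this README.

    Hypothetical helper for illustration only.
    """
    return {
        "max_tokens": int(min(max(max_tokens, 1), 4096)),        # 1-4096
        "temperature": float(min(max(temperature, 0.0), 2.0)),   # 0.0-2.0
        "top_p": float(min(max(top_p, 0.0), 1.0)),               # 0.0-1.0
    }


# Out-of-range values are pulled back to the nearest allowed value.
print(clamp_params(max_tokens=9000, temperature=2.5, top_p=-0.1))
```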

Conversation Management

Clear Button: Reset conversation history
Example Buttons: Quick-start with sample messages

📚 Paper Database Features

Automatic Paper Loading

All .txt files in the Papers/ directory are automatically loaded
Paper titles are extracted from filenames
Full text content is available for detailed analysis
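
The loading behavior described above could look roughly like this. It is a sketch under assumptions: the function name `load_papers` and the `{title: text}` return shape are illustrative and may not match app.py.

```python
from pathlib import Path


def load_papers(directory="Papers"):
    """Return {title: full_text} for every .txt file in `directory`.

    Illustrative sketch; the name and return shape are assumptions.
    """
    papers = {}
    folder = Path(directory)
    if not folder.is_dir():
        return papers  # nothing to load if the directory is missing
    for path in sorted(folder.glob("*.txt")):
        # The title is the filename without its .txt extension.
        papers[path.stem] = path.read_text(encoding="utf-8", errors="replace")
    return papers
```

With this shape, a file named `Paper Title 1.txt` becomes the entry `"Paper Title 1"`, and files with other extensions are ignored.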

Intelligent Search

Keyword Matching: Finds papers based on user query terms
Relevance Scoring: Ranks papers by relevance to the query
Context-Aware: Provides relevant paper excerpts for detailed responses
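
Keyword matching with relevance scoring can be sketched as below. The scoring rule (term counts, with title matches weighted three times higher) is an assumption chosen for illustration, and `score_paper` and `search_papers` are hypothetical names; the scoring used in app.py may differ.

```python
import re


def score_paper(query, title, text):
    """Count how often each query term appears, weighting title hits higher."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    title_words = re.findall(r"[a-z0-9]+", title.lower())
    body_words = re.findall(r"[a-z0-9]+", text.lower())
    score = 0
    for term in terms:
        score += 3 * title_words.count(term)  # assumed weight for title matches
        score += body_words.count(term)
    return score


def search_papers(query, papers, top_k=3):
    """Return up to `top_k` titles with a non-zero score, best match first."""
    scored = [(score_paper(query, title, text), title) for title, text in papers.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [title for _, title in ranked[:top_k]]
```

Here `papers` is assumed to be a dictionary mapping paper titles to their full text. Papers that share no term with the query are left out of the result entirely.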

Full Paper Access

Complete Text: Access entire paper content when requested
Direct Quotes: Get exact quotes from papers
Detailed Analysis: Comprehensive answers including conclusions and methodology
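
Full paper access through function calling needs two pieces: a tool definition sent to the model, and a handler that runs when the model requests the tool. The sketch below uses the Chat Completions `tools` format; the tool name `get_full_paper`, its `title` parameter, and the handler `handle_tool_call` are assumptions for illustration, not names confirmed by app.py.

```python
import json

# Tool definition in the Chat Completions "tools" format.
# The function name and its parameter are assumptions for illustration.
GET_PAPER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_full_paper",
        "description": "Return the full text of a paper by title.",
        "parameters": {
            "type": "object",
            "properties": {"title": {"type": "string"}},
            "required": ["title"],
        },
    },
}


def handle_tool_call(name, arguments_json, papers):
    """Run a tool call requested by the model and return a string result."""
    if name != "get_full_paper":
        return f"Unknown tool: {name}"
    title = json.loads(arguments_json).get("title", "")
    # Case-insensitive lookup so small differences in casing still match.
    for known_title, text in papers.items():
        if known_title.lower() == title.lower():
            return text
    return f"Paper not found: {title}"
```

In use, the definition would be passed as `tools=[GET_PAPER_TOOL]` to `client.chat.completions.create`, and each returned tool call supplies `function.name` and `function.arguments` (a JSON string) for the handler.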

🔧 Technical Details

Latest OpenAI API Features

OpenAI SDK v1.98.0+: Requires version 1.98.0 or later of the openai Python package
Streaming Responses: Real-time token streaming
Smart Retry Logic: Automatic retry on failures
Timeout Handling: 60-second request timeout
Error Classification: Specific error messages for different issues
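
Streaming can be sketched as a generator that yields the reply as it grows. This is an illustration, not the code in app.py: `stream_reply` is a hypothetical name, and the client is passed in as an argument so the function can be exercised without network access.

```python
def stream_reply(client, model, messages, **params):
    """Yield the growing reply text as chunks arrive from the API.

    `client` is expected to behave like an `openai.OpenAI` instance.
    """
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True, **params
    )
    reply = ""
    for chunk in stream:
        # Some chunks carry no choices or no text (for example the final one).
        if chunk.choices and chunk.choices[0].delta.content:
            reply += chunk.choices[0].delta.content
            yield reply
```

With the official SDK, a client matching the 60-second timeout above could be created with `OpenAI(timeout=60.0, max_retries=2)`; the retry count shown here is an assumption. Extra keyword arguments such as `temperature`, `max_tokens`, and `top_p` are forwarded unchanged.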

Paper Processing

Automatic Loading: Papers loaded at startup for fast access
Smart Search: Keyword-based relevance scoring
Content Truncation: Intelligent content selection for context
Full Text Access: Complete paper retrieval when needed

Conversation Memory

Intelligent Truncation: Keeps recent messages while staying within limits
System Message Preservation: Always maintains AI personality
Context Awareness: Uses the retained conversation history for contextual responses
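
One simple way to truncate while preserving the system message is shown below. It is a sketch under assumptions: `truncate_history` is a hypothetical name, and the limit counts messages, whereas a real implementation might count tokens instead.

```python
def truncate_history(messages, max_messages=20):
    """Keep the system message plus the most recent messages.

    `max_messages` is an assumed limit on non-system messages.
    """
    system = [m for m in messages if m["role"] == "system"][:1]
    others = [m for m in messages if m["role"] != "system"]
    # Guard against max_messages == 0, where a bare [-0:] slice keeps everything.
    recent = others[-max_messages:] if max_messages > 0 else []
    return system + recent
```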

Performance Optimizations

Async Processing: Non-blocking UI during API calls
Memory Management: Efficient conversation history handling
Error Recovery: Graceful handling of API failures
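
Graceful error handling usually means translating an API failure into a short user-facing message. The mapping below is a sketch, not the logic in app.py: `describe_error` is a hypothetical name, it assumes the caller has already extracted the HTTP status and the API error code, and the "Connection error" text is an assumed message.

```python
def describe_error(status_code=None, error_code=None):
    """Map an API failure to a short user-facing message.

    Assumes the caller has extracted the HTTP status and the API's
    error code string from the failed request.
    """
    if status_code == 401:
        return "Invalid API key"
    if status_code == 429 and error_code == "insufficient_quota":
        return "Quota exceeded"
    if status_code == 429:
        return "Rate limit"
    if status_code is None:
        # No HTTP response at all, for example a network failure or timeout.
        return "Connection error"
    return f"Unexpected API error (HTTP {status_code})"
```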

🛠️ Configuration

Environment Variables

OPENAI_API_KEY=your_api_key_here
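
Reading the key at startup and failing early with a clear message avoids a confusing error on the first request. A minimal sketch, assuming a hypothetical helper named `get_api_key` that also rejects the placeholder values used in this README:

```python
import os

# Placeholder strings used in this README; treated as "not configured".
PLACEHOLDERS = {"your_api_key_here", "your_openai_api_key_here"}


def get_api_key(env=os.environ):
    """Return the OpenAI API key or raise a clear error if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key in PLACEHOLDERS:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it locally or add it as a "
            "secret in your Hugging Face Space."
        )
    return key
```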

Model Parameters

# Available models (display name -> OpenAI model ID)
AVAILABLE_MODELS = {
    "GPT-4o-mini": "gpt-4o-mini",
    "GPT-4o": "gpt-4o",
    "GPT-3.5 Turbo": "gpt-3.5-turbo",
}

Paper Directory Structure

Papers/
├── Paper Title 1.txt
├── Paper Title 2.txt
└── ...

🐛 Troubleshooting

Common Issues

API Key Errors

Ensure your OPENAI_API_KEY environment variable is set correctly
Check that the API key has sufficient credits
For Hugging Face Spaces: Verify the secret is named OPENAI_API_KEY

Paper Loading Issues

Ensure papers are in .txt format
Check that the Papers/ directory exists
Verify file encoding (UTF-8 recommended)
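
The three checks above can be run in one pass with a small diagnostic script. This is a sketch; `check_papers` is a hypothetical helper and is not part of app.py.

```python
from pathlib import Path


def check_papers(directory="Papers"):
    """Return a list of human-readable problems found in the papers folder."""
    folder = Path(directory)
    if not folder.is_dir():
        return [f"Directory not found: {folder}"]
    problems = []
    txt_files = sorted(folder.glob("*.txt"))
    if not txt_files:
        problems.append(f"No .txt files in {folder}")
    for path in txt_files:
        try:
            path.read_bytes().decode("utf-8")  # strict decode to catch bad bytes
        except UnicodeDecodeError:
            problems.append(f"Not valid UTF-8: {path.name}")
    return problems


if __name__ == "__main__":
    for problem in check_papers() or ["No problems found"]:
        print(problem)
```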

Rate Limiting

Wait a moment and try again
Consider using a different model
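
"Wait a moment and try again" can be automated with exponential backoff. The helper below is a generic sketch: the name `with_backoff`, the attempt count, and the delays are all assumptions, not values taken from app.py.

```python
import random
import time


def with_backoff(call, is_retryable, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry retryable failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

With the openai package, `is_retryable` could be `lambda e: isinstance(e, openai.RateLimitError)`. Note that the SDK client also retries some failures on its own, so an outer loop like this is only needed when more control is wanted.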

Connection Issues

Check your internet connection
Verify OpenAI API status at https://status.openai.com

Memory Issues

Conversation history is kept in memory for the current session only
Long conversations are automatically truncated to recent messages; use the Clear button to start a fresh conversation

Error Messages

"Invalid API key": Check your environment variable or Hugging Face Spaces secrets
"Quota exceeded": Add credits to your OpenAI account
"Rate limit": Wait and retry
"Paper not found": Check that the paper file exists in the Papers directory

📊 Model Comparison

| Model | Speed | Cost | Capability | Best For |
|---|---|---|---|---|
| GPT-4o-mini | Fast | Low | Good | General chat, quick responses |
| GPT-4o | Medium | High | Excellent | Complex tasks, detailed analysis |
| GPT-3.5 Turbo | Fastest | Lowest | Good | Simple queries, high volume |

🔄 Recent Updates

✅ Added full paper text access functionality
✅ Implemented intelligent paper search
✅ Added automatic paper loading from Papers directory
✅ Enhanced system prompt with paper content
✅ Added example buttons for paper exploration
✅ Updated to OpenAI SDK v1.98.0+
✅ Added multiple model selection
✅ Improved error handling and messages
✅ Enhanced conversation memory management
✅ Added smart conversation truncation
✅ Modernized UI with better responsive design
✅ Fixed Pydantic compatibility issues
✅ Improved Hugging Face Spaces deployment

📝 License

This project is open source and available under the MIT License.