---
title: Sema Chat API
emoji: 💬
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: Chat with LLMs
---
# Sema Chat API 💬
A modern chatbot API with streaming capabilities, flexible model backends, and production-ready features. Built with FastAPI and designed to keep pace with rapid GenAI advancements.
## 🚀 Quick Start with Gemma
### Option 1: Automated HuggingFace Spaces Deployment
```bash
cd backend/sema-chat
./setup_huggingface.sh
```
### Option 2: Manual Local Setup
```bash
cd backend/sema-chat
pip install -r requirements.txt
# Copy and configure environment
cp .env.example .env
# For Gemma via Google AI Studio (Recommended)
# Edit .env:
MODEL_TYPE=google
MODEL_NAME=gemma-2-9b-it
GOOGLE_API_KEY=your_google_api_key
# Run the API
uvicorn app.main:app --reload --host 0.0.0.0 --port 7860
```
### Option 3: Local Gemma (Free, No API Key)
```bash
# Edit .env:
MODEL_TYPE=local
MODEL_NAME=google/gemma-2b-it
DEVICE=cpu
# Run (will download model on first run)
uvicorn app.main:app --reload --host 0.0.0.0 --port 7860
```
## 🌐 Access Your API
Once running, access:
- **Swagger UI**: http://localhost:7860/
- **Health Check**: http://localhost:7860/api/v1/health
- **Chat Endpoint**: http://localhost:7860/api/v1/chat
## 🧪 Quick Test
```bash
# Test chat
curl -X POST "http://localhost:7860/api/v1/chat" \
-H "Content-Type: application/json" \
-d '{
"message": "Hello! Can you introduce yourself?",
"session_id": "test-session"
}'
# Test streaming
curl -N -H "Accept: text/event-stream" \
"http://localhost:7860/api/v1/chat/stream?message=Tell%20me%20about%20AI&session_id=test"
```
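The same checks can be scripted in Python. A minimal sketch using `requests`; the response payloads are printed verbatim rather than assuming any particular JSON schema:

```python
import requests

BASE = "http://localhost:7860/api/v1"

# Send a chat message and print whatever the API returns.
resp = requests.post(
    f"{BASE}/chat",
    json={"message": "Hello! Can you introduce yourself?", "session_id": "test-session"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())

# Consume the SSE stream line by line; events arrive as "data: ..." lines.
with requests.get(
    f"{BASE}/chat/stream",
    params={"message": "Tell me about AI", "session_id": "test"},
    headers={"Accept": "text/event-stream"},
    stream=True,
    timeout=60,
) as stream:
    for line in stream.iter_lines(decode_unicode=True):
        if line:
            print(line)
```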
## 🎯 Features
### Core Capabilities
- ✅ **Real-time Streaming**: Server-Sent Events and WebSocket support
- ✅ **Multiple Model Backends**: Local, HuggingFace API, OpenAI, Anthropic, Google AI, MiniMax
- ✅ **Session Management**: Persistent conversation contexts (demo below)
- ✅ **Rate Limiting**: Built-in protection with configurable limits
- ✅ **Health Monitoring**: Comprehensive health checks and metrics
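To see session persistence in action, send two messages with the same `session_id`. An illustrative script using the documented endpoints (response fields printed as-is, not assumed):

```python
import requests

BASE = "http://localhost:7860/api/v1"
SESSION = "memory-demo"

def chat(message: str) -> None:
    # Reusing the same session_id keeps both turns in one conversation context.
    resp = requests.post(
        f"{BASE}/chat", json={"message": message, "session_id": SESSION}, timeout=60
    )
    resp.raise_for_status()
    print(resp.json())

chat("My name is Ada.")
chat("What is my name?")  # The model should answer from the stored context.

# Inspect the stored history for this session.
print(requests.get(f"{BASE}/sessions/{SESSION}", timeout=30).json())
```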
### Supported Models
- **Local**: TinyLlama, DialoGPT, Gemma, Qwen
- **Google AI**: Gemma-2-9b-it, Gemini-1.5-flash, Gemini-1.5-pro
- **OpenAI**: GPT-3.5-turbo, GPT-4, GPT-4-turbo
- **Anthropic**: Claude-3-haiku, Claude-3-sonnet, Claude-3-opus
- **HuggingFace API**: Any model via Inference API
- **MiniMax**: M1 model with reasoning capabilities
## 🔧 Configuration
### Environment Variables
```bash
# Model Backend (local, google, openai, anthropic, hf_api, minimax)
MODEL_TYPE=google
MODEL_NAME=gemma-2-9b-it
# API Keys (as needed)
GOOGLE_API_KEY=your_key
OPENAI_API_KEY=your_key
ANTHROPIC_API_KEY=your_key
HF_API_TOKEN=your_token
MINIMAX_API_KEY=your_key
# Generation Settings
TEMPERATURE=0.7
MAX_NEW_TOKENS=512
TOP_P=0.9
# Server Settings
HOST=0.0.0.0
PORT=7860
DEBUG=false
```
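Internally, `app/core/config.py` provides environment-based configuration. As a rough, hypothetical illustration only (not the project's actual code), the variables above could be loaded into a settings object like this:

```python
import os
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Settings:
    # Backend selection and model identity.
    model_type: str = field(default_factory=lambda: os.getenv("MODEL_TYPE", "local"))
    model_name: str = field(default_factory=lambda: os.getenv("MODEL_NAME", "google/gemma-2b-it"))
    # Generation parameters.
    temperature: float = field(default_factory=lambda: float(os.getenv("TEMPERATURE", "0.7")))
    max_new_tokens: int = field(default_factory=lambda: int(os.getenv("MAX_NEW_TOKENS", "512")))
    top_p: float = field(default_factory=lambda: float(os.getenv("TOP_P", "0.9")))
    # Server settings.
    host: str = field(default_factory=lambda: os.getenv("HOST", "0.0.0.0"))
    port: int = field(default_factory=lambda: int(os.getenv("PORT", "7860")))
    debug: bool = field(default_factory=lambda: os.getenv("DEBUG", "false").lower() == "true")

settings = Settings()
```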
## 📚 Documentation
- **[Configuration Guide](CONFIGURATION_GUIDE.md)** - Detailed setup for all backends
- **[HuggingFace Deployment](HUGGINGFACE_DEPLOYMENT.md)** - Step-by-step deployment guide
- **[API Documentation](http://localhost:7860/)** - Interactive Swagger UI
## 🧪 Testing
```bash
# Run comprehensive tests
python tests/test_api.py
# Test different backends
python examples/test_backends.py
# Test specific backend
python examples/test_backends.py --backend google
```
## 🚀 Deployment
### HuggingFace Spaces (Recommended)
1. Run the setup script: `./setup_huggingface.sh`
2. Create your Space on HuggingFace
3. Push the generated code
4. Set environment variables in Space settings
5. Your API will be live at: `https://username-spacename.hf.space/`
### Docker
```bash
docker build -t sema-chat-api .
docker run -e MODEL_TYPE=google \
-e GOOGLE_API_KEY=your_key \
-p 7860:7860 \
sema-chat-api
```
## 📋 API Endpoints
### Chat
- **`POST /api/v1/chat`** - Send chat message
- **`GET /api/v1/chat/stream`** - Streaming chat (SSE)
- **`WebSocket /api/v1/chat/ws`** - Real-time WebSocket chat (see the sketch below)
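A minimal WebSocket client sketch using the `websockets` package; the JSON payload shape here mirrors the REST endpoint and is an assumption, not a documented contract:

```python
import asyncio
import json

import websockets  # pip install websockets

async def main() -> None:
    uri = "ws://localhost:7860/api/v1/chat/ws"
    async with websockets.connect(uri) as ws:
        # Assumed payload shape, mirroring POST /api/v1/chat.
        await ws.send(json.dumps({"message": "Hello over WebSocket!", "session_id": "ws-test"}))
        reply = await ws.recv()  # Print whatever frame the server sends back.
        print(reply)

asyncio.run(main())
```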
### Sessions
- **`GET /api/v1/sessions/{id}`** - Get conversation history
- **`DELETE /api/v1/sessions/{id}`** - Clear conversation
- **`GET /api/v1/sessions`** - List active sessions
### System
- **`GET /api/v1/health`** - Comprehensive health check
- **`GET /api/v1/model/info`** - Current model information
- **`GET /api/v1/status`** - Basic status
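The session and system endpoints are plain REST calls; for example, with `requests`:

```python
import requests

BASE = "http://localhost:7860/api/v1"

print(requests.get(f"{BASE}/health", timeout=30).json())      # health check
print(requests.get(f"{BASE}/model/info", timeout=30).json())  # current model
print(requests.get(f"{BASE}/sessions", timeout=30).json())    # active sessions

# Fetch, then clear, one conversation.
print(requests.get(f"{BASE}/sessions/test-session", timeout=30).json())
requests.delete(f"{BASE}/sessions/test-session", timeout=30).raise_for_status()
```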
## 💡 Why This Architecture?
1. **Future-Proof**: Modular design adapts to rapid GenAI advancements
2. **Flexible**: Switch between local models and APIs with environment variables
3. **Production-Ready**: Rate limiting, monitoring, error handling built-in
4. **Cost-Effective**: Start free with local models, scale with APIs
5. **Developer-Friendly**: Comprehensive docs, tests, and examples
## 🛠️ Development
### Project Structure
```
app/
├── main.py                  # FastAPI application
├── api/v1/endpoints.py      # API routes
├── core/
│   ├── config.py            # Environment-based configuration
│   └── logging.py           # Structured logging
├── models/schemas.py        # Pydantic request/response models
├── services/
│   ├── chat_manager.py      # Chat orchestration
│   ├── model_manager.py     # Backend selection
│   ├── session_manager.py   # Conversation management
│   └── model_backends/      # Model implementations
└── utils/helpers.py         # Utility functions
```
### Adding New Backends
1. Create new backend in `app/services/model_backends/`
2. Inherit from `ModelBackend` base class
3. Implement the required methods (see the sketch below)
4. Add to `ModelManager._create_backend()`
5. Update configuration and documentation
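An illustrative sketch only: the `ModelBackend` interface is not spelled out in this README, so the import path and method names below (`load`, `generate`, `stream_generate`) are assumptions about what the required methods might look like:

```python
# app/services/model_backends/echo_backend.py (hypothetical example)
from typing import AsyncIterator

from app.services.model_backends.base import ModelBackend  # assumed import path

class EchoBackend(ModelBackend):
    """Toy backend that echoes the prompt; useful for wiring tests."""

    async def load(self) -> None:
        # Real backends would initialize API clients or load weights here.
        pass

    async def generate(self, message: str, history: list[dict]) -> str:
        return f"echo: {message}"

    async def stream_generate(self, message: str, history: list[dict]) -> AsyncIterator[str]:
        # Yield the reply word by word to exercise the streaming path.
        for token in f"echo: {message}".split():
            yield token + " "
```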
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request
## 📄 License
MIT License - see LICENSE file for details.
## 🙏 Acknowledgments
- **HuggingFace** for model hosting and Spaces platform
- **Google** for Gemma models and AI Studio
- **FastAPI** for the excellent web framework
- **OpenAI, Anthropic, MiniMax** for their APIs
---
**Ready to chat? Deploy your Sema Chat API today! 🚀💬**