metadata
title: Phi-3.5-MoE Expert Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
entrypoint: start.sh
startup_duration_timeout: 600
pinned: false
license: mit
short_description: AI assistant with expert routing and CPU/GPU support
models:
- microsoft/Phi-3.5-MoE-instruct
🤖 Phi-3.5-MoE Expert Assistant
A robust, production-ready AI assistant powered by Microsoft's Phi-3.5-MoE model with intelligent expert routing and comprehensive CPU/GPU environment support.
Key Features
- 🧠 Expert Routing: Automatically routes queries to specialized experts (Code, Math, Reasoning, Multilingual, General)
- 🔧 Environment Adaptive: Works seamlessly on both CPU and GPU environments
- 🛡️ Robust Dependency Management: Conditional installation of dependencies based on environment
- 🚦 Fault Tolerance: Handles missing dependencies with fallback mechanisms
- ⚡ Performance Optimized: Environment-specific optimizations for best performance
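A minimal sketch of how query routing of this kind could work, assuming a simple keyword-overlap classifier. The keyword lists and the `route_query` name are hypothetical; the routing logic actually used in `app.py` is not shown in this README.

```python
import re

# Hypothetical keyword lists, for illustration only.
EXPERT_KEYWORDS = {
    "Code": {"python", "function", "bug", "compile", "code", "exception"},
    "Math": {"solve", "equation", "integral", "derivative", "probability"},
    "Reasoning": {"why", "explain", "compare", "logic", "argue"},
    "Multilingual": {"translate", "translation", "french", "spanish", "german"},
}


def route_query(query: str) -> str:
    """Return the expert with the most keyword hits, or "General" if none match."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {name: len(words & kws) for name, kws in EXPERT_KEYWORDS.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first expert listed
    return best if scores[best] > 0 else "General"


if __name__ == "__main__":
    for q in ("Fix the bug in this Python function", "Hello there"):
        print(q, "->", route_query(q))
```

Falling back to "General" when nothing matches keeps the assistant usable for queries the classifier does not recognize.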
🔧 Recent Fixes
- ✅ Missing Dependencies: Added `einops` to requirements, conditional `flash_attn` installation
- ✅ Deprecated Parameters: Fixed all `torch_dtype` → `dtype` usage
- ✅ CPU Compatibility: Automatic CPU-safe model revision selection
- ✅ Error Handling: Comprehensive fallback mechanisms
- ✅ Security: Updated to Gradio 4.44.0+ for security fixes
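The dependency and parameter fixes above could look roughly like the helper below. This is a sketch under assumptions: `build_load_kwargs` and `flash_attn_available` are hypothetical names, the choice of `bfloat16` on GPU and `float32` on CPU is assumed, and the `dtype` keyword requires a Transformers release that has already renamed `torch_dtype`.

```python
import importlib.util

MODEL_ID = "microsoft/Phi-3.5-MoE-instruct"


def flash_attn_available() -> bool:
    """True if the optional flash_attn package can be imported."""
    return importlib.util.find_spec("flash_attn") is not None


def build_load_kwargs(device: str, has_flash_attn: bool) -> dict:
    """Build keyword arguments for from_pretrained (assumed settings)."""
    if device == "cuda":
        return {
            "dtype": "bfloat16",  # replaces the deprecated torch_dtype
            "attn_implementation": (
                "flash_attention_2" if has_flash_attn else "eager"
            ),
        }
    # CPU path: full precision and no FlashAttention.
    return {"dtype": "float32", "attn_implementation": "eager"}


if __name__ == "__main__":
    kwargs = build_load_kwargs("cpu", flash_attn_available())
    print(kwargs)
    # The kwargs would then be passed on, for example:
    # AutoModelForCausalLM.from_pretrained(MODEL_ID, **kwargs)
```

Keeping the decision in one pure function makes the fallback to eager attention easy to check without loading the model.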
🏗️ Architecture
app.py # Main application entry point
preinstall.py # Pre-installation script for dependencies
model_patch.py # Patch for handling missing dependencies
start.sh # Startup script
requirements.txt # Core dependencies
🎯 How It Works
- Environment Detection: Automatically detects CPU vs GPU environment
- Dependency Management: Installs required dependencies based on environment
- Model Configuration: Uses optimal settings for each environment
- Expert Routing: Classifies queries and routes to appropriate expert
- Graceful Fallbacks: Works even when dependencies are missing
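The first two steps can be sketched as follows. This is an illustration, not the contents of `preinstall.py`: the function names are hypothetical, and the package plan only reflects what this README states (`einops` always, `flash_attn` conditionally).

```python
import sys


def detect_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU; assume CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def plan_dependencies(device: str) -> list[str]:
    """List the pip packages to install for the detected environment."""
    packages = ["einops"]
    if device == "cuda":
        packages.append("flash-attn")  # pip name of the flash_attn module
    return packages


def pip_command(packages: list[str]) -> list[str]:
    """Build (but do not run) the pip invocation for the given packages."""
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    device = detect_device()
    print(device, pip_command(plan_dependencies(device)))
```

A startup script could pass the resulting command to `subprocess.check_call` and continue in CPU mode if the optional install fails.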
Performance
| Environment | Startup | Memory | Tokens/sec |
|---|---|---|---|
| CPU | 3-5 min | 8-12 GB | 2-5 |
| GPU | 2-3 min | 16-20 GB | 15-30 |
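Read literally, the table implies that a 200-token reply takes roughly 40-100 seconds on CPU (2-5 tokens/sec) and roughly 7-13 seconds on GPU (15-30 tokens/sec). To check throughput in your own environment, a small timing helper along these lines may be useful; `tokens_per_second` is a hypothetical name, and `generate` stands for any callable that returns the generated tokens.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[str], Sequence], prompt: str) -> float:
    """Time one generation call and return generated tokens per second."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed if elapsed > 0 else float("inf")


def estimated_seconds(n_tokens: int, rate: float) -> float:
    """Estimate generation time for n_tokens at a given tokens/sec rate."""
    return n_tokens / rate


if __name__ == "__main__":
    # Stand-in generator: sleeps briefly and returns ten dummy tokens.
    def fake_generate(prompt: str) -> list[int]:
        time.sleep(0.05)
        return list(range(10))

    print(f"{tokens_per_second(fake_generate, 'hi'):.1f} tokens/sec")
```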
Troubleshooting
If you encounter issues:
- Check the logs for dependency installation
- Verify the pre-installation script executed successfully
- Ensure all required packages are installed
- Try the fallback mode if model loading fails
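To check quickly whether the required packages are installed, a short diagnostic such as the one below can be run inside the Space. The module lists are assumptions based on the packages this README mentions; adjust them to match `requirements.txt`.

```python
import importlib.util

# Assumed module names; not taken from the actual requirements.txt.
REQUIRED = ["torch", "transformers", "gradio", "einops"]
OPTIONAL = ["flash_attn"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Missing required:", missing_modules(REQUIRED) or "none")
    print("Missing optional:", missing_modules(OPTIONAL) or "none")
```

A missing optional module points to the fallback path, while a missing required module points back to the pre-installation logs.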
Built with ❤️ for reliable, production-ready AI applications