---
title: Text Summarizer API
emoji: πŸ“
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# Text Summarizer API
A FastAPI-based text summarization service powered by Ollama and the Mistral 7B model.
## πŸš€ Features
- **Fast text summarization** using local LLM inference
- **RESTful API** with FastAPI
- **Health monitoring** and logging
- **Docker containerized** for easy deployment
- **Free deployment** on Hugging Face Spaces
## πŸ“‘ API Endpoints
### Health Check
```
GET /health
```
### Summarize Text
```
POST /api/v1/summarize
Content-Type: application/json
{
  "text": "Your long text to summarize here...",
  "max_tokens": 256,
  "temperature": 0.7
}
```
### API Documentation
- **Swagger UI**: `/docs`
- **ReDoc**: `/redoc`
## πŸ”§ Configuration
The service uses the following environment variables:
- `OLLAMA_MODEL`: Model to use (default: `mistral:7b`)
- `OLLAMA_HOST`: Ollama service host (default: `http://localhost:11434`)
- `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `30`)
- `SERVER_HOST`: Server host (default: `0.0.0.0`)
- `SERVER_PORT`: Server port (default: `7860`)
- `LOG_LEVEL`: Logging level (default: `INFO`)
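As a sketch, the settings above can be loaded with environment-variable fallbacks. The variable names and defaults match the list above; the `load_settings` helper itself is an illustration, not the app's actual code:

```python
import os

def load_settings() -> dict:
    """Read service configuration from the environment, falling back to defaults."""
    return {
        "ollama_model": os.environ.get("OLLAMA_MODEL", "mistral:7b"),
        "ollama_host": os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
        "ollama_timeout": int(os.environ.get("OLLAMA_TIMEOUT", "30")),
        "server_host": os.environ.get("SERVER_HOST", "0.0.0.0"),
        "server_port": int(os.environ.get("SERVER_PORT", "7860")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }
```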
## 🐳 Docker Deployment
### Local Development
```bash
# Build and run with docker-compose
docker-compose up --build
# Or run directly
docker build -f Dockerfile.hf -t summarizer-app .
docker run -p 7860:7860 summarizer-app
```
### Hugging Face Spaces
This app is configured for deployment on Hugging Face Spaces using Docker SDK.
## πŸ“Š Performance
- **Model**: Mistral 7B (requires ~7GB of RAM for the model weights)
- **Startup time**: ~2-3 minutes (includes model download)
- **Inference speed**: ~2-5 seconds per request
- **Memory usage**: ~8GB RAM
## πŸ› οΈ Development
### Setup
```bash
# Install dependencies
pip install -r requirements.txt
# Run locally
uvicorn app.main:app --host 0.0.0.0 --port 7860
```
### Testing
```bash
# Run tests
pytest
# Run with coverage
pytest --cov=app
```
## πŸ“ Usage Examples
### Python
```python
import requests
# Summarize text
response = requests.post(
    "https://your-space.hf.space/api/v1/summarize",
    json={
        "text": "Your long article or text here...",
        "max_tokens": 256
    }
)
result = response.json()
print(result["summary"])
```
### cURL
```bash
curl -X POST "https://your-space.hf.space/api/v1/summarize" \
  -H "Content-Type: application/json" \
  -d '{
        "text": "Your text to summarize...",
        "max_tokens": 256
      }'
```
## πŸ”’ Security
- Non-root user execution
- Input validation and sanitization
- Rate limiting (configurable)
- API key authentication (optional)
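Input validation can be sketched as a simple payload check before the request reaches the model. The field names match the API above, but the bounds here are illustrative assumptions, not the service's actual limits:

```python
def validate_summarize_request(payload: dict) -> list[str]:
    """Return a list of validation errors for a /api/v1/summarize payload."""
    errors = []
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        errors.append("'text' must be a non-empty string")
    max_tokens = payload.get("max_tokens", 256)
    if not isinstance(max_tokens, int) or not (1 <= max_tokens <= 2048):
        errors.append("'max_tokens' must be an integer between 1 and 2048")
    temperature = payload.get("temperature", 0.7)
    if not isinstance(temperature, (int, float)) or not (0.0 <= temperature <= 2.0):
        errors.append("'temperature' must be a number between 0.0 and 2.0")
    return errors
```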
## πŸ“ˆ Monitoring
The service includes:
- Health check endpoint
- Request logging
- Error tracking
- Performance metrics
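A request-timing decorator is one way to collect the performance metrics mentioned above. This is a minimal sketch; the real service may use middleware instead, and `summarize` here is only a placeholder:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("summarizer.metrics")

def timed(func):
    """Log the wall-clock duration of each call, for simple performance metrics."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.info("%s took %.3fs", func.__name__, elapsed)
    return wrapper

@timed
def summarize(text: str) -> str:
    # Placeholder for the actual Ollama call.
    return text[:50]
```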
## πŸ†˜ Troubleshooting
### Common Issues
1. **Model not loading**: Check if Ollama is running and model is pulled
2. **Out of memory**: Ensure sufficient RAM (8GB+) for Mistral 7B
3. **Slow startup**: Normal on first run due to model download
4. **API errors**: Check the application logs, and verify the request format against the interactive docs at `/docs`
### Logs
View application logs in the Hugging Face Spaces interface or check the health endpoint for service status.
## πŸ“„ License
MIT License - see LICENSE file for details.
## 🀝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request
---
**Deployed on Hugging Face Spaces** πŸš€