ming committed · Commit 3777705 · Parent(s): 84d7a68
docs: Update README.md with accurate model and configuration info

- Update default model from mistral:7b to llama3.2:1b
- Fix environment variable defaults (OLLAMA_HOST, SERVER_PORT, OLLAMA_TIMEOUT)
- Update performance metrics (RAM: 7GB→1GB, startup time, inference speed)
- Correct API request schema (remove temperature, add prompt parameter)
- Fix Docker command references
- Update development setup instructions
- Update troubleshooting section with correct RAM requirements
README.md
CHANGED
@@ -11,7 +11,7 @@ app_port: 7860
 
 # Text Summarizer API
 
-A FastAPI-based text summarization service powered by Ollama and Mistral 7B model.
+A FastAPI-based text summarization service powered by Ollama and Llama 3.2 1B model.
 
 ## 🚀 Features
 
@@ -36,7 +36,7 @@ Content-Type: application/json
 {
   "text": "Your long text to summarize here...",
   "max_tokens": 256,
-  "temperature": …
+  "prompt": "Summarize the following text concisely:"
 }
 ```
 
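The updated schema drops `temperature` in favor of a `prompt` field. A minimal client sketch matching it follows; the route itself isn't visible in this hunk, so the `/summarize` path is an assumption, and port `7860` is taken from the Docker run command below:

```python
import requests  # third-party: pip install requests

# Hypothetical endpoint path; substitute the actual route from the README.
url = "http://localhost:7860/summarize"

payload = {
    "text": "Your long text to summarize here...",
    "max_tokens": 256,
    "prompt": "Summarize the following text concisely:",  # replaces the removed temperature field
}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())
```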
@@ -48,11 +48,11 @@ Content-Type: application/json
 
 The service uses the following environment variables:
 
-- `OLLAMA_MODEL`: Model to use (default: `mistral:7b`)
-- `OLLAMA_HOST`: Ollama service host (default: `http://…`)
-- `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `…`)
-- `SERVER_HOST`: Server host (default: `…`)
-- `SERVER_PORT`: Server port (default: `…`)
+- `OLLAMA_MODEL`: Model to use (default: `llama3.2:1b`)
+- `OLLAMA_HOST`: Ollama service host (default: `http://0.0.0.0:11434`)
+- `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `60`)
+- `SERVER_HOST`: Server host (default: `127.0.0.1`)
+- `SERVER_PORT`: Server port (default: `8000`)
 - `LOG_LEVEL`: Logging level (default: `INFO`)
 
 ## 🐳 Docker Deployment
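For illustration, the corrected defaults above map onto plain `os.getenv` lookups like this; a minimal sketch only, since the app's actual config module is not part of this diff and may differ:

```python
import os

# Documented defaults from the updated README; the real config code may differ.
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2:1b")
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://0.0.0.0:11434")
OLLAMA_TIMEOUT = int(os.getenv("OLLAMA_TIMEOUT", "60"))  # seconds
SERVER_HOST = os.getenv("SERVER_HOST", "127.0.0.1")
SERVER_PORT = int(os.getenv("SERVER_PORT", "8000"))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
```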
@@ -63,7 +63,7 @@ The service uses the following environment variables:
 docker-compose up --build
 
 # Or run directly
-docker build -…
+docker build -t summarizer-app .
 docker run -p 7860:7860 summarizer-app
 ```
 
@@ -72,10 +72,10 @@ This app is configured for deployment on Hugging Face Spaces using Docker SDK.
 
 ## 📊 Performance
 
-- **Model**: Mistral 7B (~7GB RAM requirement)
-- **Startup time**: ~2…
-- **Inference speed**: ~…
-- **Memory usage**: ~…
+- **Model**: Llama 3.2 1B (~1GB RAM requirement)
+- **Startup time**: ~1-2 minutes (includes model download)
+- **Inference speed**: ~1-3 seconds per request
+- **Memory usage**: ~2GB RAM
 
 ## 🛠️ Development
@@ -85,7 +85,7 @@ This app is configured for deployment on Hugging Face Spaces using Docker SDK.
 pip install -r requirements.txt
 
 # Run locally
-uvicorn app.main:app --host 0.0.0.0 --port …
+uvicorn app.main:app --host 0.0.0.0 --port 8000
 ```
 
 ### Testing
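Once the dev server is up on the corrected port, a quick smoke test can hit FastAPI's built-in Swagger UI (the `/docs` route the troubleshooting section mentions):

```python
import requests

# Assumes the server was started with --port 8000 as documented above.
r = requests.get("http://127.0.0.1:8000/docs", timeout=5)
print(r.status_code)  # 200 means the app is serving
```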
@@ -146,7 +146,7 @@ The service includes:
 ### Common Issues
 
 1. **Model not loading**: Check if Ollama is running and model is pulled
-2. **Out of memory**: Ensure sufficient RAM (…
+2. **Out of memory**: Ensure sufficient RAM (2GB+) for Llama 3.2 1B
 3. **Slow startup**: Normal on first run due to model download
 4. **API errors**: Check logs via `/docs` endpoint
 
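For issue 1 above, Ollama's REST API exposes `GET /api/tags` to list locally pulled models, so a check against the documented `OLLAMA_HOST` default might look like this (a sketch, not part of the app):

```python
import requests

# Query Ollama for locally available models at the documented default host.
tags = requests.get("http://0.0.0.0:11434/api/tags", timeout=10).json()
names = [m["name"] for m in tags.get("models", [])]
print("llama3.2:1b pulled:", "llama3.2:1b" in names)
```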