ming committed on
Commit 3777705 · 1 Parent(s): 84d7a68

docs: Update README.md with accurate model and configuration info


- Update default model from mistral:7b to llama3.2:1b
- Fix environment variable defaults (OLLAMA_HOST, SERVER_PORT, OLLAMA_TIMEOUT)
- Update performance metrics (RAM: 7GB→1GB, startup time, inference speed)
- Correct API request schema (remove temperature, add prompt parameter)
- Fix Docker command references
- Update development setup instructions
- Update troubleshooting section with correct RAM requirements

Files changed (1)
README.md +14 -14
README.md CHANGED
@@ -11,7 +11,7 @@ app_port: 7860

 # Text Summarizer API

-A FastAPI-based text summarization service powered by Ollama and Mistral 7B model.
+A FastAPI-based text summarization service powered by Ollama and Llama 3.2 1B model.

 ## 🚀 Features
 
@@ -36,7 +36,7 @@ Content-Type: application/json
 {
   "text": "Your long text to summarize here...",
   "max_tokens": 256,
-  "temperature": 0.7
+  "prompt": "Summarize the following text concisely:"
 }
 ```
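
For reference, a request matching the updated schema might look like the sketch below. The `/summarize` path is an assumption; the hunk context shows only the `Content-Type` header for this endpoint.

```bash
# Hypothetical request against the updated schema; the /summarize path is
# an assumption and may differ in the actual app.
curl -X POST http://localhost:7860/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Your long text to summarize here...", "max_tokens": 256, "prompt": "Summarize the following text concisely:"}'
```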
 
@@ -48,11 +48,11 @@ Content-Type: application/json

 The service uses the following environment variables:

-- `OLLAMA_MODEL`: Model to use (default: `mistral:7b`)
-- `OLLAMA_HOST`: Ollama service host (default: `http://localhost:11434`)
-- `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `30`)
-- `SERVER_HOST`: Server host (default: `0.0.0.0`)
-- `SERVER_PORT`: Server port (default: `7860`)
+- `OLLAMA_MODEL`: Model to use (default: `llama3.2:1b`)
+- `OLLAMA_HOST`: Ollama service host (default: `http://0.0.0.0:11434`)
+- `OLLAMA_TIMEOUT`: Request timeout in seconds (default: `60`)
+- `SERVER_HOST`: Server host (default: `127.0.0.1`)
+- `SERVER_PORT`: Server port (default: `8000`)
 - `LOG_LEVEL`: Logging level (default: `INFO`)

 ## 🐳 Docker Deployment
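
The updated defaults can be overridden per run; a minimal sketch, assuming `app.main` honors these variables as the README describes:

```bash
# One-off local run with non-default settings (example values only).
# SERVER_HOST/SERVER_PORT are passed as explicit uvicorn flags here; whether
# app.main also reads them from the environment is per the README.
OLLAMA_MODEL=llama3.2:1b OLLAMA_TIMEOUT=60 \
  uvicorn app.main:app --host 127.0.0.1 --port 8000
```
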
@@ -63,7 +63,7 @@ The service uses the following environment variables:
 docker-compose up --build

 # Or run directly
-docker build -f Dockerfile.hf -t summarizer-app .
+docker build -t summarizer-app .
 docker run -p 7860:7860 summarizer-app
 ```
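
The same variables can be injected into the container with standard `docker run -e` flags; a sketch with example values:

```bash
# Override documented defaults inside the container (example values)
docker run -p 7860:7860 \
  -e OLLAMA_MODEL=llama3.2:1b \
  -e OLLAMA_TIMEOUT=60 \
  summarizer-app
```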
 
@@ -72,10 +72,10 @@ This app is configured for deployment on Hugging Face Spaces using Docker SDK.

 ## 📊 Performance

-- **Model**: Mistral 7B (7GB RAM requirement)
-- **Startup time**: ~2-3 minutes (includes model download)
-- **Inference speed**: ~2-5 seconds per request
-- **Memory usage**: ~8GB RAM
+- **Model**: Llama 3.2 1B (~1GB RAM requirement)
+- **Startup time**: ~1-2 minutes (includes model download)
+- **Inference speed**: ~1-3 seconds per request
+- **Memory usage**: ~2GB RAM

 ## 🛠️ Development
 
@@ -85,7 +85,7 @@ This app is configured for deployment on Hugging Face Spaces using Docker SDK.
 pip install -r requirements.txt

 # Run locally
-uvicorn app.main:app --host 0.0.0.0 --port 7860
+uvicorn app.main:app --host 0.0.0.0 --port 8000
 ```

 ### Testing
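
With the dev server now on port 8000, a quick smoke test could look like this; the `/summarize` path is assumed, as above:

```bash
# Send a short document to the locally running server (path assumed)
curl -s -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "FastAPI is a modern Python web framework.", "max_tokens": 64}'
```
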
@@ -146,7 +146,7 @@ The service includes:
 ### Common Issues

 1. **Model not loading**: Check if Ollama is running and model is pulled
-2. **Out of memory**: Ensure sufficient RAM (8GB+) for Mistral 7B
+2. **Out of memory**: Ensure sufficient RAM (2GB+) for Llama 3.2 1B
 3. **Slow startup**: Normal on first run due to model download
 4. **API errors**: Check logs via `/docs` endpoint
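
The first issue (model not loading) can be checked directly with the Ollama CLI, for example:

```bash
# Confirm Ollama is reachable and the new default model is pulled
ollama list                  # lists locally available models
ollama pull llama3.2:1b      # fetches the default model if it is missing
```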
 
 