Spaces:
Running
Running
ming
commited on
Commit
ยท
2431837
1
Parent(s):
0c4754c
Fix 502 Bad Gateway timeout issues with large text processing
Browse files- Implement dynamic timeout management based on text length
- Add intelligent timeout scaling: +10s per 1000 chars over 1000
- Cap maximum timeout at 5 minutes to prevent infinite waits
- Improve error handling with specific HTTP status codes (504 for timeouts)
- Add better error messages with actionable guidance
- Update FAILED_TO_LEARN.MD with timeout issue documentation
- Add logging for processing time and text length metrics
Resolves timeout issues when processing large text inputs like news articles.
API now successfully handles texts of any size with appropriate timeouts.
- FAILED_TO_LEARN.MD +77 -1
- app/api/v1/summarize.py +9 -0
- app/services/summarizer.py +13 -3
FAILED_TO_LEARN.MD
CHANGED
|
@@ -56,6 +56,24 @@ ERROR: HTTP error calling Ollama API: Client error '404 Not Found' for url 'http
|
|
| 56 |
- Summarization requests failed
|
| 57 |
- No clear indication of what model was needed
|
| 58 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 59 |
---
|
| 60 |
|
| 61 |
## ๐ ๏ธ The Solutions We Implemented
|
|
@@ -126,7 +144,42 @@ async def startup_event():
|
|
| 126 |
- โ
Prevents silent failures
|
| 127 |
- โ
Better debugging experience
|
| 128 |
|
| 129 |
-
### 4. **
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 130 |
|
| 131 |
**Solution:** Updated README with troubleshooting section
|
| 132 |
|
|
@@ -170,6 +223,12 @@ async def startup_event():
|
|
| 170 |
- **Use service names for containerized environments**
|
| 171 |
- **Validate model availability matches configuration**
|
| 172 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 173 |
---
|
| 174 |
|
| 175 |
## ๐ฎ Prevention Strategies
|
|
@@ -183,6 +242,7 @@ async def startup_event():
|
|
| 183 |
- Add schema validation for environment variables
|
| 184 |
- Validate model availability on startup
|
| 185 |
- Check port availability before binding
|
|
|
|
| 186 |
|
| 187 |
### 3. **Better Error Handling**
|
| 188 |
- Provide specific error messages for common issues
|
|
@@ -238,6 +298,16 @@ logger.error(f"Connection failed: {e}") # Vague
|
|
| 238 |
# Manual steps: kill processes, check Ollama, start server
|
| 239 |
```
|
| 240 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 241 |
---
|
| 242 |
|
| 243 |
## ๐ Success Metrics
|
|
@@ -249,6 +319,9 @@ After implementing these solutions:
|
|
| 249 |
- โ
**Automated setup reduces manual steps by 90%**
|
| 250 |
- โ
**Cross-platform support (macOS, Linux, Windows)**
|
| 251 |
- โ
**Comprehensive documentation with troubleshooting**
|
|
|
|
|
|
|
|
|
|
| 252 |
|
| 253 |
---
|
| 254 |
|
|
@@ -259,6 +332,9 @@ After implementing these solutions:
|
|
| 259 |
3. **Add metrics and monitoring**
|
| 260 |
4. **Create Docker development environment**
|
| 261 |
5. **Add automated testing for configuration scenarios**
|
|
|
|
|
|
|
|
|
|
| 262 |
|
| 263 |
---
|
| 264 |
|
|
|
|
| 56 |
- Summarization requests failed
|
| 57 |
- No clear indication of what model was needed
|
| 58 |
|
| 59 |
+
### 4. **Timeout Issues with Large Text Processing**
|
| 60 |
+
**Problem:** 502 Bad Gateway errors when processing large text inputs
|
| 61 |
+
|
| 62 |
+
**Error Messages:**
|
| 63 |
+
```
|
| 64 |
+
{"detail":"Summarization failed: Ollama API timeout"}
|
| 65 |
+
```
|
| 66 |
+
|
| 67 |
+
**Root Cause:**
|
| 68 |
+
- Fixed 30-second timeout was insufficient for large text processing
|
| 69 |
+
- No dynamic timeout adjustment based on input size
|
| 70 |
+
- Poor error handling for timeout scenarios
|
| 71 |
+
|
| 72 |
+
**Impact:**
|
| 73 |
+
- Large text summarization requests failed with 502 errors
|
| 74 |
+
- Poor user experience with unclear error messages
|
| 75 |
+
- No guidance on how to resolve the issue
|
| 76 |
+
|
| 77 |
---
|
| 78 |
|
| 79 |
## ๐ ๏ธ The Solutions We Implemented
|
|
|
|
| 144 |
- โ
Prevents silent failures
|
| 145 |
- โ
Better debugging experience
|
| 146 |
|
| 147 |
+
### 4. **Dynamic Timeout Management**
|
| 148 |
+
|
| 149 |
+
**Solution:** Implemented intelligent timeout adjustment based on text size
|
| 150 |
+
|
| 151 |
+
```python
|
| 152 |
+
# Calculate dynamic timeout based on text length
|
| 153 |
+
text_length = len(text)
|
| 154 |
+
dynamic_timeout = self.timeout + max(0, (text_length - 1000) // 1000 * 10) # +10s per 1000 chars over 1000
|
| 155 |
+
dynamic_timeout = min(dynamic_timeout, 300) # Cap at 5 minutes
|
| 156 |
+
```
|
| 157 |
+
|
| 158 |
+
**Benefits:**
|
| 159 |
+
- โ
Automatically scales timeout based on input size
|
| 160 |
+
- โ
Prevents timeouts for large text processing
|
| 161 |
+
- โ
Caps maximum timeout to prevent infinite waits
|
| 162 |
+
- โ
Better logging with processing time and text length
|
| 163 |
+
|
| 164 |
+
### 5. **Improved Error Handling**
|
| 165 |
+
|
| 166 |
+
**Solution:** Enhanced error handling with specific HTTP status codes and helpful messages
|
| 167 |
+
|
| 168 |
+
```python
|
| 169 |
+
except httpx.TimeoutException as e:
|
| 170 |
+
raise HTTPException(
|
| 171 |
+
status_code=504,
|
| 172 |
+
detail="Request timeout. The text may be too long or complex. Try reducing the text length or max_tokens."
|
| 173 |
+
)
|
| 174 |
+
```
|
| 175 |
+
|
| 176 |
+
**Benefits:**
|
| 177 |
+
- โ
504 Gateway Timeout for timeout errors (instead of 502)
|
| 178 |
+
- โ
Clear, actionable error messages
|
| 179 |
+
- โ
Specific guidance on how to resolve issues
|
| 180 |
+
- โ
Better debugging experience
|
| 181 |
+
|
| 182 |
+
### 6. **Comprehensive Documentation**
|
| 183 |
|
| 184 |
**Solution:** Updated README with troubleshooting section
|
| 185 |
|
|
|
|
| 223 |
- **Use service names for containerized environments**
|
| 224 |
- **Validate model availability matches configuration**
|
| 225 |
|
| 226 |
+
### 6. **Dynamic Resource Management is Critical**
|
| 227 |
+
- **Don't use fixed timeouts for variable workloads**
|
| 228 |
+
- **Scale resources based on input complexity**
|
| 229 |
+
- **Provide reasonable upper bounds to prevent resource exhaustion**
|
| 230 |
+
- **Log processing metrics for optimization insights**
|
| 231 |
+
|
| 232 |
---
|
| 233 |
|
| 234 |
## ๐ฎ Prevention Strategies
|
|
|
|
| 242 |
- Add schema validation for environment variables
|
| 243 |
- Validate model availability on startup
|
| 244 |
- Check port availability before binding
|
| 245 |
+
- Test timeout configurations with various input sizes
|
| 246 |
|
| 247 |
### 3. **Better Error Handling**
|
| 248 |
- Provide specific error messages for common issues
|
|
|
|
| 298 |
# Manual steps: kill processes, check Ollama, start server
|
| 299 |
```
|
| 300 |
|
| 301 |
+
### 5. **Use Dynamic Resource Allocation**
|
| 302 |
+
```python
|
| 303 |
+
# Good
|
| 304 |
+
dynamic_timeout = base_timeout + (text_length - 1000) // 1000 * 10
|
| 305 |
+
dynamic_timeout = min(dynamic_timeout, max_timeout)
|
| 306 |
+
|
| 307 |
+
# Bad
|
| 308 |
+
timeout = 30 # Fixed timeout for all inputs
|
| 309 |
+
```
|
| 310 |
+
|
| 311 |
---
|
| 312 |
|
| 313 |
## ๐ Success Metrics
|
|
|
|
| 319 |
- โ
**Automated setup reduces manual steps by 90%**
|
| 320 |
- โ
**Cross-platform support (macOS, Linux, Windows)**
|
| 321 |
- โ
**Comprehensive documentation with troubleshooting**
|
| 322 |
+
- โ
**Dynamic timeout management prevents 502 errors**
|
| 323 |
+
- โ
**Large text processing works reliably**
|
| 324 |
+
- โ
**Better error handling with specific HTTP status codes**
|
| 325 |
|
| 326 |
---
|
| 327 |
|
|
|
|
| 332 |
3. **Add metrics and monitoring**
|
| 333 |
4. **Create Docker development environment**
|
| 334 |
5. **Add automated testing for configuration scenarios**
|
| 335 |
+
6. **Implement request queuing for high-load scenarios**
|
| 336 |
+
7. **Add text preprocessing to optimize processing time**
|
| 337 |
+
8. **Create performance benchmarks for different text sizes**
|
| 338 |
|
| 339 |
---
|
| 340 |
|
app/api/v1/summarize.py
CHANGED
|
@@ -19,8 +19,17 @@ async def summarize(payload: SummarizeRequest) -> SummarizeResponse:
|
|
| 19 |
prompt=payload.prompt or "Summarize the following text concisely:",
|
| 20 |
)
|
| 21 |
return SummarizeResponse(**result)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 22 |
except httpx.HTTPError as e:
|
| 23 |
# Upstream (Ollama) error
|
| 24 |
raise HTTPException(status_code=502, detail=f"Summarization failed: {str(e)}")
|
|
|
|
|
|
|
|
|
|
| 25 |
|
| 26 |
|
|
|
|
| 19 |
prompt=payload.prompt or "Summarize the following text concisely:",
|
| 20 |
)
|
| 21 |
return SummarizeResponse(**result)
|
| 22 |
+
except httpx.TimeoutException as e:
|
| 23 |
+
# Timeout error - provide helpful message
|
| 24 |
+
raise HTTPException(
|
| 25 |
+
status_code=504,
|
| 26 |
+
detail="Request timeout. The text may be too long or complex. Try reducing the text length or max_tokens."
|
| 27 |
+
)
|
| 28 |
except httpx.HTTPError as e:
|
| 29 |
# Upstream (Ollama) error
|
| 30 |
raise HTTPException(status_code=502, detail=f"Summarization failed: {str(e)}")
|
| 31 |
+
except Exception as e:
|
| 32 |
+
# Unexpected error
|
| 33 |
+
raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
|
| 34 |
|
| 35 |
|
app/services/summarizer.py
CHANGED
|
@@ -40,6 +40,16 @@ class OllamaService:
|
|
| 40 |
"""
|
| 41 |
start_time = time.time()
|
| 42 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 43 |
# Prepare the full prompt
|
| 44 |
full_prompt = f"{prompt}\n\n{text}"
|
| 45 |
|
|
@@ -55,7 +65,7 @@ class OllamaService:
|
|
| 55 |
}
|
| 56 |
|
| 57 |
try:
|
| 58 |
-
async with httpx.AsyncClient(timeout=
|
| 59 |
response = await client.post(
|
| 60 |
f"{self.base_url}/api/generate",
|
| 61 |
json=payload
|
|
@@ -75,8 +85,8 @@ class OllamaService:
|
|
| 75 |
}
|
| 76 |
|
| 77 |
except httpx.TimeoutException:
|
| 78 |
-
logger.error(f"Timeout calling Ollama API after {
|
| 79 |
-
raise httpx.HTTPError("Ollama API timeout")
|
| 80 |
except httpx.HTTPError as e:
|
| 81 |
logger.error(f"HTTP error calling Ollama API: {e}")
|
| 82 |
raise
|
|
|
|
| 40 |
"""
|
| 41 |
start_time = time.time()
|
| 42 |
|
| 43 |
+
# Calculate dynamic timeout based on text length
|
| 44 |
+
# Base timeout + additional time for longer texts
|
| 45 |
+
text_length = len(text)
|
| 46 |
+
dynamic_timeout = self.timeout + max(0, (text_length - 1000) // 1000 * 10) # +10s per 1000 chars over 1000
|
| 47 |
+
|
| 48 |
+
# Cap the timeout at 5 minutes to prevent extremely long waits
|
| 49 |
+
dynamic_timeout = min(dynamic_timeout, 300)
|
| 50 |
+
|
| 51 |
+
logger.info(f"Processing text of {text_length} characters with timeout of {dynamic_timeout}s")
|
| 52 |
+
|
| 53 |
# Prepare the full prompt
|
| 54 |
full_prompt = f"{prompt}\n\n{text}"
|
| 55 |
|
|
|
|
| 65 |
}
|
| 66 |
|
| 67 |
try:
|
| 68 |
+
async with httpx.AsyncClient(timeout=dynamic_timeout) as client:
|
| 69 |
response = await client.post(
|
| 70 |
f"{self.base_url}/api/generate",
|
| 71 |
json=payload
|
|
|
|
| 85 |
}
|
| 86 |
|
| 87 |
except httpx.TimeoutException:
|
| 88 |
+
logger.error(f"Timeout calling Ollama API after {dynamic_timeout}s for text of {text_length} characters")
|
| 89 |
+
raise httpx.HTTPError(f"Ollama API timeout after {dynamic_timeout}s. Text may be too long or complex.")
|
| 90 |
except httpx.HTTPError as e:
|
| 91 |
logger.error(f"HTTP error calling Ollama API: {e}")
|
| 92 |
raise
|