Update app.py
app.py CHANGED
@@ -326,11 +326,11 @@ def generate_ccd_description(
     print(f"[DEBUG] Tokenizer eos_token: {tokenizer.eos_token}")
     print(f"[DEBUG] Tokenizer eos_token_id: {tokenizer.eos_token_id}")
     print(f"[DEBUG] Tokenizer padding_side: {getattr(tokenizer, 'padding_side', 'NOT SET')}")
-    print(f"[DEBUG] Input prompt: {prompt}")
     print(f"[DEBUG] Input image path: {current_img}")
-    print(f"[DEBUG] max_new_tokens: {max_new_tokens}")
+    print(f"[DEBUG] max_new_tokens: {max_new_tokens}")

-    prompt = "
+    prompt = "You are a helpful AI Assistant. " + prompt
+    print(f"[DEBUG] Input prompt: {prompt}")

     ccd_output = ccd_eval(
         libra_model=model,
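The net effect of this hunk: the prompt is prefixed with a system-style string before decoding, and the `[DEBUG] Input prompt` log now runs after that rewrite, so it records exactly what is sent to the model. A minimal sketch of the resulting flow, assuming `ccd_eval` is imported from the CCD codebase; only `libra_model=model` is visible in the diff, so the remaining keyword arguments are illustrative placeholders:

```python
# Minimal sketch of the updated flow. `ccd_eval` is assumed to come from
# the CCD codebase, and every parameter except `libra_model` is a
# hypothetical placeholder (the diff does not show the full call).
def generate_ccd_description(model, tokenizer, prompt, current_img, max_new_tokens=128):
    print(f"[DEBUG] Tokenizer eos_token: {tokenizer.eos_token}")
    print(f"[DEBUG] Input image path: {current_img}")
    print(f"[DEBUG] max_new_tokens: {max_new_tokens}")

    # Prefix first, then log, so the log shows the prompt actually sent.
    prompt = "You are a helpful AI Assistant. " + prompt
    print(f"[DEBUG] Input prompt: {prompt}")

    ccd_output = ccd_eval(
        libra_model=model,              # shown in the diff
        prompt=prompt,                  # assumed keyword
        image_path=current_img,         # assumed keyword
        max_new_tokens=max_new_tokens,  # assumed keyword
    )
    return ccd_output
```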
@@ -465,11 +465,12 @@ def main():
     This demo is running in **CPU-only** mode. A single inference may take **5-10 minutes** depending on the model and parameters.

     **Recommendations for faster inference:**
-    - Use smaller models (Libra-v1.0-3B is faster than 7B models)
+    - Use smaller models (Libra-v1.0-3B is faster than 7B models). The model has already been loaded ⏬
+    - Please do not attempt to load other models, as this may cause a runtime error: "Workload evicted, storage limit exceeded (50G)"
     - Reduce `Max New Tokens` to 64-128 (default: 128)
     - Disable baseline comparison
     - For GPU acceleration, please [run the demo locally](https://github.com/X-iZhang/CCD#gradio-web-interface)
-
+
     **Note:** If you see "Connection Lost", please wait - the inference is still running. The results will appear when complete.
     """)

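For context, the notice above is rendered with `gr.Markdown` (its closing `""")` is visible in the diff). A minimal, self-contained sketch of how the `Max New Tokens` control it references is typically wired in Gradio; the widget name, range, and step are assumptions, not taken from the diff:

```python
import gradio as gr

with gr.Blocks() as demo:
    # The CPU-mode notice from the diff, rendered as Markdown.
    gr.Markdown("""
    **Recommendations for faster inference:**
    - Reduce `Max New Tokens` to 64-128 (default: 128)
    """)
    # Hypothetical slider backing the `Max New Tokens` setting;
    # the range and step are illustrative assumptions.
    max_new_tokens = gr.Slider(
        minimum=16, maximum=512, value=128, step=16,
        label="Max New Tokens",
    )

demo.launch()
```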