X-iZhang committed on
Commit f0b4550 · verified · 1 Parent(s): ab78aa1

Update app.py

Files changed (1)
  1. app.py +6 -5
app.py CHANGED
@@ -326,11 +326,11 @@ def generate_ccd_description(
     print(f"[DEBUG] Tokenizer eos_token: {tokenizer.eos_token}")
     print(f"[DEBUG] Tokenizer eos_token_id: {tokenizer.eos_token_id}")
     print(f"[DEBUG] Tokenizer padding_side: {getattr(tokenizer, 'padding_side', 'NOT SET')}")
-    print(f"[DEBUG] Input prompt: {prompt}")
     print(f"[DEBUG] Input image path: {current_img}")
     print(f"[DEBUG] max_new_tokens: {max_new_tokens}")
 
-    prompt = "Please answer the following question: " + prompt
+    prompt = "You are a helpful AI Assistant. " + prompt
+    print(f"[DEBUG] Input prompt: {prompt}")
 
     ccd_output = ccd_eval(
         libra_model=model,
@@ -465,11 +465,12 @@ def main():
     This demo is running on **CPU-only** mode. A single inference may take **5-10 minutes** depending on the model and parameters.
 
     **Recommendations for faster inference:**
-    - Use smaller models (Libra-v1.0-3B is faster than 7B models)
+    - Use smaller models (Libra-v1.0-3B is faster than 7B models) The model has already been loaded ⏬
+    - Please do not attempt to load other models, as this may cause a runtime error: "Workload evicted, storage limit exceeded (50G)"
     - Reduce `Max New Tokens` to 64-128 (default: 128)
     - Disable baseline comparison
     - For GPU acceleration, please [run the demo locally](https://github.com/X-iZhang/CCD#gradio-web-interface)
 
     **Note:** If you see "Connection Lost", please wait - the inference is still running. The results will appear when complete.
     """)
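The functional change in this commit is small: the prompt prefix is swapped from "Please answer the following question: " to a system-style instruction, and the prompt debug log is moved to after prefixing so the logged string matches what the model actually receives. A minimal sketch of the post-commit behavior, assuming a hypothetical `build_prompt` helper (only the prefix string and the log ordering come from the diff; `ccd_eval` and the model objects are not reproduced here):

```python
def build_prompt(user_prompt: str) -> str:
    # Post-commit behavior: prepend a system-style instruction instead of
    # the old "Please answer the following question: " prefix.
    return "You are a helpful AI Assistant. " + user_prompt


if __name__ == "__main__":
    prompt = build_prompt("Describe the findings in this chest X-ray.")
    # The debug print now runs *after* prefixing, so the logged prompt
    # is exactly the string passed on to generation.
    print(f"[DEBUG] Input prompt: {prompt}")
```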