X-iZhang committed on
Commit f0b4550 · verified · 1 Parent(s): ab78aa1

Update app.py

Files changed (1)
  1. app.py +6 -5
app.py CHANGED
@@ -326,11 +326,11 @@ def generate_ccd_description(
     print(f"[DEBUG] Tokenizer eos_token: {tokenizer.eos_token}")
     print(f"[DEBUG] Tokenizer eos_token_id: {tokenizer.eos_token_id}")
     print(f"[DEBUG] Tokenizer padding_side: {getattr(tokenizer, 'padding_side', 'NOT SET')}")
-    print(f"[DEBUG] Input prompt: {prompt}")
     print(f"[DEBUG] Input image path: {current_img}")
     print(f"[DEBUG] max_new_tokens: {max_new_tokens}")
 
-    prompt = "Please answer the following question: " + prompt
+    prompt = "You are a helpful AI Assistant. " + prompt
+    print(f"[DEBUG] Input prompt: {prompt}")
 
     ccd_output = ccd_eval(
         libra_model=model,
@@ -465,11 +465,12 @@ def main():
     This demo is running on **CPU-only** mode. A single inference may take **5-10 minutes** depending on the model and parameters.
 
     **Recommendations for faster inference:**
-    - Use smaller models (Libra-v1.0-3B is faster than 7B models)
+    - Use smaller models (Libra-v1.0-3B is faster than 7B models) The model has already been loaded ⏬
+    - Please do not attempt to load other models, as this may cause a runtime error: "Workload evicted, storage limit exceeded (50G)"
     - Reduce `Max New Tokens` to 64-128 (default: 128)
     - Disable baseline comparison
     - For GPU acceleration, please [run the demo locally](https://github.com/X-iZhang/CCD#gradio-web-interface)
 
     **Note:** If you see "Connection Lost", please wait - the inference is still running. The results will appear when complete.
     """)
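The functional change in this commit is small: the prompt prefix is swapped from "Please answer the following question: " to a system-style instruction, and the prompt debug log is moved to after prefixing so the logged string matches what the model actually receives. A minimal sketch of the post-commit behavior, assuming a hypothetical `build_prompt` helper (only the prefix string and the log ordering come from the diff; `ccd_eval` and the model objects are not reproduced here):

```python
def build_prompt(user_prompt: str) -> str:
    # Post-commit behavior: prepend a system-style instruction instead of
    # the old "Please answer the following question: " prefix.
    return "You are a helpful AI Assistant. " + user_prompt


if __name__ == "__main__":
    prompt = build_prompt("Describe the findings in this chest X-ray.")
    # The debug print now runs *after* prefixing, so the logged prompt
    # is exactly the string passed on to generation.
    print(f"[DEBUG] Input prompt: {prompt}")
```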