danidanidani/GRDN.AI.3
danidanidani committed 21 days ago

Commit c0083b8 (1 parent: 68157ea)

fix: Aggressive LLM output cleaning + stricter generation

Issues with previous output:
- 'Next plant:', 'Lastly:', 'I hope these tips are helpful!' appearing
- Rambling paragraphs instead of concise format
- Leaked instructions in output

Fixes applied:
1. AGGRESSIVE POST-PROCESSING:
- Remove 15+ common unwanted phrases (case-insensitive)
- Filter out lines starting with 'I hope', 'Here are', 'Next plant', etc
- Drop any line shorter than 50 characters that contains 'helpful'
- Multiple passes to ensure clean output

2. STRICTER LLM PARAMETERS:
- Reduced max_tokens: 800 → 600 (force conciseness)
- Lower temperature: 0.3 → 0.1 (more focused)
- Lower top_p: 0.95 → 0.9 (less randomness)
- Better repeat_penalty already set
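
A minimal sketch of how the three stricter settings could be kept in one place and passed through to `complete`. The names `GenerationParams`, `RecordingLLM`, and `complete_strict` are hypothetical and not part of this commit; the stub only stands in for `st.session_state.llm` so the example runs on its own.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GenerationParams:
    """Hypothetical container for the values this commit settles on."""
    max_tokens: int = 600     # was 800
    temperature: float = 0.1  # was 0.3
    top_p: float = 0.9        # was 0.95


class RecordingLLM:
    """Stand-in for the real LLM object (assumption): records the kwargs it receives."""

    def __init__(self):
        self.last_kwargs = None

    def complete(self, prompt, **kwargs):
        self.last_kwargs = kwargs
        return prompt


def complete_strict(llm, prompt, params=GenerationParams()):
    """Call llm.complete with the strict parameters expanded as keyword arguments."""
    return llm.complete(prompt, **asdict(params))


llm = RecordingLLM()
complete_strict(llm, "Plants: Basil")
print(llm.last_kwargs)
```

Keeping the values in one frozen object means a later tuning pass changes a single definition rather than every call site.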

3. IMPROVED PROMPT:
- Limit to 6 plants (not 8) for quality
- Two format examples instead of one
- Explicit RULES section
- 'No extra text' in system prompt
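
The plant limit can be sketched as a small helper. This mirrors the slicing and the "(and N more)" suffix from the diff, but the function name `format_plant_names` and its `limit` parameter are assumptions introduced for illustration; the commit inlines this logic and reads from `st.session_state.input_plants_raw`.

```python
def format_plant_names(plants, limit=6):
    """Join the first `limit` plant names and note how many were left out."""
    names = ", ".join(str(p) for p in plants[:limit])
    extra = len(plants) - limit
    if extra > 0:
        names += f" (and {extra} more)"
    return names


print(format_plant_names(["Basil", "Mint"]))
print(format_plant_names(["A", "B", "C", "D", "E", "F", "G", "H"]))
```

Passing the list in as an argument, rather than reading session state inside the function, makes the truncation easy to test in isolation.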

Result: Known filler phrases and leaked instructions are stripped even if the LLM ignores the prompt

Files changed (1)

src/backend/chatbot.py +61 -32


@@ -190,7 +190,13 @@ def chat_response(template, prompt_text, model, demo_lite):
         print(f"LLM prompt length: {len(full_prompt)} chars")
         try:
-            response = st.session_state.llm.complete(full_prompt, max_tokens=800)
             print(f"LLM response length: {len(response.text)} chars")
             return response.text
         except Exception as e:
@@ -213,52 +219,75 @@ def get_plant_care_tips(plant_list, model, demo_lite):
     plant_care_tips = ""
     # Create a clean, comma-separated list of plants
-    plant_names = ", ".join(str(p) for p in st.session_state.input_plants_raw[:8])  # Limit to first 8 plants
-    if len(st.session_state.input_plants_raw) > 8:
-        plant_names += f" (and {len(st.session_state.input_plants_raw) - 8} more)"
-    # Clear prompt that won't leak instructions into output
-    template = "You are a helpful gardening expert."
-    text = f"""Provide care tips for these plants: {plant_names}
-For each plant, give:
-- Sunlight requirements
-- Watering schedule
-- USDA hardiness zones
-- One practical tip
-Format each plant like this example:
 Tomatoes
-Sunlight: Full sun (6-8 hours)
-Water: Deep watering 2-3 times per week
-Zones: 3-11
-Tip: Prune suckers for larger fruit
-Now provide tips for my plants. Start immediately with the first plant name."""
     plant_care_tips = chat_response(template, text, model, demo_lite)
-    print("Plant care tips response:", plant_care_tips)
     # Safety check for None response
     if plant_care_tips is None:
         return "Error: Could not generate plant care tips. Please try again or select a different model."
-    # Clean up the response - remove any leaked instructions
     plant_care_tips = plant_care_tips.strip()
-    # Remove common leaked phrases
-    phrases_to_remove = [
-        "Keep it concise",
-        "Keep it BRIEF",
-        "Do NOT repeat yourself",
-        "Do NOT add extra headers",
-        "Just the plant tips",
-        "Start immediately with the first plant name"
     ]
-    for phrase in phrases_to_remove:
-        if phrase in plant_care_tips:
-            plant_care_tips = plant_care_tips.replace(phrase, "")
     # Bold the plant names by detecting lines that are likely plant names
     # (lines with no colons that come before lines with colons)

         print(f"LLM prompt length: {len(full_prompt)} chars")
         try:
+            # Use stricter generation parameters to reduce fluff
+            response = st.session_state.llm.complete(
+                full_prompt,
+                max_tokens=600,  # Reduced from 800 to force conciseness
+                temperature=0.1,  # Lower temperature for more focused output
+                top_p=0.9,  # Slightly lower for less randomness
+            )
             print(f"LLM response length: {len(response.text)} chars")
             return response.text
         except Exception as e:
     plant_care_tips = ""
     # Create a clean, comma-separated list of plants
+    plant_names = ", ".join(str(p) for p in st.session_state.input_plants_raw[:6])  # Limit to first 6 plants for conciseness
+    if len(st.session_state.input_plants_raw) > 6:
+        plant_names += f" (and {len(st.session_state.input_plants_raw) - 6} more)"
+    # Very strict prompt with clear example - no fluff allowed
+    template = "You are a gardening expert. Follow the format exactly. No extra text."
+    text = f"""Plants: {plant_names}
+RULES:
+- Use EXACTLY this format for each plant
+- NO introductions, NO conclusions, NO "Next plant", NO "I hope"
+- Just plant name, then 4 lines of info
+FORMAT EXAMPLE:
 Tomatoes
+Sunlight: Full sun (6-8 hours daily)
+Water: Deep soak twice weekly
+Zones: 5-9
+Tip: Support with stakes or cages
+Carrots
+Sunlight: Full sun (6 hours minimum)
+Water: Light watering every 3 days
+Zones: 3-10
+Tip: Thin seedlings to 2 inches apart
+YOUR TURN - provide tips for the plants above using EXACTLY this format:"""
     plant_care_tips = chat_response(template, text, model, demo_lite)
+    # Guard the slice: chat_response can return None, and None[:200] would raise TypeError
+    print("Plant care tips RAW response:", (plant_care_tips or "")[:200])
     # Safety check for None response
     if plant_care_tips is None:
         return "Error: Could not generate plant care tips. Please try again or select a different model."
+    # AGGRESSIVE CLEANING - remove all unwanted text
     plant_care_tips = plant_care_tips.strip()
+    # Remove common unwanted phrases (case-insensitive)
+    unwanted_phrases = [
+        "Keep it concise", "Keep it BRIEF", "I hope these tips are helpful",
+        "I hope this helps", "hope this is helpful", "Next plant:",
+        "Lastly:", "Last but not least", "Here are", "Here's",
+        "Do NOT repeat yourself", "Do NOT add extra headers",
+        "Just the plant tips", "Start immediately",
+        "YOUR TURN", "RULES:", "FORMAT EXAMPLE:",
+        "Plants:", "provide tips for"
     ]
+    import re
+    for phrase in unwanted_phrases:
+        # Remove case-insensitive
+        plant_care_tips = re.sub(re.escape(phrase), "", plant_care_tips, flags=re.IGNORECASE)
+    # Remove any lines that start with common unwanted patterns
+    lines = plant_care_tips.split('\n')
+    cleaned_lines = []
+    for line in lines:
+        line_stripped = line.strip()
+        # Skip empty lines or lines with unwanted patterns
+        if not line_stripped:
+            continue
+        if line_stripped.lower().startswith(('i hope', 'here are', 'here is', 'next plant', 'lastly', 'last but')):
+            continue
+        if 'helpful' in line_stripped.lower() and len(line_stripped) < 50:
+            continue
+        cleaned_lines.append(line)
+    plant_care_tips = '\n'.join(cleaned_lines).strip()
     # Bold the plant names by detecting lines that are likely plant names
     # (lines with no colons that come before lines with colons)
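
The cleaning pass can be sketched as a standalone function. This is a variant, not the commit's code. It uses a subset of the phrase list and the same prefix tuple, but it filters whole lines first and only then strips phrases. In the commit's order, removing "Here are" from "Here are your tips:" leaves "your tips:" behind, which the prefix filter no longer catches. The names `clean_llm_output`, `UNWANTED_PHRASES`, and `SKIP_PREFIXES` are hypothetical.

```python
import re

# Subset of the commit's phrase list, for illustration only
UNWANTED_PHRASES = [
    "I hope these tips are helpful", "Next plant:", "Lastly:",
    "Here are", "YOUR TURN", "RULES:", "FORMAT EXAMPLE:",
]
# Same prefixes the commit checks with str.startswith
SKIP_PREFIXES = ("i hope", "here are", "here is", "next plant", "lastly", "last but")


def clean_llm_output(text):
    """Drop filler lines first, then strip leftover phrases from the surviving lines."""
    kept = []
    for line in text.split("\n"):
        stripped = line.strip()
        if not stripped:
            continue
        lowered = stripped.lower()
        if lowered.startswith(SKIP_PREFIXES):
            continue
        if "helpful" in lowered and len(stripped) < 50:
            continue
        for phrase in UNWANTED_PHRASES:
            stripped = re.sub(re.escape(phrase), "", stripped, flags=re.IGNORECASE).strip()
        if stripped:  # a line made up only of an unwanted phrase is now empty
            kept.append(stripped)
    return "\n".join(kept)


raw = (
    "Here are your tips:\n"
    "Basil\n"
    "Sunlight: Full sun\n"
    "Next plant:\n"
    "Mint\n"
    "Water: Keep soil moist\n"
    "I hope these tips are helpful!"
)
print(clean_llm_output(raw))
```

One caveat applies to both orderings: case-insensitive substring removal can also hit legitimate text. For example, stripping "Plants:" would alter a tip such as "Companion plants: basil", so short, generic phrases are safer handled by the line-prefix filter than by substitution.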