Tonic committed on
Commit 19b19f0 · 1 Parent(s): 6f9d970

tries to download the model at build time

Files changed (11)
  1. .gitignore +82 -0
  2. README.md +178 -1
  3. app.py +303 -0
  4. build.py +113 -0
  5. config.yaml +51 -0
  6. deploy.py +181 -0
  7. download_model.py +118 -0
  8. download_model_advanced.py +206 -0
  9. requirements.txt +11 -0
  10. test_app.py +212 -0
  11. test_model_loading.py +81 -0
.gitignore ADDED
@@ -0,0 +1,82 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyTorch
+ *.pth
+ *.pt
+
+ # Model files (if downloaded locally)
+ models/
+ checkpoints/
+ int4/
+ *.safetensors
+ *.bin
+ *.json
+ !requirements.txt
+ !config.yaml
+ !app.py
+ !README.md
+ !test_app.py
+ !deploy.py
+
+ # Hugging Face cache
+ .cache/
+ .huggingface/
+ transformers_cache/
+
+ # Logs
+ *.log
+ logs/
+
+ # Environment
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ .DS_Store?
+ ._*
+ .Spotlight-V100
+ .Trashes
+ ehthumbs.db
+ Thumbs.db
+
+ # Gradio
+ gradio_cached_examples/
+ flagged/
+
+ # Temporary files
+ *.tmp
+ *.temp
README.md CHANGED
@@ -1,6 +1,6 @@
---
title: Petite LLM 3
- emoji: 🦀
colorFrom: green
colorTo: purple
sdk: gradio
@@ -11,4 +11,181 @@ license: mit
short_description: Smollm3 for French Understanding
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

---
title: Petite LLM 3
+ emoji: 💃🏻
colorFrom: green
colorTo: purple
sdk: gradio

short_description: Smollm3 for French Understanding
---

+ # 🤖 Petite Elle L'Aime 3 - Chat Interface
+
+ A complete Gradio application for the [Petite Elle L'Aime 3](https://huggingface.co/Tonic/petite-elle-L-aime-3-sft) model, featuring the int4 quantized version for efficient CPU deployment.
+
+ ## 🚀 Features
+
+ - **Multilingual Support**: English, French, Italian, Portuguese, Chinese, Arabic
+ - **Int4 Quantization**: Optimized for CPU deployment with ~50% memory reduction
+ - **Interactive Chat Interface**: Real-time conversation with the model
+ - **Customizable System Prompt**: Define the assistant's personality and behavior
+ - **Thinking Mode**: Enable reasoning mode with thinking tags
+ - **Responsive Design**: Modern UI following the reference layout
+ - **Chat Template Integration**: Proper Jinja template formatting
+ - **Automatic Model Download**: Downloads the int4 model at build time
+
+ ## 📋 Model Information
+
+ - **Base Model**: SmolLM3-3B
+ - **Parameters**: ~3B
+ - **Context Length**: 128k
+ - **Quantization**: int4 (CPU optimized)
+ - **Memory Reduction**: ~50%
+ - **Languages**: English, French, Italian, Portuguese, Chinese, Arabic
+
+ ## 🛠️ Installation
+
+ 1. Clone this repository:
+ ```bash
+ git clone <repository-url>
+ cd Petite-LLM-3
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## 🚀 Usage
+
+ ### Local Development
+
+ Run the application locally:
+ ```bash
+ python app.py
+ ```
+
+ The application will be available at `http://localhost:7860`.
+
+ ### Hugging Face Spaces
+
+ This application is configured for deployment on Hugging Face Spaces with automatic model download:
+
+ 1. **Build Process**: The `build.py` script downloads the int4 model during the Space build
+ 2. **Model Loading**: Uses local model files when available, and falls back to downloading from Hugging Face
+ 3. **Caching**: Model files are cached for faster subsequent runs
+
+ ## 🎛️ Interface Features
+
+ ### Layout Structure
+ The interface follows the reference layout with:
+ - **Title Section**: Main heading and description
+ - **Information Panels**: Features and model information
+ - **Input Section**: Context and user input areas
+ - **Advanced Settings**: Collapsible parameter controls
+ - **Chat Interface**: Real-time conversation display
+
+ ### System Prompt
+ - **Default**: "Tu es TonicIA, un assistant francophone rigoureux et bienveillant."
+ - **Editable**: Users can customize the system prompt to define the assistant's personality
+ - **Real-time**: Changes take effect immediately for new conversations
+
+ ### Generation Parameters
+ - **Max Length**: Maximum number of tokens to generate (64-2048)
+ - **Temperature**: Controls randomness in generation (0.01-1.0)
+ - **Top-p**: Nucleus sampling parameter (0.1-1.0)
+ - **Enable Thinking**: Enable reasoning mode with thinking tags
+ - **Advanced Settings**: Collapsible panel for fine-tuning (see the sketch below)
+
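+ The sketch below shows roughly how these parameters feed `model.generate` in `app.py`; the values mirror the slider defaults above and are illustrative, not a separate API:
+
+ ```python
+ # Minimal sketch (assumes `model`, `tokenizer`, and tokenized `inputs` as prepared in app.py)
+ output_ids = model.generate(
+     inputs["input_ids"],
+     max_new_tokens=512,        # "Max Length" slider (64-2048)
+     temperature=0.7,           # "Temperature" slider (0.01-1.0)
+     top_p=0.9,                 # "Top-p" slider (0.1-1.0)
+     do_sample=True,            # tied to the "Advanced Settings" checkbox
+     attention_mask=inputs["attention_mask"],
+     pad_token_id=tokenizer.eos_token_id,
+ )
+ ```
+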
+ ## 🔧 Technical Details
+
+ ### Model Loading Strategy
+ The application uses a tiered loading strategy, sketched below:
+
+ 1. **Local Check**: First checks whether the int4 model files exist locally
+ 2. **Local Loading**: If available, loads from the `./int4` folder
+ 3. **Fallback Download**: Otherwise, downloads from Hugging Face
+ 4. **Tokenizer**: Always uses the main repo for the chat template and configuration
+
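+ Condensed from `app.py`, the strategy looks like this (a sketch, without the full error handling):
+
+ ```python
+ # Local-first loading with Hub fallback (condensed from app.py)
+ if check_local_model():                      # all required files under ./int4?
+     tokenizer = AutoTokenizer.from_pretrained(LOCAL_MODEL_PATH)
+     model = AutoModelForCausalLM.from_pretrained(LOCAL_MODEL_PATH, trust_remote_code=True)
+ else:
+     tokenizer = AutoTokenizer.from_pretrained(MAIN_MODEL_ID)   # chat template lives here
+     model = AutoModelForCausalLM.from_pretrained(INT4_MODEL_ID, trust_remote_code=True)
+ ```
+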
+ ### Build Process
+ For Hugging Face Spaces deployment:
+
+ 1. **Build Script**: `build.py` runs during the Space build
+ 2. **Model Download**: `download_model.py` downloads the int4 model files
+ 3. **Local Storage**: Model files are stored in the `./int4` directory
+ 4. **Fast Loading**: Subsequent runs use the local files
+
+ ### Chat Template Integration
+ The application uses the model's custom chat template, which supports (see the example below):
+ - System prompt integration
+ - User and assistant message formatting
+ - Thinking mode with `<think>` tags
+ - Proper conversation flow management
+
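+ For example, `app.py` builds the prompt like this; `enable_thinking` is forwarded to the template, and `/no_think` is appended when thinking is off:
+
+ ```python
+ # Prompt construction via the model's chat template (as in app.py)
+ messages = [
+     {"role": "system", "content": "Tu es TonicIA, un assistant francophone rigoureux et bienveillant."},
+     {"role": "user", "content": "Bonjour !"},
+ ]
+ prompt = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True,
+     enable_thinking=True,   # set False to suppress <think> reasoning
+ )
+ # app.py appends " /no_think" to the prompt when thinking is disabled
+ ```
+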
+ ### Memory Optimization
+ - Uses int4 quantization for a reduced memory footprint
+ - Automatic device detection (CUDA/CPU)
+ - Efficient tokenization and generation
+
+ ## 📝 Example Usage
+
+ 1. **Basic Conversation**:
+    - Add context in the system prompt area
+    - Type your message in the user input box
+    - Click the generate button to start chatting
+
+ 2. **Customizing System Prompt**:
+    - Edit the context in the dedicated text area
+    - Changes apply to new messages immediately
+    - Example: "Tu es un expert en programmation Python."
+
+ 3. **Advanced Settings**:
+    - Check the "Advanced Settings" checkbox
+    - Adjust generation parameters as needed
+    - Enable/disable thinking mode
+
+ 4. **Real-time Chat**:
+    - Messages appear in the chat interface
+    - Conversation history is maintained
+    - Responses are generated using the model's chat template
+
+ ## 🐛 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Model Loading Errors**:
+    - Ensure you have sufficient RAM (8GB+ recommended)
+    - Check your internet connection for the model download
+    - Verify all dependencies are installed
+
+ 2. **Generation Errors**:
+    - Try reducing the "Max Length" parameter
+    - Adjust the temperature and top-p values
+    - Check the console for detailed error messages
+
+ 3. **Performance Issues**:
+    - The int4 model is optimized for CPU but may be slower than GPU versions
+    - Consider using a machine with more RAM for better performance
+
+ 4. **System Prompt Issues**:
+    - Ensure the system prompt is not too long (max 1000 characters)
+    - Check that the prompt follows the expected format
+
+ 5. **Build Process Issues**:
+    - Check that `download_model.py` runs successfully
+    - Verify that model files are downloaded to the `./int4` directory
+    - Ensure sufficient storage space for the model files
+
+ ## 📄 License
+
+ This project is licensed under the MIT License. The underlying model is licensed under Apache 2.0.
+
+ ## 🙏 Acknowledgments
+
+ - **Model**: [Tonic/petite-elle-L-aime-3-sft](https://huggingface.co/Tonic/petite-elle-L-aime-3-sft)
+ - **Base Model**: SmolLM3-3B by HuggingFaceTB
+ - **Training Data**: legmlai/openhermes-fr
+ - **Framework**: Gradio, Transformers, PyTorch
+ - **Layout Reference**: [Tonic/Nvidia-OpenReasoning](https://huggingface.co/spaces/Tonic/Nvidia-OpenReasoning)
+
+ ## 🔗 Links
+
+ - [Model on Hugging Face](https://huggingface.co/Tonic/petite-elle-L-aime-3-sft)
+ - [Chat Template](https://huggingface.co/Tonic/petite-elle-L-aime-3-sft/blob/main/chat_template.jinja)
+ - [Original App Reference](https://huggingface.co/spaces/Tonic/Nvidia-OpenReasoning)
+
+ ---
+
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
app.py ADDED
@@ -0,0 +1,303 @@
+ import gradio as gr
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import re
+ import json
+ from typing import List, Dict, Any, Optional
+ import logging
+ import spaces
+ import os
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Model configuration
+ MAIN_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft"  # Main repo for config and chat template
+ INT4_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft/int4"  # Int4 quantized model
+ LOCAL_MODEL_PATH = "./int4"  # Local int4 weights
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # Global variables for model and tokenizer
+ model = None
+ tokenizer = None
+
+ # Default system prompt
+ DEFAULT_SYSTEM_PROMPT = "Tu es TonicIA, un assistant francophone rigoureux et bienveillant."
+
+ # Title and description content
+ title = "# 🤖 Petite Elle L'Aime 3 - Chat Interface"
+ description = "A fine-tuned version of SmolLM3-3B optimized for French and multilingual conversations. This is the int4 quantized version for efficient CPU deployment."
+ presentation1 = """
+ ### 🎯 Features
+ - **Multilingual Support**: English, French, Italian, Portuguese, Chinese, Arabic
+ - **Int4 Quantization**: Optimized for CPU deployment with ~50% memory reduction
+ - **Interactive Chat Interface**: Real-time conversation with the model
+ - **Customizable System Prompt**: Define the assistant's personality and behavior
+ - **Thinking Mode**: Enable reasoning mode with thinking tags
+ """
+ presentation2 = """
+ ### 📋 Model Information
+ - **Base Model**: SmolLM3-3B
+ - **Parameters**: ~3B
+ - **Context Length**: 128k
+ - **Languages**: English, French, Italian, Portuguese, Chinese, Arabic
+ - **Device**: CPU optimized
+ - **Quantization**: int4
+ """
+ joinus = """
+ ### 🚀 Quick Start
+ 1. Add context in the system prompt
+ 2. Type your message
+ 3. Click generate to start chatting
+ 4. Use advanced settings for fine-tuning
+ """
+
+ def check_local_model():
+     """Check if local int4 model files exist"""
+     required_files = [
+         "config.json",
+         "pytorch_model.bin",
+         "tokenizer.json",
+         "tokenizer_config.json"
+     ]
+
+     for file in required_files:
+         file_path = os.path.join(LOCAL_MODEL_PATH, file)
+         if not os.path.exists(file_path):
+             logger.warning(f"Missing required file: {file_path}")
+             return False
+
+     logger.info("All required model files found locally")
+     return True
+
+ def load_model():
+     """Load the model and tokenizer"""
+     global model, tokenizer
+
+     try:
+         # Check if local model exists (downloaded during build)
+         if check_local_model():
+             logger.info(f"Loading tokenizer from {LOCAL_MODEL_PATH}")
+             tokenizer = AutoTokenizer.from_pretrained(LOCAL_MODEL_PATH)
+
+             logger.info(f"Loading int4 model from {LOCAL_MODEL_PATH}")
+             model = AutoModelForCausalLM.from_pretrained(
+                 LOCAL_MODEL_PATH,
+                 device_map="auto" if DEVICE == "cuda" else "cpu",
+                 torch_dtype=torch.bfloat16,
+                 trust_remote_code=True
+             )
+         else:
+             logger.info(f"Local model not found, loading from {MAIN_MODEL_ID}")
+
+             # Load tokenizer from main repo (for chat template and config)
+             tokenizer = AutoTokenizer.from_pretrained(MAIN_MODEL_ID)
+
+             logger.info(f"Loading int4 model from {INT4_MODEL_ID}")
+
+             # Load model with int4 quantization from Hugging Face
+             model = AutoModelForCausalLM.from_pretrained(
+                 INT4_MODEL_ID,
+                 device_map="auto" if DEVICE == "cuda" else "cpu",
+                 torch_dtype=torch.bfloat16,
+                 trust_remote_code=True
+             )
+
+         # Set pad token if not present
+         if tokenizer.pad_token_id is None:
+             tokenizer.pad_token_id = tokenizer.eos_token_id
+
+         logger.info("Model loaded successfully")
+         return True
+
+     except Exception as e:
+         logger.error(f"Error loading model: {e}")
+         return False
+
+ def create_prompt(system_message, user_message, enable_thinking=True):
+     """Create prompt using the model's chat template"""
+     try:
+         # Prepare messages for the template
+         formatted_messages = []
+
+         # Add system message if provided
+         if system_message and system_message.strip():
+             formatted_messages.append({"role": "system", "content": system_message})
+
+         # Add user message
+         formatted_messages.append({"role": "user", "content": user_message})
+
+         # Apply the chat template
+         prompt = tokenizer.apply_chat_template(
+             formatted_messages,
+             tokenize=False,
+             add_generation_prompt=True,
+             enable_thinking=enable_thinking
+         )
+
+         # Add /no_think to the end of prompt when thinking is disabled
+         if not enable_thinking:
+             prompt += " /no_think"
+
+         return prompt
+
+     except Exception as e:
+         logger.error(f"Error creating prompt: {e}")
+         return ""
+
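+ # On Hugging Face Spaces, the `spaces.GPU` decorator below borrows a ZeroGPU
+ # device for the duration of the call (here up to 94 seconds).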
+ @spaces.GPU(duration=94)
+ def generate_response(message, history, system_message, max_tokens, temperature, top_p, do_sample, enable_thinking=True):
+     """Generate response using the model"""
+     global model, tokenizer
+
+     if model is None or tokenizer is None:
+         return "Error: Model not loaded. Please wait for the model to load."
+
+     try:
+         # Create prompt using chat template
+         full_prompt = create_prompt(system_message, message, enable_thinking)
+
+         if not full_prompt:
+             return "Error: Failed to create prompt."
+
+         # Tokenize the input
+         inputs = tokenizer(full_prompt, return_tensors="pt", padding=True, truncation=True)
+
+         # Move to device
+         if DEVICE == "cuda":
+             inputs = {k: v.cuda() for k, v in inputs.items()}
+
+         # Generate response
+         with torch.no_grad():
+             output_ids = model.generate(
+                 inputs['input_ids'],
+                 max_new_tokens=max_tokens,
+                 temperature=temperature,
+                 top_p=top_p,
+                 do_sample=do_sample,
+                 attention_mask=inputs['attention_mask'],
+                 pad_token_id=tokenizer.eos_token_id,
+                 eos_token_id=tokenizer.eos_token_id
+             )
+
+         # Decode the response
+         response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+
+         # Extract only the new response (remove the input prompt)
+         assistant_response = response[len(full_prompt):].strip()
+
+         # Clean up the response - only remove special tokens, preserve thinking tags when enabled
+         assistant_response = re.sub(r'<\|im_start\|>.*?<\|im_end\|>', '', assistant_response, flags=re.DOTALL)
+
+         # Only remove thinking tags if thinking mode is disabled
+         if not enable_thinking:
+             assistant_response = re.sub(r'<think>.*?</think>', '', assistant_response, flags=re.DOTALL)
+
+         assistant_response = assistant_response.strip()
+
+         return assistant_response
+
+     except Exception as e:
+         logger.error(f"Error generating response: {e}")
+         return f"Error generating response: {str(e)}"
+
+ def user(user_message, history):
+     """Add user message to history"""
+     return "", history + [[user_message, None]]
+
+ def bot(history, system_prompt, max_length, temperature, top_p, advanced_checkbox, enable_thinking):
+     """Generate bot response"""
+     user_message = history[-1][0]
+     do_sample = advanced_checkbox
+     bot_message = generate_response(user_message, history, system_prompt, max_length, temperature, top_p, do_sample, enable_thinking)
+     history[-1][1] = bot_message
+     return history
+
+ # Load model on startup
+ logger.info("Starting model loading process...")
+ load_model()
+
+ # Create Gradio interface
+ with gr.Blocks() as demo:
+     with gr.Row():
+         gr.Markdown(title)
+     with gr.Row():
+         gr.Markdown(description)
+     with gr.Row():
+         with gr.Column(scale=1):
+             with gr.Group():
+                 gr.Markdown(presentation1)
+         with gr.Column(scale=1):
+             with gr.Group():
+                 gr.Markdown(presentation2)
+     with gr.Row():
+         with gr.Column(scale=1):
+             with gr.Group():
+                 gr.Markdown(joinus)
+         with gr.Column(scale=1):
+             pass  # Empty column for balance
+
+     with gr.Row():
+         with gr.Column(scale=2):
+             system_prompt = gr.TextArea(
+                 label="📑 Context",
+                 placeholder="Tu es TonicIA, un assistant francophone rigoureux et bienveillant.",
+                 lines=5,
+                 value=DEFAULT_SYSTEM_PROMPT
+             )
+             user_input = gr.TextArea(
+                 label="🤷🏻‍♂️ User Input",
+                 placeholder="Hi there my name is Tonic!",
+                 lines=2
+             )
+             advanced_checkbox = gr.Checkbox(label="🧪 Advanced Settings", value=False)
+             with gr.Column(visible=False) as advanced_settings:
+                 max_length = gr.Slider(
+                     label="📏 Max Length",
+                     minimum=64,
+                     maximum=2048,
+                     value=512,
+                     step=64
+                 )
+                 temperature = gr.Slider(
+                     label="🌡️ Temperature",
+                     minimum=0.01,
+                     maximum=1.0,
+                     value=0.7,
+                     step=0.01
+                 )
+                 top_p = gr.Slider(
+                     label="⚛️ Top-p (Nucleus Sampling)",
+                     minimum=0.1,
+                     maximum=1.0,
+                     value=0.9,
+                     step=0.01
+                 )
+                 enable_thinking = gr.Checkbox(label="Enable Thinking Mode", value=True)
+
+             generate_button = gr.Button(value="🤖 Petite Elle L'Aime 3")
+
+         with gr.Column(scale=2):
+             chatbot = gr.Chatbot(label="🤖 Petite Elle L'Aime 3")
+
+     generate_button.click(
+         user,
+         [user_input, chatbot],
+         [user_input, chatbot],
+         queue=False
+     ).then(
+         bot,
+         [chatbot, system_prompt, max_length, temperature, top_p, advanced_checkbox, enable_thinking],
+         chatbot
+     )
+
+     advanced_checkbox.change(
+         fn=lambda x: gr.update(visible=x),
+         inputs=[advanced_checkbox],
+         outputs=[advanced_settings]
+     )
+
+ if __name__ == "__main__":
+     demo.queue()
+     demo.launch(ssr_mode=False, mcp_server=True)
build.py ADDED
@@ -0,0 +1,113 @@
+ #!/usr/bin/env python3
+ """
+ Build script for Hugging Face Spaces - downloads model files at build time
+ """
+
+ import os
+ import sys
+ import subprocess
+ import logging
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ def run_download_script():
+     """Run the model download script"""
+     try:
+         logger.info("Running advanced model download script...")
+
+         # Try the advanced download script first
+         result = subprocess.run([sys.executable, "download_model_advanced.py"],
+                                 capture_output=True, text=True, check=True)
+         logger.info("Advanced model download completed successfully")
+         logger.info(result.stdout)
+         return True
+
+     except subprocess.CalledProcessError as e:
+         logger.warning(f"Advanced download failed: {e}")
+         logger.warning("Falling back to basic download script...")
+
+         try:
+             # Fall back to the basic download script
+             result = subprocess.run([sys.executable, "download_model.py"],
+                                     capture_output=True, text=True, check=True)
+             logger.info("Basic model download completed successfully")
+             logger.info(result.stdout)
+             return True
+
+         except subprocess.CalledProcessError as e2:
+             logger.error(f"Basic download also failed: {e2}")
+             logger.error(e2.stderr)
+             return False
+
+ def verify_build():
+     """Verify that the build was successful"""
+     try:
+         logger.info("Verifying build results...")
+
+         # Check if the int4 directory exists
+         if not os.path.exists("./int4"):
+             logger.error("int4 directory not found")
+             return False
+
+         # Check for essential files
+         essential_files = [
+             "config.json",
+             "pytorch_model.bin",
+             "tokenizer.json",
+             "tokenizer_config.json"
+         ]
+
+         missing_files = []
+         for file in essential_files:
+             file_path = os.path.join("./int4", file)
+             if not os.path.exists(file_path):
+                 missing_files.append(file)
+
+         if missing_files:
+             logger.error(f"Missing essential files: {missing_files}")
+             return False
+
+         # Check file sizes
+         total_size = 0
+         for file in essential_files:
+             file_path = os.path.join("./int4", file)
+             if os.path.exists(file_path):
+                 file_size = os.path.getsize(file_path)
+                 total_size += file_size
+                 logger.info(f"✅ {file}: {file_size} bytes")
+
+         logger.info(f"Total model size: {total_size / (1024*1024):.2f} MB")
+
+         if total_size < 1000000:  # Less than 1MB
+             logger.warning("Model files seem too small")
+             return False
+
+         logger.info("Build verification completed successfully")
+         return True
+
+     except Exception as e:
+         logger.error(f"Error verifying build: {e}")
+         return False
+
+ def main():
+     """Main build function"""
+     logger.info("Starting Hugging Face Space build process...")
+
+     # Run the model download script
+     if run_download_script():
+         # Verify the build
+         if verify_build():
+             logger.info("Build process completed successfully")
+             return True
+         else:
+             logger.error("Build verification failed")
+             return False
+     else:
+         logger.error("Model download failed")
+         return False
+
+ if __name__ == "__main__":
+     success = main()
+     sys.exit(0 if success else 1)
config.yaml ADDED
@@ -0,0 +1,51 @@
+ # Configuration file for Petite Elle L'Aime 3 Gradio Application
+
+ # Model Configuration
+ model:
+   main_repo: "Tonic/petite-elle-L-aime-3-sft"  # Main repo for config and chat template
+   int4_repo: "Tonic/petite-elle-L-aime-3-sft/int4"  # Int4 quantized model from HF
+   device: "auto"  # "cuda", "cpu", or "auto"
+   torch_dtype: "bfloat16"
+   trust_remote_code: true
+
+ # System Prompt Configuration
+ system_prompt:
+   default: "Tu es TonicIA, un assistant francophone rigoureux et bienveillant."
+   editable: true
+   max_length: 1000
+
+ # Generation Parameters (defaults)
+ generation:
+   max_new_tokens: 512
+   temperature: 0.7
+   top_p: 0.9
+   top_k: 50
+   repetition_penalty: 1.1
+   do_sample: true
+
+ # Chat Configuration
+ chat:
+   enable_thinking: true
+   max_history_length: 50
+
+ # UI Configuration
+ ui:
+   title: "Petite Elle L'Aime 3 - Chat Interface"
+   theme: "soft"
+   server_port: 7860
+   server_name: "0.0.0.0"
+   share: false
+   show_error: true
+   layout: "responsive"
+
+ # Logging Configuration
+ logging:
+   level: "INFO"
+   format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+
+ # Hardware Requirements
+ hardware:
+   min_ram: "8GB"
+   recommended_ram: "16GB"
+   gpu_optional: true
+   cpu_optimized: true
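+
+ # Note: deploy.py reads this file with yaml.safe_load; command-line flags
+ # (e.g. --port, --host) override the ui values at startup.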
deploy.py ADDED
@@ -0,0 +1,181 @@
+ #!/usr/bin/env python3
+ """
+ Deployment script for Petite Elle L'Aime 3 Gradio Application
+ """
+
+ import os
+ import sys
+ import subprocess
+ import argparse
+ import yaml
+ from pathlib import Path
+
+ def load_config():
+     """Load configuration from config.yaml"""
+     config_path = Path("config.yaml")
+     if config_path.exists():
+         with open(config_path, 'r') as f:
+             return yaml.safe_load(f)
+     return {}
+
+ def check_dependencies():
+     """Check if all dependencies are installed"""
+     print("🔍 Checking dependencies...")
+
+     required_packages = [
+         'gradio',
+         'torch',
+         'transformers',
+         'accelerate'
+     ]
+
+     missing_packages = []
+     for package in required_packages:
+         try:
+             __import__(package)
+             print(f"✅ {package}")
+         except ImportError:
+             missing_packages.append(package)
+             print(f"❌ {package}")
+
+     if missing_packages:
+         print(f"\n⚠️ Missing packages: {', '.join(missing_packages)}")
+         print("Run: pip install -r requirements.txt")
+         return False
+
+     return True
+
+ def check_hardware():
+     """Check hardware requirements"""
+     print("\n🔍 Checking hardware...")
+
+     import psutil
+
+     # Check RAM
+     ram_gb = psutil.virtual_memory().total / (1024**3)
+     print(f"RAM: {ram_gb:.1f} GB")
+
+     if ram_gb < 8:
+         print("⚠️ Warning: Less than 8GB RAM detected")
+         print("   The application may run slowly or fail to load the model")
+     else:
+         print("✅ RAM requirements met")
+
+     # Check GPU
+     try:
+         import torch
+         if torch.cuda.is_available():
+             gpu_name = torch.cuda.get_device_name(0)
+             gpu_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
+             print(f"GPU: {gpu_name} ({gpu_memory:.1f} GB)")
+         else:
+             print("GPU: Not available (will use CPU)")
+     except Exception:
+         print("GPU: Unable to detect")
+
+     return True
+
+ def install_dependencies():
+     """Install dependencies from requirements.txt"""
+     print("\n📦 Installing dependencies...")
+
+     if not os.path.exists("requirements.txt"):
+         print("❌ requirements.txt not found")
+         return False
+
+     try:
+         subprocess.run([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
+                        check=True, capture_output=True, text=True)
+         print("✅ Dependencies installed successfully")
+         return True
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Failed to install dependencies: {e}")
+         return False
+
+ def run_tests():
+     """Run the test suite"""
+     print("\n🧪 Running tests...")
+
+     if not os.path.exists("test_app.py"):
+         print("❌ test_app.py not found")
+         return False
+
+     try:
+         result = subprocess.run([sys.executable, "test_app.py"],
+                                 capture_output=True, text=True)
+         print(result.stdout)
+         if result.stderr:
+             print(result.stderr)
+         return result.returncode == 0
+     except Exception as e:
+         print(f"❌ Failed to run tests: {e}")
+         return False
+
+ def start_application(port=None, host=None):
+     """Start the Gradio application"""
+     print("\n🚀 Starting application...")
+
+     config = load_config()
+     ui_config = config.get('ui', {})
+
+     # Use provided arguments or config defaults
+     port = port or ui_config.get('server_port', 7860)
+     host = host or ui_config.get('server_name', '0.0.0.0')
+
+     print(f"🌐 Application will be available at: http://{host}:{port}")
+     print("🛑 Press Ctrl+C to stop the application")
+
+     try:
+         subprocess.run([sys.executable, "app.py"], check=True)
+     except KeyboardInterrupt:
+         print("\n👋 Application stopped by user")
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Failed to start application: {e}")
+         return False
+
+     return True
+
+ def main():
+     """Main deployment function"""
+     parser = argparse.ArgumentParser(description="Deploy Petite Elle L'Aime 3 Gradio Application")
+     parser.add_argument("--install", action="store_true", help="Install dependencies")
+     parser.add_argument("--test", action="store_true", help="Run tests")
+     parser.add_argument("--check", action="store_true", help="Check system requirements")
+     parser.add_argument("--port", type=int, help="Port to run the application on")
+     parser.add_argument("--host", type=str, help="Host to bind the application to")
+     parser.add_argument("--start", action="store_true", help="Start the application")
+
+     args = parser.parse_args()
+
+     print("🤖 Petite Elle L'Aime 3 - Deployment Script\n")
+
+     # If no arguments provided, run the full deployment
+     if not any([args.install, args.test, args.check, args.start]):
+         args.install = True
+         args.test = True
+         args.check = True
+         args.start = True
+
+     success = True
+
+     if args.install:
+         success &= install_dependencies()
+
+     if args.check:
+         success &= check_dependencies()
+         success &= check_hardware()
+
+     if args.test:
+         success &= run_tests()
+
+     if args.start and success:
+         start_application(args.port, args.host)
+
+     if not success:
+         print("\n❌ Deployment failed. Please fix the issues above.")
+         sys.exit(1)
+     else:
+         print("\n✅ Deployment completed successfully!")
+
+ if __name__ == "__main__":
+     main()
download_model.py ADDED
@@ -0,0 +1,118 @@
+ #!/usr/bin/env python3
+ """
+ Helper script to download the int4 model files at build time for Hugging Face Spaces
+ """
+
+ import os
+ import sys
+ import subprocess
+ import logging
+ from pathlib import Path
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Model configuration
+ MAIN_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft"
+ INT4_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft/int4"
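+ # Hub repo ids are usually "owner/name"; if "int4" is a subfolder of the main repo
+ # rather than its own repo, snapshot_download may need repo_id=MAIN_MODEL_ID with
+ # allow_patterns=["int4/*"] instead (assumption, not verified against the Hub).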
+ LOCAL_MODEL_PATH = "./int4"
+
+ def download_model():
+     """Download the int4 model files to local directory"""
+     try:
+         logger.info(f"Downloading int4 model from {INT4_MODEL_ID}")
+
+         # Create local directory if it doesn't exist
+         os.makedirs(LOCAL_MODEL_PATH, exist_ok=True)
+
+         # Use huggingface_hub to download the model
+         from huggingface_hub import snapshot_download
+
+         # Download the int4 model files
+         snapshot_download(
+             repo_id=INT4_MODEL_ID,
+             local_dir=LOCAL_MODEL_PATH,
+             local_dir_use_symlinks=False,
+             ignore_patterns=["*.md", "*.txt", "*.git*", "*.ipynb", "*.py"]
+         )
+
+         logger.info(f"Model downloaded successfully to {LOCAL_MODEL_PATH}")
+         return True
+
+     except Exception as e:
+         logger.error(f"Error downloading model: {e}")
+         return False
+
+ def check_model_files():
+     """Check if required model files exist"""
+     required_files = [
+         "config.json",
+         "pytorch_model.bin",
+         "tokenizer.json",
+         "tokenizer_config.json"
+     ]
+
+     missing_files = []
+     for file in required_files:
+         file_path = os.path.join(LOCAL_MODEL_PATH, file)
+         if not os.path.exists(file_path):
+             missing_files.append(file)
+
+     if missing_files:
+         logger.error(f"Missing model files: {missing_files}")
+         return False
+
+     logger.info("All required model files found")
+     return True
+
+ def verify_model_integrity():
+     """Verify that the downloaded model files are valid"""
+     try:
+         # Try to load the tokenizer to verify it's working
+         from transformers import AutoTokenizer
+         tokenizer = AutoTokenizer.from_pretrained(LOCAL_MODEL_PATH)
+         logger.info("Tokenizer loaded successfully from local files")
+
+         # Try to load the model config
+         from transformers import AutoConfig
+         config = AutoConfig.from_pretrained(LOCAL_MODEL_PATH)
+         logger.info("Model config loaded successfully from local files")
+
+         return True
+
+     except Exception as e:
+         logger.error(f"Error verifying model integrity: {e}")
+         return False
+
+ def main():
+     """Main function to download model at build time"""
+     logger.info("Starting model download for Hugging Face Space...")
+
+     # Check if model files already exist
+     if check_model_files():
+         logger.info("Model files already exist, verifying integrity...")
+         if verify_model_integrity():
+             logger.info("Model files verified successfully")
+             return True
+         else:
+             logger.warning("Model files exist but failed integrity check, re-downloading...")
+
+     # Download the model
+     if download_model():
+         logger.info("Model download completed successfully")
+
+         # Verify the downloaded files
+         if check_model_files() and verify_model_integrity():
+             logger.info("Model download and verification completed successfully")
+             return True
+         else:
+             logger.error("Model download completed but verification failed")
+             return False
+     else:
+         logger.error("Model download failed")
+         return False
+
+ if __name__ == "__main__":
+     success = main()
+     sys.exit(0 if success else 1)
download_model_advanced.py ADDED
@@ -0,0 +1,206 @@
+ #!/usr/bin/env python3
+ """
+ Advanced helper script to download the int4 model files using HfFileSystem
+ """
+
+ import os
+ import sys
+ import logging
+ from pathlib import Path
+ from tqdm import tqdm
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Model configuration
+ MAIN_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft"
+ INT4_MODEL_ID = "Tonic/petite-elle-L-aime-3-sft/int4"
+ LOCAL_MODEL_PATH = "./int4"
+
+ def get_file_info(fs, repo_path):
+     """Get detailed information about files in the repository"""
+     try:
+         files = fs.ls(repo_path, detail=True)
+         return [f for f in files if f['type'] == 'file']
+     except Exception as e:
+         logger.error(f"Error listing files in {repo_path}: {e}")
+         return []
+
+ def download_with_progress(fs, remote_path, local_path, file_size):
+     """Download a file with progress bar"""
+     try:
+         # Create directory if it doesn't exist
+         os.makedirs(os.path.dirname(local_path), exist_ok=True)
+
+         # Download with progress bar
+         with tqdm(total=file_size, unit='B', unit_scale=True, desc=os.path.basename(local_path)) as pbar:
+             with fs.open(remote_path, 'rb') as remote_file:
+                 with open(local_path, 'wb') as local_file:
+                     chunk_size = 8192
+                     while True:
+                         chunk = remote_file.read(chunk_size)
+                         if not chunk:
+                             break
+                         local_file.write(chunk)
+                         pbar.update(len(chunk))
+
+         return True
+     except Exception as e:
+         logger.error(f"Error downloading {remote_path}: {e}")
+         return False
+
+ def download_model_advanced():
+     """Download the int4 model files using advanced HfFileSystem features"""
+     try:
+         logger.info(f"Downloading int4 model from {INT4_MODEL_ID}")
+
+         # Create local directory if it doesn't exist
+         os.makedirs(LOCAL_MODEL_PATH, exist_ok=True)
+
+         # Use HfFileSystem for downloading
+         from huggingface_hub import HfFileSystem
+
+         # Initialize the file system
+         fs = HfFileSystem()
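+         # HfFileSystem exposes the Hub as a filesystem, so paths of the form
+         # "owner/repo/subfolder" resolve to files inside a repo - which is why
+         # this advanced variant can address the int4/ subfolder directly.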
+
+         # Check if repository exists
+         if not fs.exists(INT4_MODEL_ID):
+             logger.error(f"Repository {INT4_MODEL_ID} does not exist")
+             return False
+
+         # Get file information
+         files = get_file_info(fs, INT4_MODEL_ID)
+         if not files:
+             logger.error("No files found in repository")
+             return False
+
+         # Filter essential model files
+         essential_files = [
+             'config.json',
+             'pytorch_model.bin',
+             'tokenizer.json',
+             'tokenizer_config.json',
+             'special_tokens_map.json',
+             'generation_config.json'
+         ]
+
+         files_to_download = []
+         for file_info in files:
+             file_name = os.path.basename(file_info['name'])
+             if file_name in essential_files:
+                 files_to_download.append(file_info)
+
+         logger.info(f"Found {len(files_to_download)} essential files to download")
+
+         # Download each file
+         successful_downloads = 0
+         for file_info in files_to_download:
+             file_path = file_info['name']
+             file_name = os.path.basename(file_path)
+             local_file_path = os.path.join(LOCAL_MODEL_PATH, file_name)
+             file_size = file_info.get('size', 0)
+
+             logger.info(f"Downloading {file_name} ({file_size} bytes)...")
+
+             # Download the file with progress
+             if download_with_progress(fs, file_path, local_file_path, file_size):
+                 successful_downloads += 1
+                 logger.info(f"Successfully downloaded {file_name}")
+             else:
+                 logger.error(f"Failed to download {file_name}")
+
+         logger.info(f"Downloaded {successful_downloads}/{len(files_to_download)} files")
+         return successful_downloads == len(files_to_download)
+
+     except Exception as e:
+         logger.error(f"Error downloading model: {e}")
+         return False
+
+ def verify_download_advanced():
+     """Advanced verification of downloaded model files"""
+     try:
+         logger.info("Verifying downloaded model files...")
+
+         # Expected file sizes (approximate)
+         expected_files = {
+             "config.json": (1000, 10000),  # (min_size, max_size) in bytes
+             "pytorch_model.bin": (1000000, 5000000000),  # Should be several MB
+             "tokenizer.json": (10000, 1000000),  # Should be several KB
+             "tokenizer_config.json": (100, 10000),  # Minimum size
+             "special_tokens_map.json": (100, 10000),
+             "generation_config.json": (100, 10000)
+         }
+
+         verification_results = []
+
+         for file_name, (min_size, max_size) in expected_files.items():
+             file_path = os.path.join(LOCAL_MODEL_PATH, file_name)
+             if os.path.exists(file_path):
+                 actual_size = os.path.getsize(file_path)
+                 if min_size <= actual_size <= max_size:
+                     logger.info(f"✅ {file_name} verified ({actual_size} bytes)")
+                     verification_results.append(True)
+                 else:
+                     logger.warning(f"⚠️ {file_name} size unexpected ({actual_size} bytes)")
+                     verification_results.append(False)
+             else:
+                 logger.error(f"❌ Missing {file_name}")
+                 verification_results.append(False)
+
+         success_rate = sum(verification_results) / len(verification_results)
+         logger.info(f"Verification complete: {sum(verification_results)}/{len(verification_results)} files valid")
+
+         return success_rate >= 0.8  # Allow 20% tolerance
+
+     except Exception as e:
+         logger.error(f"Error verifying files: {e}")
+         return False
+
+ def check_model_files():
+     """Check if required model files exist"""
+     required_files = [
+         "config.json",
+         "pytorch_model.bin",
+         "tokenizer.json",
+         "tokenizer_config.json"
+     ]
+
+     missing_files = []
+     for file in required_files:
+         file_path = os.path.join(LOCAL_MODEL_PATH, file)
+         if not os.path.exists(file_path):
+             missing_files.append(file)
+
+     if missing_files:
+         logger.error(f"Missing model files: {missing_files}")
+         return False
+
+     logger.info("All required model files found")
+     return True
+
+ def main():
+     """Main function to download model at build time"""
+     logger.info("Starting advanced model download for Hugging Face Space...")
+
+     # Check if model files already exist
+     if check_model_files():
+         logger.info("Model files already exist, skipping download")
+         return True
+
+     # Download the model using advanced method
+     if download_model_advanced():
+         # Verify the download
+         if verify_download_advanced():
+             logger.info("Model download and verification completed successfully")
+             return True
+         else:
+             logger.error("Model verification failed")
+             return False
+     else:
+         logger.error("Model download failed")
+         return False
+
+ if __name__ == "__main__":
+     success = main()
+     sys.exit(0 if success else 1)
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ gradio>=5.38.2
+ torch>=2.0.0
+ transformers>=4.54.0
+ accelerate>=0.20.0
+ torchao>=0.1.0
+ safetensors>=0.4.0
+ tokenizers>=0.21.2
+ pyyaml>=6.0
+ psutil>=5.9.0
+ huggingface_hub>=0.20.0
+ tqdm>=4.64.0
test_app.py ADDED
@@ -0,0 +1,212 @@
+ #!/usr/bin/env python3
+ """
+ Test script for the Petite Elle L'Aime 3 Gradio application
+ """
+
+ import sys
+ import os
+ import importlib.util
+
+ def test_imports():
+     """Test if all required packages can be imported"""
+     required_packages = [
+         'gradio',
+         'torch',
+         'transformers',
+         'accelerate',
+         'safetensors',
+         'tokenizers'
+     ]
+
+     print("Testing imports...")
+     for package in required_packages:
+         try:
+             __import__(package)
+             print(f"✅ {package} imported successfully")
+         except ImportError as e:
+             print(f"❌ Failed to import {package}: {e}")
+             return False
+
+     return True
+
+ def test_app_structure():
+     """Test if the app.py file has the correct structure"""
+     print("\nTesting app.py structure...")
+
+     if not os.path.exists('app.py'):
+         print("❌ app.py not found")
+         return False
+
+     try:
+         # Import the app module
+         spec = importlib.util.spec_from_file_location("app", "app.py")
+         app_module = importlib.util.module_from_spec(spec)
+         spec.loader.exec_module(app_module)
+
+         # Check for required functions
+         required_functions = [
+             'load_model',
+             'create_prompt',
+             'generate_response',
+             'user',
+             'bot'
+         ]
+
+         for func_name in required_functions:
+             if hasattr(app_module, func_name):
+                 print(f"✅ {func_name} function found")
+             else:
+                 print(f"❌ {func_name} function not found")
+                 return False
+
+         # Check for required variables
+         required_variables = [
+             'DEFAULT_SYSTEM_PROMPT',
+             'title',
+             'description',
+             'presentation1',
+             'presentation2',
+             'joinus'
+         ]
+
+         for var_name in required_variables:
+             if hasattr(app_module, var_name):
+                 print(f"✅ {var_name} variable found")
+             else:
+                 print(f"❌ {var_name} variable not found")
+                 return False
+
+         print("✅ All required functions and variables found")
+         return True
+
+     except Exception as e:
+         print(f"❌ Error testing app.py: {e}")
+         return False
+
+ def test_requirements():
+     """Test if requirements.txt exists and has required packages"""
+     print("\nTesting requirements.txt...")
+
+     if not os.path.exists('requirements.txt'):
+         print("❌ requirements.txt not found")
+         return False
+
+     required_packages = [
+         'gradio',
+         'torch',
+         'transformers',
+         'accelerate'
+     ]
+
+     with open('requirements.txt', 'r') as f:
+         content = f.read()
+
+     for package in required_packages:
+         if package in content:
+             print(f"✅ {package} found in requirements.txt")
+         else:
+             print(f"❌ {package} not found in requirements.txt")
+             return False
+
+     return True
+
+ def test_config():
+     """Test if config.yaml exists and has required sections"""
+     print("\nTesting config.yaml...")
+
+     if not os.path.exists('config.yaml'):
+         print("❌ config.yaml not found")
+         return False
+
+     try:
+         import yaml
+         with open('config.yaml', 'r') as f:
+             config = yaml.safe_load(f)
+
+         required_sections = [
+             'model',
+             'system_prompt',
+             'generation',
+             'chat',
+             'ui'
+         ]
+
+         for section in required_sections:
+             if section in config:
+                 print(f"✅ {section} section found in config.yaml")
+             else:
+                 print(f"❌ {section} section not found in config.yaml")
+                 return False
+
+         # Check system prompt default
+         if 'system_prompt' in config and 'default' in config['system_prompt']:
+             print(f"✅ System prompt default found: {config['system_prompt']['default']}")
+         else:
+             print("❌ System prompt default not found")
+             return False
+
+         return True
+
+     except Exception as e:
+         print(f"❌ Error testing config.yaml: {e}")
+         return False
+
+ def test_readme():
+     """Test if README.md has the correct structure"""
+     print("\nTesting README.md...")
+
+     if not os.path.exists('README.md'):
+         print("❌ README.md not found")
+         return False
+
+     with open('README.md', 'r') as f:
+         content = f.read()
+
+     required_sections = [
+         'Petite Elle L\'Aime 3',
+         'Features',
+         'Installation',
+         'Usage'
+     ]
+
+     for section in required_sections:
+         if section in content:
+             print(f"✅ {section} section found")
+         else:
+             print(f"❌ {section} section not found")
+             return False
+
+     return True
+
+ def main():
+     """Run all tests"""
+     print("🧪 Testing Petite Elle L'Aime 3 Gradio Application\n")
+
+     tests = [
+         test_imports,
+         test_app_structure,
+         test_requirements,
+         test_config,
+         test_readme
+     ]
+
+     passed = 0
+     total = len(tests)
+
+     for test in tests:
+         if test():
+             passed += 1
+         print()
+
+     print(f"📊 Test Results: {passed}/{total} tests passed")
+
+     if passed == total:
+         print("🎉 All tests passed! The application is ready to run.")
+         print("\nTo run the application:")
+         print("python app.py")
+     else:
+         print("❌ Some tests failed. Please fix the issues before running the application.")
+         sys.exit(1)
+
+ if __name__ == "__main__":
+     main()
test_model_loading.py ADDED
@@ -0,0 +1,81 @@
+ #!/usr/bin/env python3
+ """
+ Test script to verify model loading functionality
+ """
+
+ import os
+ import sys
+ import logging
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ def test_model_loading():
+     """Test the model loading functionality"""
+     try:
+         logger.info("Testing model loading...")
+
+         # Import the app module to test model loading
+         from app import load_model, check_local_model
+
+         # Check if local model exists
+         has_local = check_local_model()
+         logger.info(f"Local model available: {has_local}")
+
+         # Try to load the model
+         success = load_model()
+         logger.info(f"Model loading successful: {success}")
+
+         return success
+
+     except Exception as e:
+         logger.error(f"Error testing model loading: {e}")
+         return False
+
+ def test_download_script():
+     """Test the download script"""
+     try:
+         logger.info("Testing download script...")
+
+         # Import and run the download script
+         from download_model import main as download_main
+
+         success = download_main()
+         logger.info(f"Download script successful: {success}")
+
+         return success
+
+     except Exception as e:
+         logger.error(f"Error testing download script: {e}")
+         return False
+
+ def main():
+     """Main test function"""
+     logger.info("Starting model loading tests...")
+
+     # Test the download script first
+     logger.info("=== Testing Download Script ===")
+     download_success = test_download_script()
+
+     # Test model loading
+     logger.info("=== Testing Model Loading ===")
+     loading_success = test_model_loading()
+
+     # Summary
+     logger.info("=== Test Summary ===")
+     logger.info(f"Download script: {'PASS' if download_success else 'FAIL'}")
+     logger.info(f"Model loading: {'PASS' if loading_success else 'FAIL'}")
+
+     overall_success = download_success and loading_success
+
+     if overall_success:
+         logger.info("All tests passed! Model is ready for deployment.")
+     else:
+         logger.error("Some tests failed. Please check the logs above.")
+
+     return overall_success
+
+ if __name__ == "__main__":
+     success = main()
+     sys.exit(0 if success else 1)