Spaces:

ceoavinash
/

codearena-rl

Sleeping

App Files Files Community

havinashpatil commited on 16 days ago

Commit

271cc02

1 Parent(s): 402970c

Add AI coding system with local Hugging Face LLM integration

Browse files

Files changed (4) hide show

README.md +54 -0
ai_fix.bat +5 -0
ai_fix.py +91 -0
requirements.txt +2 -0

README.md CHANGED Viewed

@@ -119,6 +119,60 @@ CodeArena is infrastructure. Plug any model in. Run it. Get a number.
    python create_tasks.py
    ```
 ## Usage
 ### 0. Training with TRL (Colab)

    python create_tasks.py
    ```
+## AI Coding System (Local Hugging Face LLM)
+CodeArena now includes a built-in AI code fixer using Hugging Face Transformers for local, offline code repair.
+### Features
+- **Local LLM**: No API keys or internet required
+- **Fast Fixes**: Uses TinyLlama-1.1B for quick code corrections
+- **Command Line**: Simple stdin/stdout interface
+- **Optimized Prompts**: Engineered for code repair tasks
+### Setup
+1. **Install Dependencies:**
+   ```bash
+   pip install accelerate bitsandbytes  # Added to requirements.txt
+   ```
+2. **First Run (Model Download):**
+   ```bash
+   python ai_fix.py < any_code.py
+   ```
+   This will download the model (~600MB) on first use.
+### Usage
+**Fix a Python file:**
+```bash
+cat buggy_code.py | python ai_fix.py
+```
+**Interactive fixing:**
+```bash
+# Windows
+type buggy_code.py | ai_fix.bat
+# Linux/Mac
+cat buggy_code.py | python ai_fix.py
+```
+**Example:**
+```bash
+echo "def hello()
+    print('world')" | python ai_fix.py
+# Output: def hello():
+#             print('world')
+```
+### Model Options
+- **Default**: `TinyLlama/TinyLlama-1.1B-Chat-v1.0` (fast, lightweight)
+- **Change model**: Edit `MODEL_NAME` in `ai_fix.py`
+### Performance
+- **CPU**: ~10-30 seconds per fix
+- **GPU**: ~2-5 seconds per fix
+- **Memory**: ~2GB RAM minimum
 ## Usage
 ### 0. Training with TRL (Colab)

ai_fix.bat ADDED Viewed

	@@ -0,0 +1,5 @@

+@echo off
+REM AI Code Fixer Batch Script
+REM Usage: type code.py | ai_fix.bat
+python ai_fix.py

ai_fix.py ADDED Viewed

	@@ -0,0 +1,91 @@

+#!/usr/bin/env python3
+"""
+AI Code Fixer using Hugging Face Transformers
+Reads code from stdin, fixes it using TinyLlama, outputs fixed code.
+"""
+import sys
+import os
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+# Model configuration
+MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
+def load_model():
+    """Load the model and tokenizer."""
+    print("Loading model...", file=sys.stderr)
+    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+    # Try to use GPU if available, fallback to CPU
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    print(f"Using device: {device}", file=sys.stderr)
+    model = AutoModelForCausalLM.from_pretrained(
+        MODEL_NAME,
+        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
+        device_map="auto" if device == "cuda" else None,
+        low_cpu_mem_usage=True
+    )
+    if device == "cpu":
+        model = model.to(device)
+    return model, tokenizer
+def generate_fix(model, tokenizer, code):
+    """Generate fixed code using the model."""
+    prompt = f"""You are an expert competitive programmer.
+Fix the following Python code:
+- Remove syntax errors
+- Ensure correct logic
+- Optimize to O(n) if possible
+Code:
+{code}
+Return ONLY corrected code.
+"""
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    with torch.no_grad():
+        output = model.generate(
+            **inputs,
+            max_new_tokens=500,
+            temperature=0.3,  # Lower temperature for more deterministic fixes
+            do_sample=True,
+            top_p=0.9,
+            pad_token_id=tokenizer.eos_token_id
+        )
+    # Decode and extract only the code part
+    full_output = tokenizer.decode(output[0], skip_special_tokens=True)
+    # Try to extract just the code after the prompt
+    if "Return ONLY corrected code." in full_output:
+        code_part = full_output.split("Return ONLY corrected code.")[-1].strip()
+    else:
+        code_part = full_output.replace(prompt, "").strip()
+    return code_part
+def main():
+    # Read code from stdin
+    code = sys.stdin.read().strip()
+    if not code:
+        print("No code provided", file=sys.stderr)
+        sys.exit(1)
+    try:
+        model, tokenizer = load_model()
+        fixed_code = generate_fix(model, tokenizer, code)
+        print(fixed_code)
+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        sys.exit(1)
+if __name__ == "__main__":
+    main()

requirements.txt CHANGED Viewed

@@ -9,3 +9,5 @@ transformers
 torch
 datasets
 trl

 torch
 datasets
 trl
+accelerate
+bitsandbytes