🐞 NeuraDebugger-Micro-1.1B

NeuraDebugger-Micro-1.1B is an open-source, ultra‑lightweight debugging‑specialized model developed by the Neuracoder team (a leading Iranian AI company). With an optimized architecture and only 1.1 billion parameters, it is designed for fast, accurate, and local debugging – helping programmers identify bugs, understand root causes, suggest fixes, and even repair code automatically.

Unlike general code generation models, which often introduce new bugs, NeuraDebugger-Micro focuses exclusively on finding and fixing errors in existing code. It understands exception traces, logical flaws, edge cases, and common pitfalls across 12 programming languages. Thanks to its small size, it runs on laptops, CPU‑only machines, and even a Raspberry Pi, giving every developer an expert debugger at their fingertips.


✨ Key Features (Detailed)

  • Ultra‑lightweight debugging – Only 1.1B parameters, ~0.9 GB (INT8) / ~2.2 GB (FP16). Runs on 4 GB RAM devices.
  • Root‑cause analysis – Doesn't just say "there is a bug"; explains why it happens (e.g., null pointer, off‑by‑one, race condition).
  • Fix suggestion + code repair – Provides corrected code snippets and explains the changes.
  • Supports 12 languages – Python, JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, PHP, Ruby, Shell.
  • Exception trace understanding – Feed it a stack trace + code; it pinpoints the exact line and fix.
  • Edge case detection – Finds missing input validations, empty collections, boundary failures.
  • No internet, no API key – Fully offline after download.
  • Iranian‑made, Apache 2.0 – Free for commercial and personal use.

🎯 Suitable Use Cases (Real Scenarios)

  • Fix runtime errors – Given a traceback (e.g., AttributeError: 'NoneType'), get the fix.
  • Review code for hidden bugs – Ask "Find logical errors in this sorting function".
  • Improve exception handling – "Add proper try/except to this file reader."
  • Security bug detection – Finds SQL injection, unsafe eval(), missing sanitization.
  • Test failure debugging – Input a failing test and the code; output the fix.
  • Refactoring risky code – "Rewrite this recursive function to avoid stack overflow."
  • Learning tool – Explain why a common bug occurs (e.g., mutable default arguments in Python).
  • CI/CD integration – Automatically scan pull requests for common mistakes.

❌ Not suitable for:

  • Whole‑project debugging (>500 lines or multi‑file dependencies)
  • Low‑level kernel or driver debugging
  • Non‑code questions (history, medicine, etc.)
  • Debugging proprietary binary blobs or obfuscated code

📊 Benchmarks & Comprehensive Evaluation

We evaluated NeuraDebugger-Micro on three specialised debugging datasets:

  1. Defects4J (Java) – 835 real bugs from 17 real‑world projects (Apache Commons, JFreeChart, etc.).
  2. BugsInPy (Python) – 300 real bugs from popular Python libraries.
  3. Neuracoder‑DebugSet – 1,200 synthetic and real bug‑fix pairs across 8 languages (internal).

Results (temperature=0.2, greedy decoding)

Dataset             | Metric                       | Value
Defects4J           | Exact fix suggestion (patch) | 27.3%
Defects4J           | Root cause correct           | 51.6%
BugsInPy            | Exact fix suggestion         | 34.8%
BugsInPy            | Root cause correct           | 58.2%
Neuracoder‑DebugSet | Fix accuracy (all langs)     | 44.5%
Neuracoder‑DebugSet | Explanation helpful (human)  | 71.3%

Interpretation: The model identifies the correct root cause for a little over half of the bugs (51.6% on Defects4J, 58.2% on BugsInPy) and suggests an exact fix for roughly a quarter to a third of them (27.3% and 34.8%). The team states that this is comparable to larger debugging models such as CodeT5+ 2B, which has nearly twice as many parameters.


📈 Comparison with Similar‑Sized Models

Model                          | Params | Defects4J fix suggestion      | VRAM (FP16) | Speed (tok/s, T4) | License
NeuraDebugger-Micro-1.1B       | 1.1B   | 27.3%                         | ~2.2 GB     | 58                | Apache 2.0
CodeT5+ (base)                 | 0.7B   | 22.1%                         | ~1.4 GB     | 72                | Apache 2.0
Phi‑1.5 (general code)         | 1.3B   | 12.8% (not debug‑tuned)       | ~2.6 GB     | 62                | MIT
StarCoder‑1B                   | 1.0B   | 9.4% (no debug fine‑tuning)   | ~2.0 GB     | 70                | Apache 2.0
DeepSeek‑Coder‑1.3B (instruct) | 1.3B   | 23.5% (mixed coding+debug)    | ~2.7 GB     | 55                | MIT

Key points: NeuraDebugger‑Micro outperforms general code models on debugging by a large margin and is competitive with or better than similarly sized dedicated debuggers – while being developed fully in Iran and permissively licensed.


🧪 Technical Details of Training Process

Built on a LLaMA‑like architecture with custom modifications for debugging awareness.

1. Pre‑training

  • Data: The Stack (code only), filtered for high‑quality bug‑free code.
  • Tokens: 28 billion tokens.
  • Time: 10 days on 4× NVIDIA A100 (80GB) using DeepSpeed.
  • Hyperparameters:
    Optimizer: AdamW (lr=3e-4), cosine decay, warmup 2000 steps, batch size 256, seq len 2048.
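
The schedule described by these hyperparameters can be sketched as follows. This is a minimal illustration, not the team's training code: it assumes linear warmup, decay to a floor of zero, a single pass over the 28 billion tokens, and that the batch size of 256 counts sequences. None of these details are stated in the card.

```python
import math

PEAK_LR = 3e-4        # from the card
WARMUP_STEPS = 2000   # from the card
BATCH_SIZE = 256      # from the card; assumed to count sequences
SEQ_LEN = 2048        # from the card
TOTAL_TOKENS = 28e9   # from the card

# Assumption: one pass over the data with every sequence fully packed.
TOTAL_STEPS = math.ceil(TOTAL_TOKENS / (BATCH_SIZE * SEQ_LEN))


def lr_at(step: int, min_lr: float = 0.0) -> float:
    """Linear warmup, then cosine decay down to min_lr (assumed to be 0)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("optimizer steps:", TOTAL_STEPS)
    for s in (0, 1000, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(s, f"{lr_at(s):.2e}")
```

Under these assumptions one optimizer step consumes 256 × 2048 = 524,288 tokens, so the run is about 53,000 steps and warmup covers under 4% of it.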

2. Debug Instruction Fine‑tuning

  • Data: 180,000 (buggy code, error description, fix + explanation) triples:
    • 80,000 from real bug databases (Defects4J, BugsInPy)
    • 60,000 from synthetic bugs introduced by Neuracoder
    • 40,000 from Stack Overflow posts (rewritten as instruction pairs)
  • Format:
    ### Buggy code\n{code}\n### Error / symptom\n{error}\n### Root cause\n{cause}\n### Fixed code\n{fix}
    (During inference, the model can generate cause and fix from buggy code+error.)
  • Hyperparameters:
    Learning rate 1e-5, 3 epochs, LoRA (rank=32), batch size 64.
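
As a rough sketch, the template above could be used at inference time by filling in the first two sections and leaving "### Root cause" open for the model to complete. The helper names below are hypothetical, and the card does not confirm that the released checkpoint requires this exact template (the usage examples use free-form prompts), so treat it as one option to try.

```python
def build_inference_prompt(code: str, error: str) -> str:
    """Fill the first two template sections and leave '### Root cause' open."""
    return (
        f"### Buggy code\n{code.strip(chr(10))}\n"
        f"### Error / symptom\n{error.strip()}\n"
        f"### Root cause\n"
    )


def parse_completion(completion: str) -> dict:
    """Split generated text into the root cause and the fixed code."""
    cause, sep, fix = completion.partition("### Fixed code")
    return {
        "root_cause": cause.strip(),
        # None when the model never emitted the '### Fixed code' header.
        "fixed_code": fix.strip() if sep else None,
    }


if __name__ == "__main__":
    prompt = build_inference_prompt("x = 1 / 0", "ZeroDivisionError: division by zero")
    print(prompt)
    # Stand-in for model output, written by hand for illustration only.
    fake = "The divisor is zero.\n### Fixed code\nx = 0\n"
    print(parse_completion(fake))
```

Parsing on the section header rather than on line positions keeps the helper tolerant of multi-line explanations.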

3. Validation

  • The model was evaluated every 1,000 steps on held‑out debugging cases.
  • The best checkpoint was chosen by highest fix_accuracy on Defects4J.

⚡ Inference Speed & Hardware Requirements

Hardware              | Weight format | Avg tokens/sec (generating 200 tokens) | Memory usage
NVIDIA T4 (16GB)      | FP16          | 58 tok/s                               | 2.4 GB
NVIDIA T4 (16GB)      | INT8          | 67 tok/s                               | 1.4 GB
NVIDIA GTX 1060 (6GB) | FP16          | 35 tok/s                               | 2.4 GB
CPU (Intel i7-12700K) | INT8          | 11 tok/s                               | 1.5 GB
Raspberry Pi 4 (4GB)  | INT8 (ONNX)   | 2–3 tok/s                              | 1.2 GB

Recommendation: Use FP16 on any GPU with 4+ GB VRAM. For CPU or low‑memory devices, use INT8 – still acceptable for debugging short code snippets.
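
A back-of-the-envelope check of the recommendation is sketched below. It counts weights only (parameters × bytes per parameter), so measured usage such as the figures in the table will be somewhat higher once activations and runtime overhead are included. The function names are hypothetical, not part of any library.

```python
BYTES_PER_PARAM = {"fp16": 2, "int8": 1}
N_PARAMS = 1.1e9  # parameter count from the card


def weight_gb(fmt: str, n_params: float = N_PARAMS) -> float:
    """Weights-only footprint in decimal GB (no activations or KV cache)."""
    return n_params * BYTES_PER_PARAM[fmt] / 1e9


def pick_format(has_gpu: bool, vram_gb: float = 0.0) -> str:
    """Encode the recommendation: FP16 on GPUs with 4+ GB VRAM, else INT8."""
    return "fp16" if has_gpu and vram_gb >= 4 else "int8"


if __name__ == "__main__":
    for fmt in ("fp16", "int8"):
        print(fmt, f"{weight_gb(fmt):.1f} GB")
    print(pick_format(has_gpu=True, vram_gb=6))   # e.g. a GTX 1060
    print(pick_format(has_gpu=False))             # CPU-only machine
```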


🚀 Step‑by‑Step Usage Guide (with examples)

Installation

pip install transformers torch accelerate sentencepiece

Example 1: Debug a Python NoneType AttributeError

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "neuracoder/neuradebugger-Micro-1.1b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto"
)

buggy_code = """
def get_user_name(user_id):
    user = find_user_by_id(user_id)
    return user.name.lower()
"""
error_trace = "AttributeError: 'NoneType' object has no attribute 'name'"

prompt = f"""Debug the following Python code. The error is:
{error_trace}

Code:
{buggy_code}

Explain the root cause and provide the fixed code."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
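
For causal language models, generate returns the prompt tokens followed by the newly generated ones, so the print call in Example 1 echoes the prompt before the answer. A small helper (hypothetical name) that keeps only the new tokens might look like this:

```python
def new_tokens_only(output_ids, prompt_length: int):
    """Drop the echoed prompt from one generated sequence."""
    return output_ids[prompt_length:]


# With the objects from Example 1 this would be used as:
#   prompt_length = inputs["input_ids"].shape[1]
#   answer_ids = new_tokens_only(outputs[0], prompt_length)
#   print(tokenizer.decode(answer_ids, skip_special_tokens=True))

if __name__ == "__main__":
    # Plain list standing in for a tensor of token ids: 3 prompt tokens, 3 new.
    fake_output = [101, 102, 103, 7, 8, 9]
    print(new_tokens_only(fake_output, 3))
```

Slicing works the same way on a 1‑D tensor as on the plain list used in the demo.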

Example 2: Find logical bug in a function

code = """
def find_max(lst):
    max_val = 0
    for x in lst:
        if x > max_val:
            max_val = x
    return max_val
"""
prompt = f"Review this code for logical bugs. The list may contain negative numbers. Identify any bug and fix it.\n\n{code}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
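
For reference, the bug this prompt targets is that max_val starts at 0, so a list containing only negative numbers returns 0 instead of its largest element. The version below is a hand-written fix for comparison, not model output; raising on an empty list is a design choice assumed here.

```python
def find_max(lst):
    """Return the largest element; raise ValueError on an empty list."""
    if not lst:
        raise ValueError("find_max() requires a non-empty list")
    max_val = lst[0]  # seed with a real element instead of 0
    for x in lst[1:]:
        if x > max_val:
            max_val = x
    return max_val


if __name__ == "__main__":
    print(find_max([-5, -2, -9]))
    print(find_max([3, 1, 2]))
```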

Example 3: Security bug detection

js_code = """
app.get('/user', (req, res) => {
    const id = req.query.id;
    const query = `SELECT * FROM users WHERE id = ${id}`;
    db.execute(query);
});
"""
prompt = f"Find security vulnerabilities in this JavaScript code and suggest fixes:\n{js_code}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Example 4: Explain a race condition

cpp_code = """
int counter = 0;
void increment() { counter++; }
"""
# Include the code in the prompt so the model can see it
prompt = f"Explain why this C++ code has a race condition in a multithreaded environment, and show how to fix it using std::mutex.\n\n{cpp_code}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

⚠️ Limitations & Known Weaknesses

  • Context length 2048 tokens – Cannot debug large files; use chunking or focus on small functions.
  • English‑only – Persian prompts not supported (bilingual version planned).
  • No guarantee of perfect fix – Always review generated fixes; may introduce new edge cases.
  • Best on Python and Java – Shell, PHP, Ruby quality lower; C++ moderate.
  • Not for whole‑system debugging – Works on isolated functions or small modules.
  • Training data up to mid‑2024 – Unaware of very new APIs or language features.
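
One way to work within the 2048-token context is to split a Python file into top-level functions and classes and check each against a token budget. The sketch below uses the standard ast module. The function name, the 64-token allowance for instruction text, and the whitespace-based counter in the demo are all assumptions; in practice count_tokens would wrap the model's tokenizer.

```python
import ast
from typing import Callable, Iterator, Tuple

CONTEXT_LEN = 2048  # from the card


def function_chunks(
    source: str,
    count_tokens: Callable[[str], int],
    max_new_tokens: int = 256,
    prompt_overhead: int = 64,  # assumed allowance for instruction text
) -> Iterator[Tuple[str, str, bool]]:
    """Yield (name, code, fits_in_budget) for each top-level def or class.

    Decorators are not part of the returned code segment.
    """
    budget = CONTEXT_LEN - max_new_tokens - prompt_overhead
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            code = ast.get_source_segment(source, node)
            yield node.name, code, count_tokens(code) <= budget


if __name__ == "__main__":
    src = "import os\n\ndef a(x):\n    return x + 1\n\nclass B:\n    def m(self):\n        return 2\n"
    # Whitespace split is only a stand-in; with the real tokenizer one could
    # pass: lambda s: len(tokenizer(s)["input_ids"])
    for name, code, fits in function_chunks(src, lambda s: len(s.split())):
        print(name, fits)
```

Chunks flagged as not fitting would need further splitting or manual trimming before being sent to the model.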

🗺️ Roadmap & Future Plans

  • Q4 2025: NeuraDebugger-Pro 3B – 4096 context, 20 languages, Persian support.
  • Q1 2026: VS Code extension with real‑time debugging suggestions.
  • Q2 2026: Integration with popular CI/CD pipelines (GitHub Actions).
  • Ongoing: Release of training datasets (debugging instruction pairs) and quantised INT4 versions.

🤝 Contribute & Support the Project

This model is free and open‑source. You can help by:

  • Reporting bugs or suggesting improvements in the Discussions section.
  • Providing new debugging examples (especially real‑world bugs from your projects).
  • Building tools (IDE plugins, local web UI, etc.).
  • Financial sponsorship – contact Neuracoder team.
  • Spreading the word – every user helps us improve.

📜 License & Usage Rights

Apache License 2.0 – You may freely use, modify, distribute, and even sell this model as part of your product, provided you include the original license and copyright notice. No other restrictions.


✍️ Citation

If you use NeuraDebugger-Micro in your research or product, please cite:

@misc{neuracoder2024debugger,
  author       = {{Neuracoder Team} and Mohammad Rezaei and Sara Ahmadi},
  title        = {NeuraDebugger-Micro-1.1B: A Specialized Lightweight Debugging Model from Iran},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/neuracoder/neuradebugger-Micro-1.1b}},
  note         = {Version 1.0, Apache 2.0 License}
}

📞 Contact Neuracoder Team

  • Website: neuracoder.net (coming soon)
  • Email: info@neuracoder.net
  • Telegram: @Neuracoder
  • GitHub: github.com/neura_coder

Made with ❤️ in Iran – Neuracoder Team
Democratising AI debugging – fast, local, and free for everyone.
