constructai committed
Commit aa8b2bc · verified · 1 Parent(s): 4a3726a

Update README.md

Files changed (1):
  1. README.md +163 -7
README.md CHANGED
@@ -7,18 +7,174 @@ base_model:
   - Qwen/Qwen2.5-Coder-7B
   ---
 
- # Kiro 1.0 XCode
 
- ## Model description
 
- **kiro-1.0-7B-XCode** is a 7-billion-parameter language model based on the **Qwen2.5-Coder-7B** architecture. It has been further trained on code-specific data to excel at programming tasks such as code generation, completion, explanation, translation between programming languages, and answering coding questions.
 
- [Write the model description here]
 
- ## Intended uses & limitations
- <!-- What is the model intended for? Where can it be applied, and where should it not be? What are its limitations? -->
 
- [Describe the intended uses and limitations]
 
+ # kiro-1.0-7B-XCode
+
+ <div align="center">
+
+ **kiro-1.0-7B-XCode** is a code-focused language model fine-tuned on top of Qwen2.5-Coder-7B,
+ trained on a mixed dataset of real-world code and instruction pairs.
+
+ [![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-constructai%2Fkiro--1.0--7B--XCode-yellow)](https://huggingface.co/constructai/kiro-1.0-7B-XCode)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue)](https://opensource.org/licenses/Apache-2.0)
+ [![Model Size](https://img.shields.io/badge/Parameters-7B-green)](https://huggingface.co/constructai/kiro-1.0-7B-XCode)
+ [![Base Model](https://img.shields.io/badge/Base-Qwen2.5--Coder--7B-orange)](https://huggingface.co/Qwen/Qwen2.5-Coder-7B)
+
+ </div>
+
+ ---
+
+ ## 📖 Overview
+
+ **kiro-1.0-7B-XCode** is the first model in the **kiro** series by [constructai](https://huggingface.co/constructai).
+
+ This model is specialized for writing, analyzing, and explaining code in Python and JavaScript. It is trained to follow instructions in the `### Instruction → ### Response` format, making it suitable for IDE plugins, coding assistants, and code review tools.
+
+ ---
+
+ ## 🏋️ Training
+
+ | Parameter | Value |
+ |---|---|
+ | Base model | `Qwen/Qwen2.5-Coder-7B` |
+ | Method | QLoRA (4-bit, NF4) + LoRA merge |
+ | LoRA rank | 16 |
+ | LoRA alpha | 32 |
+ | Epochs | 1 |
+ | Learning rate | 2e-4 |
+ | Scheduler | Cosine |
+ | Hardware | NVIDIA RTX A5000 (24 GB) |
+
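The QLoRA recipe in the table above can be sketched with `transformers` and `peft`. This is an illustrative configuration only: `target_modules`, `lora_dropout`, and the compute dtype are assumptions, since the actual training script has not been published.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the base model (the "QLoRA (4-bit, NF4)" row).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)

# LoRA adapter matching the table: rank 16, alpha 32.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,                                        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```

After training, the adapter would be merged back into the base weights (the "+ LoRA merge" row), e.g. with `peft`'s `merge_and_unload()`.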
+ ### Dataset
+
+ The model was trained on 58,000 samples drawn from three sources:
+
+ | Source | Samples | Description |
+ |---|---|---|
+ | `bigcode/the-stack-smol` (Python) | 20,000 | Real-world Python code from GitHub |
+ | `bigcode/the-stack-smol` (JavaScript) | 20,000 | Real-world JavaScript code from GitHub |
+ | `iamtarun/python_code_instructions_18k_alpaca` | 18,000 | Python instruction-response pairs |
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Installation
+
+ ```bash
+ pip install transformers torch accelerate
+ ```
+
+ ### Inference
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_name = "constructai/kiro-1.0-7B-XCode"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ prompt = "### Instruction:\nWrite a Python function that checks if a number is prime.\n\n### Response:\n"
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=512,
+     do_sample=False,
+     repetition_penalty=1.3,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+ response = tokenizer.decode(
+     outputs[0][inputs["input_ids"].shape[1]:],
+     skip_special_tokens=True,
+ )
+ print(response)
+ ```
+
+ ### Prompt Format
+
+ ```
+ ### Instruction:
+ {your request}
+
+ ### Response:
+ ```
+
+ With additional context:
+
+ ```
+ ### Instruction:
+ {your request}
+
+ ### Input:
+ {additional context or code}
+
+ ### Response:
+ ```
+
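Assembling these prompts by hand is error-prone. A tiny helper like the following (a sketch, not part of the released code) produces both variants of the format:

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Assemble a prompt in the ### Instruction / ### Input / ### Response format."""
    parts = [f"### Instruction:\n{instruction}\n"]
    if context:  # optional ### Input block for additional context or code
        parts.append(f"### Input:\n{context}\n")
    parts.append("### Response:\n")
    return "\n".join(parts)

print(build_prompt("Explain this code.", "print('hi')"))
```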
+ ---
+
+ ## 📊 Example
+
+ **Prompt:**
+ ```
+ ### Instruction:
+ Write a Python function that checks if a number is prime.
+
+ ### Response:
+ ```
+
+ **kiro-1.0 output:**
+ ```python
+ def is_prime(num):
+     for i in range(2, num):
+         if (num % i) == 0:
+             return False
+     return True
+ ```
+
+ ---
+
+ ## 🗺️ Roadmap
+
+ This is the first release of the kiro model series. Upcoming versions:
+
+ - **kiro-1.5-7B-XCode** — larger dataset (500k+ samples), improved benchmarks
+ - **kiro-2.0-7B-XCode** — instruction tuning + DPO alignment
+ - **kiro-3.0-14B-XCode** — larger base model
+ - **ZuKU** — custom architecture trained from scratch (100–200M parameters)
+
+ ---
+
+ ## ⚠️ Limitations
+
+ - Trained for 1 epoch — may produce repetitions in long outputs (use `repetition_penalty=1.3`)
+ - Optimized for Python and JavaScript — other languages have limited support
+ - This is v1.0 — quality will improve in future releases
+
+ ---
+
+ ## 📜 License
+
+ This model is released under the **Apache 2.0** license, inherited from the base model Qwen2.5-Coder-7B.
+
+ ---
+
+ ## 🙏 Acknowledgements
+
+ - [Qwen Team](https://huggingface.co/Qwen) for the excellent base model
+ - [BigCode](https://huggingface.co/bigcode) for The Stack dataset
+ - [Hugging Face](https://huggingface.co) for the infrastructure
+
+ ---
+
+ <div align="center">
+ Made with ❤️ by <a href="https://huggingface.co/constructai">constructai</a>
+ </div>