---
language:
- en
license: mit
library_name: transformers
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- code
- code-editing
- merge
- fastedit
- qwen2
pipeline_tag: text-generation
---

# FastEdit 1.7B

A fine-tuned **Qwen2.5-Coder-1.5B-Instruct** for merging code edit snippets into source files. Given an original code chunk (~35 lines) and a compact edit snippet with context markers, the model produces the merged result.

This model is designed to be used with the [FastEdit](https://github.com/parcadei/fastedit) toolkit, which handles AST scoping, deterministic edits, and post-processing. **Using the model directly requires the exact prompt format described below.**

## Model variants

All variants are in this repo under subfolders:

| Subfolder | Format | Size | Use case |
|-----------|--------|------|----------|
| `bf16/` | BF16 safetensors | 3.2 GB | Fine-tuning, reference, GPU serving via vLLM/TGI |
| `mlx-8bit/` | MLX 8-bit | 1.7 GB | Apple Silicon (recommended for local use) |
| `gguf/` | GGUF Q8_0 | 1.7 GB | llama.cpp, LM Studio, Ollama |
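
Because each variant lives in its own subfolder, you can fetch just the one you need instead of the whole repo. A minimal sketch using `huggingface_hub` (the repo id `continuous-lab/FastEdit` is taken from the usage example below; the exact filenames inside each subfolder are not listed on this card):

```python
from huggingface_hub import snapshot_download

# Download only the MLX 8-bit variant (swap the pattern for "bf16/*" or "gguf/*").
local_dir = snapshot_download(
    repo_id="continuous-lab/FastEdit",
    allow_patterns=["mlx-8bit/*"],
)
print(local_dir)  # the files land under <local_dir>/mlx-8bit/
```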

## Prompt format

The model expects a specific 2-message chat format. **Using a different prompt will produce poor results.**

### System message

```
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated. /no_think
```

The `/no_think` suffix disables Qwen's thinking mode — without it, the model may emit thousands of reasoning tokens before producing output.

### User message

```
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.
```
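
The `{original_code}` and `{update_snippet}` placeholders are filled verbatim. A small helper that assembles both messages (a sketch written for this card, not part of the FastEdit toolkit; the constant and function names are made up for illustration):

```python
SYSTEM_PROMPT = (
    "You are a coding assistant that helps merge code updates, "
    "ensuring every modification is fully integrated. /no_think"
)

USER_TEMPLATE = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code."""

def build_messages(original_code: str, update_snippet: str) -> list[dict]:
    # Two-message chat format expected by the model.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_TEMPLATE.format(
            original_code=original_code, update_snippet=update_snippet)},
    ]
```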

### Expected output

The model outputs the merged code wrapped in `<updated-code>` tags:

```
<updated-code>
def process(data):
    try:
        result = transform(data)
        return result
    except Error as e:
        return {"error": str(e)}
</updated-code>
```

### Complete example

**Original code** (what tree-sitter extracts for the target function):

```python
def process(data):
    result = transform(data)
    return result
```

**Edit snippet** (what the user/agent writes):

```python
def process(data):
    try:
        # ... existing code ...
    except Error as e:
        return {"error": str(e)}
```

**Model output:**

```python
<updated-code>
def process(data):
    try:
        result = transform(data)
        return result
    except Error as e:
        return {"error": str(e)}
</updated-code>
```

The model understands `# ... existing code ...` markers (and language-specific variants like `// ... existing code ...`) as instructions to preserve the original lines in that region.

## How it fits into FastEdit

In production, the model is the **fallback** — not the primary path:

1. **AST scoping** — tree-sitter finds the target function by name (~35 lines), so the model never sees the whole file
2. **Deterministic text-match** — 74% of edits are resolved by matching context lines and splicing in new lines (0 tokens, <1ms)
3. **Model merge** — the remaining 26% of edits (structural changes like wrapping in try/catch, full rewrites) go to this model

The model only ever processes ~35-line chunks. It was trained on function-scoped edits, not whole files. Feeding it large inputs will degrade quality.

## Using without FastEdit

If you want to use the model directly (without the toolkit), you need to:

1. **Scope the input yourself** — extract only the target function/class, not the whole file
2. **Use the exact prompt format** above — different prompts will produce different (worse) results
3. **Parse the output** — extract text between `<updated-code>` and `</updated-code>` tags
4. **Handle edge cases** — the model may emit `<think>` blocks (strip them), use variant tag names (`<update-code>`, `<updated_code>`), or truncate output on long functions
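
For step 1, FastEdit itself uses tree-sitter across 13 languages. If you only need Python, the standard-library `ast` module is enough for a rough equivalent; `extract_function` below is a hypothetical helper written for this card, not part of FastEdit:

```python
import ast

def extract_function(source: str, name: str) -> str | None:
    """Return the source of the named function, or None if not found."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            # get_source_segment needs the original source to recover the exact text
            return ast.get_source_segment(source, node)
    return None
```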

For step 2, a complete generation example with `transformers` (output parsing is sketched after the block):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# BF16 (GPU / fine-tuning)
model = AutoModelForCausalLM.from_pretrained("continuous-lab/FastEdit", subfolder="bf16", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("continuous-lab/FastEdit", subfolder="bf16")

messages = [
    {"role": "system", "content": "You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated. /no_think"},
    {"role": "user", "content": """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>def process(data):
    result = transform(data)
    return result</code>

<update>def process(data):
    try:
        # ... existing code ...
    except Error as e:
        return {"error": str(e)}</update>

Provide the complete updated code."""}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)  # greedy decoding
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
# Parse: extract text between <updated-code> and </updated-code>
```
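
For steps 3 and 4, a tolerant parser helps in practice. The sketch below (`parse_merge_output` is a hypothetical helper, not a FastEdit API) strips `<think>` blocks and accepts the tag variants mentioned above:

```python
import re

def parse_merge_output(raw: str) -> str | None:
    """Extract merged code from the model's raw completion, or None on failure."""
    # Drop any thinking block the model may emit despite /no_think.
    raw = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Accept <updated-code>, <update-code>, and <updated_code> variants.
    match = re.search(
        r"<(updated-code|update-code|updated_code)>(.*?)</\1>",
        raw,
        flags=re.DOTALL,
    )
    if match is None:
        return None  # no closing tag, likely truncated output; treat as a failed merge
    return match.group(2).strip("\n")
```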

## Training

- **Base model**: Qwen2.5-Coder-1.5B-Instruct
- **Task**: Code edit merging across 13 languages

## Evaluation

Tested on 22 structurally distinct edit patterns (73 cases) across 13 languages:

| Path | Accuracy | Avg tokens | Avg latency |
|------|----------|------------|-------------|
| Deterministic (74% of edits) | 100% | 0 | <1ms |
| Model (26% of edits) | 92% | ~40 | ~500ms |
| **Combined** | **~98%** | **~10** | **~130ms** |

The combined row is consistent with the per-path figures weighted by routing share: 0.74 × 100% + 0.26 × 92% ≈ 98% accuracy, 0.26 × 40 ≈ 10 tokens, and 0.74 × <1ms + 0.26 × 500ms ≈ 130ms per edit.

Per-language model accuracy (156-example benchmark):

| Language | Accuracy |
|----------|----------|
| Python, Java, Kotlin, C, PHP | 92% |
| JavaScript, TypeScript, Rust, Swift | 85% |
| Go, C++, Ruby | 77% |

## Limitations

- Performance degrades on inputs longer than ~100 lines.
- Does not handle whole-file edits well — use the FastEdit toolkit's AST scoping.
- The edit snippet must use `# ... existing code ...` markers (or language-equivalent) for context preservation. Without markers, the model treats the entire snippet as a replacement.
- Languages not in the training set may work but are untested.

## License

MIT