AIencoder committed on
Commit 0249933 · verified
1 Parent(s): 589e50e

Upload folder using huggingface_hub

Files changed (7)
  1. README.md +74 -7
  2. app.py +670 -0
  3. compatibility.py +321 -0
  4. config_generator.py +328 -0
  5. model_info.py +307 -0
  6. notebook_generator.py +484 -0
  7. requirements.txt +5 -0
README.md CHANGED
@@ -1,12 +1,79 @@
1
  ---
2
  title: Forgekit
3
- emoji: 🐠
4
- colorFrom: gray
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 6.5.1
8
  app_file: app.py
9
- pinned: false
 
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
  title: Forgekit
3
  app_file: app.py
4
+ sdk: gradio
5
+ sdk_version: 5.42.0
6
  ---
7
 
8
+ # 🔥 ForgeKit
9
+
10
+ **Forge your perfect AI model — no code required.**
11
+
12
+ ForgeKit is an open-source platform that lets anyone create custom AI models by merging existing ones. No coding, no complex setup — just pick your models, configure the merge, and get a ready-to-run Colab notebook.
13
+
14
+ ## ✨ Features
15
+
16
+ ### ⚒️ Merge Builder
17
+ - Add models by ID and instantly check architecture compatibility
18
+ - Choose from 6 merge methods: DARE-TIES, TIES, SLERP, Linear, Task Arithmetic, Passthrough
19
+ - Adjust weights and densities with smart presets
20
+ - Auto-suggest base model and tokenizer
21
+ - Generate ready-to-run Colab notebooks with one click
22
+
23
+ ### 🔍 Model Explorer
24
+ - Search HuggingFace Hub for models
25
+ - Filter by architecture type
26
+ - View detailed model specs (hidden size, layers, vocab, etc.)
27
+
28
+ ### 📦 GGUF Quantizer
29
+ - Convert any HF model to GGUF format
30
+ - Multiple quantization levels (Q8_0, Q5_K_M, Q4_K_M, etc.)
31
+ - Ready-to-run Colab notebook generation
32
+
33
+ ### 🚀 Deploy
34
+ - Generate deployment files for HuggingFace Spaces
35
+ - Gradio chat interface or Docker + llama.cpp options
36
+ - Auto-generated app.py and README
37
+
38
+ ### 🏆 Community Leaderboard
39
+ - Browse community-created merges
40
+ - Submit your own merged models
41
+ - Discover popular merge recipes
42
+
43
+ ## 🛠️ Supported Merge Methods
44
+
45
+ | Method | Models | Best For |
46
+ |--------|--------|----------|
47
+ | **DARE-TIES** | 2-10 | Combining specialists (coding + math) |
48
+ | **TIES** | 2-10 | Resolving parameter interference |
49
+ | **SLERP** | 2 | Smooth two-model interpolation |
50
+ | **Linear** | 2-10 | Simple weighted averaging |
51
+ | **Task Arithmetic** | 1-10 | Adding/removing capabilities |
52
+ | **Passthrough** | 1-10 | Layer stacking (Frankenmerge) |
53
+
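The simplest entry in the table above, Linear, is plain weighted averaging of parameter tensors. A minimal sketch of the idea on flat parameter lists (a hypothetical illustration only; the actual merging is performed by mergekit inside the generated notebook):

```python
def linear_merge(params_per_model, weights):
    """Weighted average of flat parameter lists, one list per model."""
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so the weights sum to 1
    n = len(params_per_model[0])
    return [sum(w * p[i] for w, p in zip(norm, params_per_model)) for i in range(n)]

# Two models with equal weight: the result is the element-wise mean.
merged = linear_merge([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])  # → [2.0, 3.0]
```

The other methods refine this same averaging step: TIES resolves sign conflicts between parameter deltas, and DARE additionally drops a random fraction of each delta (the "density") before rescaling.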
54
+ ## 🚀 How It Works
55
+
56
+ 1. **Add Models** — Enter HuggingFace model IDs
57
+ 2. **Check Compatibility** — ForgeKit verifies architectures match
58
+ 3. **Configure** — Choose method, adjust weights, pick presets
59
+ 4. **Generate** — Get a Colab notebook with everything pre-filled
60
+ 5. **Run** — Open in Colab, click Run All, wait for your model
61
+ 6. **Ship** — Auto-upload to HF Hub + optional GGUF + Space deployment
62
+
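Under the hood, the "Generate" step writes a mergekit YAML config into the notebook. A representative sketch for a two-model DARE-TIES merge (model IDs and values are illustrative, not necessarily what ForgeKit emits verbatim):

```yaml
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B-Instruct
models:
  - model: Qwen/Qwen2.5-Coder-7B-Instruct
    parameters:
      weight: 0.5
      density: 0.7
  - model: Qwen/Qwen2.5-Math-7B-Instruct
    parameters:
      weight: 0.5
      density: 0.6
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
```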
63
+ ## 📋 Requirements
64
+
65
+ The generated Colab notebooks handle all dependencies. You just need:
66
+ - A Google account (for Colab)
67
+ - A HuggingFace account (for model access and upload)
68
+ - An HF token (for gated models and uploading)
69
+
70
+ ## 🧑‍💻 Built By
71
+
72
+ **[AIencoder](https://huggingface.co/AIencoder)** — AI/ML Engineer
73
+
74
+ - [Portfolio](https://aiencoder-portfolio.static.hf.space)
75
+ - [GitHub](https://github.com/Ary5272)
76
+
77
+ ## 📄 License
78
+
79
+ MIT — use it, fork it, improve it.
app.py ADDED
@@ -0,0 +1,670 @@
1
+ """ForgeKit — Forge your perfect AI model, no code required.
2
+
3
+ Main Gradio application with 5 tabs:
4
+ 1. Merge Builder — Visual merge configuration + notebook generation
5
+ 2. Model Explorer — Search and discover HF models
6
+ 3. GGUF Quantizer — Generate quantization notebooks
7
+ 4. Deploy — Generate deployment files for HF Spaces
8
+ 5. Leaderboard — Community merge rankings
9
+ """
10
+
11
+ import gradio as gr
12
+ import json
13
+ import tempfile
14
+ import os
15
+
16
+ from forgekit.model_info import fetch_model_info, search_models
17
+ from forgekit.compatibility import check_compatibility, quick_check
18
+ from forgekit.config_generator import (
19
+ MergeConfig, generate_yaml, generate_from_preset,
20
+ MERGE_METHODS, PRESETS,
21
+ )
22
+ from forgekit.notebook_generator import generate_merge_notebook, save_notebook
23
+
24
+ # ===== THEME =====
25
+ theme = gr.themes.Base(
26
+ primary_hue=gr.themes.colors.amber,
27
+ secondary_hue=gr.themes.colors.purple,
28
+ neutral_hue=gr.themes.colors.gray,
29
+ font=gr.themes.GoogleFont("Inter"),
30
+ font_mono=gr.themes.GoogleFont("JetBrains Mono"),
31
+ ).set(
32
+ body_background_fill="#0a0a0f",
33
+ body_background_fill_dark="#0a0a0f",
34
+ body_text_color="#e5e5e5",
35
+ body_text_color_dark="#e5e5e5",
36
+ block_background_fill="#111118",
37
+ block_background_fill_dark="#111118",
38
+ block_border_color="#1f1f2e",
39
+ block_border_color_dark="#1f1f2e",
40
+ block_label_text_color="#9ca3af",
41
+ block_label_text_color_dark="#9ca3af",
42
+ block_title_text_color="#e5e5e5",
43
+ block_title_text_color_dark="#e5e5e5",
44
+ input_background_fill="#16161f",
45
+ input_background_fill_dark="#16161f",
46
+ input_border_color="#2a2a3a",
47
+ input_border_color_dark="#2a2a3a",
48
+ button_primary_background_fill="linear-gradient(to right, #f59e0b, #f97316)",
49
+ button_primary_background_fill_dark="linear-gradient(to right, #f59e0b, #f97316)",
50
+ button_primary_text_color="#ffffff",
51
+ button_primary_text_color_dark="#ffffff",
52
+ button_secondary_background_fill="#1f1f2e",
53
+ button_secondary_background_fill_dark="#1f1f2e",
54
+ button_secondary_text_color="#e5e5e5",
55
+ button_secondary_text_color_dark="#e5e5e5",
56
+ )
57
+
58
+ CSS = """
59
+ .forgekit-header { text-align: center; padding: 1.5rem 0 1rem; }
60
+ .forgekit-header h1 { font-size: 2.5rem; font-weight: 800; margin: 0;
61
+ background: linear-gradient(135deg, #a855f7, #ec4899, #f59e0b);
62
+ -webkit-background-clip: text; -webkit-text-fill-color: transparent; }
63
+ .forgekit-header p { color: #9ca3af; font-size: 1rem; margin-top: 0.25rem; }
64
+ .status-ok { color: #4ade80; font-weight: 600; }
65
+ .status-warn { color: #fbbf24; font-weight: 600; }
66
+ .status-err { color: #f87171; font-weight: 600; }
67
+ .method-card { border: 1px solid #2a2a3a; border-radius: 12px; padding: 1rem; margin: 0.25rem 0; }
68
+ footer { display: none !important; }
69
+ """
70
+
71
+ # ===== CALLBACKS =====
72
+
73
+ def check_models(models_text: str, token: str) -> tuple[str, str]:
74
+ """Check model compatibility and return report + quick status."""
75
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
76
+ if len(models) < 2:
77
+ return "⚠️ Add at least 2 models (one per line)", ""
78
+
79
+ tok = token.strip() if token else None
80
+ report = check_compatibility(models, token=tok)
81
+ quick = quick_check(models, token=tok)
82
+ return report.to_markdown(), quick
83
+
84
+
85
+ def generate_config(
86
+ models_text: str, method: str, base_model: str,
87
+ weights_text: str, densities_text: str,
88
+ tokenizer_src: str, dtype: str,
89
+ slerp_t: float, int8_mask: bool, normalize: bool,
90
+ ) -> str:
91
+ """Generate YAML config from UI inputs."""
92
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
93
+ if not models:
94
+ return "# Add models first"
95
+
96
+ # Parse weights
97
+ weights = []
98
+ if weights_text.strip():
99
+ try:
100
+ weights = [float(w.strip()) for w in weights_text.split(",")]
101
+ except ValueError:
102
+ return "# Invalid weights — use comma-separated numbers"
103
+
104
+ densities = []
105
+ if densities_text.strip():
106
+ try:
107
+ densities = [float(d.strip()) for d in densities_text.split(",")]
108
+ except ValueError:
109
+ return "# Invalid densities — use comma-separated numbers"
110
+
111
+ config = MergeConfig(
112
+ method=method,
113
+ models=models,
114
+ base_model=base_model.strip(),
115
+ weights=weights,
116
+ densities=densities,
117
+ tokenizer_source=tokenizer_src.strip(),
118
+ dtype=dtype,
119
+ slerp_t=slerp_t,
120
+ int8_mask=int8_mask,
121
+ normalize=normalize,
122
+ )
123
+
124
+ return generate_yaml(config)
125
+
126
+
127
+ def apply_preset(preset_name: str, models_text: str) -> tuple[str, str]:
128
+ """Apply a preset and return weights + densities strings."""
129
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
130
+ if not models:
131
+ return "", ""
132
+
133
+ preset = PRESETS.get(preset_name)
134
+ if not preset:
135
+ return "", ""
136
+
137
+ weights, densities = preset.apply(models)
138
+ return ", ".join(str(w) for w in weights), ", ".join(str(d) for d in densities)
139
+
140
+
141
+ def generate_notebook_file(
142
+ models_text: str, method: str, base_model: str,
143
+ weights_text: str, densities_text: str,
144
+ tokenizer_src: str, dtype: str,
145
+ slerp_t: float, int8_mask: bool, normalize: bool,
146
+ output_name: str, hf_user: str,
147
+ inc_quantize: bool, inc_deploy: bool,
148
+ quant_types_text: str,
149
+ ) -> str | None:
150
+ """Generate and save a Colab notebook, return file path."""
151
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
152
+ if not models:
153
+ return None
154
+
155
+ weights = []
156
+ if weights_text.strip():
157
+ try:
158
+ weights = [float(w.strip()) for w in weights_text.split(",")]
159
+ except ValueError:
160
+ pass
161
+
162
+ densities = []
163
+ if densities_text.strip():
164
+ try:
165
+ densities = [float(d.strip()) for d in densities_text.split(",")]
166
+ except ValueError:
167
+ pass
168
+
169
+ quant_types = [q.strip() for q in quant_types_text.split(",") if q.strip()]
170
+ if not quant_types:
171
+ quant_types = ["Q5_K_M", "Q4_K_M"]
172
+
173
+ config = MergeConfig(
174
+ method=method,
175
+ models=models,
176
+ base_model=base_model.strip(),
177
+ weights=weights,
178
+ densities=densities,
179
+ tokenizer_source=tokenizer_src.strip(),
180
+ dtype=dtype,
181
+ slerp_t=slerp_t,
182
+ int8_mask=int8_mask,
183
+ normalize=normalize,
184
+ )
185
+
186
+ name = output_name.strip() or "ForgeKit-Merged-Model"
187
+ user = hf_user.strip()
188
+
189
+ nb = generate_merge_notebook(
190
+ config,
191
+ output_model_name=name,
192
+ hf_username=user,
193
+ include_quantize=inc_quantize,
194
+ include_deploy=inc_deploy,
195
+ quant_types=quant_types,
196
+ )
197
+
198
+ path = os.path.join(tempfile.gettempdir(), f"{name}_merge.ipynb")
199
+ save_notebook(nb, path)
200
+ return path
201
+
202
+
203
+ def search_hf_models(query: str, arch_filter: str, sort_by: str) -> str:
204
+ """Search HF Hub and return formatted results."""
205
+ if not query.strip():
206
+ return "Enter a search query"
207
+
208
+ results = search_models(
209
+ query=query.strip(),
210
+ architecture=arch_filter if arch_filter != "Any" else "",
211
+ limit=15,
212
+ sort=sort_by.lower(),
213
+ )
214
+
215
+ if not results:
216
+ return "No models found"
217
+
218
+ lines = ["| Model | Architecture | Downloads |", "|-------|-------------|-----------|"]
219
+ for r in results:
220
+ mid = r.get("model_id", "")
221
+ mtype = r.get("model_type", "—")
222
+ dl = r.get("downloads", 0)
223
+ dl_str = f"{dl:,}" if dl else "—"
224
+ lines.append(f"| `{mid}` | {mtype} | {dl_str} |")
225
+
226
+ return "\n".join(lines)
227
+
228
+
229
+ def fetch_model_details(model_id: str) -> str:
230
+ """Fetch and display detailed model info."""
231
+ if not model_id.strip():
232
+ return "Enter a model ID"
233
+
234
+ info = fetch_model_info(model_id.strip())
235
+ if info.error:
236
+ return f"❌ {info.error}"
237
+
238
+ return f"""### {info.model_id}
239
+
240
+ | Property | Value |
241
+ |----------|-------|
242
+ | **Architecture** | `{info.model_type}` |
243
+ | **Hidden Size** | {info.hidden_size} |
244
+ | **Layers** | {info.num_hidden_layers} |
245
+ | **Vocab Size** | {info.vocab_size:,} |
246
+ | **Intermediate** | {info.intermediate_size} |
247
+ | **Attention Heads** | {info.num_attention_heads} |
248
+ | **KV Heads** | {info.num_key_value_heads} |
249
+ | **Max Position** | {info.max_position_embeddings:,} |
250
+ | **dtype** | {info.torch_dtype} |
251
+ | **Downloads** | {info.downloads:,} |
252
+ | **Likes** | {info.likes} |
253
+ | **Params (est.)** | {info.param_estimate} |
254
+ | **RAM for merge** | {info.ram_estimate_gb} GB |
255
+ | **Gated** | {'Yes' if info.gated else 'No'} |
256
+ | **trust_remote_code** | {'Required' if info.trust_remote_code else 'No'} |"""
257
+
258
+
259
+ def suggest_base(models_text: str, token: str) -> tuple[str, str]:
260
+ """Auto-suggest base model and tokenizer from compatibility check."""
261
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
262
+ if len(models) < 2:
263
+ return "", ""
264
+ tok = token.strip() if token else None
265
+ report = check_compatibility(models, token=tok)
266
+ return report.suggested_base, report.suggested_tokenizer
267
+
268
+
269
+ # ===== LEADERBOARD DATA =====
270
+ # Seeded with your existing merges
271
+ LEADERBOARD = [
272
+ {
273
+ "name": "Qwen2.5CMR-7B", "author": "AIencoder",
274
+ "method": "DARE-TIES", "base": "Qwen2.5-7B-Instruct",
275
+ "models": "Coder-7B + Math-7B", "likes": 0,
276
+ "link": "https://huggingface.co/AIencoder/Qwen2.5CMR",
277
+ },
278
+ {
279
+ "name": "Logic-Coder-7B", "author": "AIencoder",
280
+ "method": "DARE-TIES", "base": "Mistral-7B",
281
+ "models": "OpenHermes + CodeInstruct", "likes": 1,
282
+ "link": "https://huggingface.co/AIencoder/Logic-Coder-7B",
283
+ },
284
+ {
285
+ "name": "HermesMath-7B-TIES", "author": "AIencoder",
286
+ "method": "TIES", "base": "Mistral-7B",
287
+ "models": "Hermes + MetaMath", "likes": 1,
288
+ "link": "https://huggingface.co/AIencoder/HermesMath-7B-TIES",
289
+ },
290
+ {
291
+ "name": "Hermes-2-Pro-GodCoder", "author": "AIencoder",
292
+ "method": "DARE-TIES", "base": "Mistral-7B",
293
+ "models": "Hermes-2-Pro + CodeModels", "likes": 1,
294
+ "link": "https://huggingface.co/AIencoder/Hermes-2-Pro-Mistral-7B-GodCoder",
295
+ },
296
+ ]
297
+
298
+
299
+ def get_leaderboard() -> str:
300
+ """Return leaderboard as markdown table."""
301
+ lines = [
302
+ "| # | Model | Author | Method | Source Models | Likes |",
303
+ "|---|-------|--------|--------|---------------|-------|",
304
+ ]
305
+ sorted_lb = sorted(LEADERBOARD, key=lambda x: -x["likes"])
306
+ for i, entry in enumerate(sorted_lb, 1):
307
+ name = f"[{entry['name']}]({entry['link']})"
308
+ lines.append(
309
+ f"| {i} | {name} | {entry['author']} | {entry['method']} | "
310
+ f"{entry['models']} | {entry['likes']} |"
311
+ )
312
+ return "\n".join(lines)
313
+
314
+
315
+ # ============================================================
316
+ # GRADIO APP
317
+ # ============================================================
318
+
319
+ with gr.Blocks(title="ForgeKit — Model Merging Platform", theme=theme, css=CSS) as demo:
320
+
321
+ # ===== HEADER =====
322
+ gr.HTML("""
323
+ <div class="forgekit-header">
324
+ <h1>🔥 ForgeKit</h1>
325
+ <p>Forge your perfect AI model — no code required</p>
326
+ </div>
327
+ """)
328
+
329
+ with gr.Tabs():
330
+
331
+ # =====================================================
332
+ # TAB 1: MERGE BUILDER
333
+ # =====================================================
334
+ with gr.Tab("⚒️ Merge Builder", id="builder"):
335
+ gr.Markdown("### Build your merge configuration and generate a ready-to-run Colab notebook")
336
+
337
+ with gr.Row():
338
+ # LEFT COLUMN: Inputs
339
+ with gr.Column(scale=3):
340
+ models_input = gr.Textbox(
341
+ label="Models to Merge (one per line)",
342
+ placeholder="Qwen/Qwen2.5-Coder-7B-Instruct\nQwen/Qwen2.5-Math-7B-Instruct",
343
+ lines=5,
344
+ )
345
+ hf_token = gr.Textbox(
346
+ label="HF Token (optional β€” for gated models)",
347
+ type="password",
348
+ placeholder="hf_...",
349
+ )
350
+
351
+ with gr.Row():
352
+ check_btn = gr.Button("🔍 Check Compatibility", variant="secondary")
353
+ suggest_btn = gr.Button("💡 Auto-Suggest Base", variant="secondary")
354
+
355
+ compat_status = gr.Textbox(label="Quick Status", interactive=False, max_lines=2)
356
+ compat_report = gr.Markdown(label="Compatibility Report")
357
+
358
+ # RIGHT COLUMN: Configuration
359
+ with gr.Column(scale=3):
360
+ method_dd = gr.Dropdown(
361
+ choices=list(MERGE_METHODS.keys()),
362
+ value="dare_ties",
363
+ label="Merge Method",
364
+ )
365
+ method_info_md = gr.Markdown(
366
+ value=f"**DARE-TIES** — {MERGE_METHODS['dare_ties']['description']}"
367
+ )
368
+ base_model = gr.Textbox(
369
+ label="Base Model",
370
+ placeholder="Qwen/Qwen2.5-7B-Instruct",
371
+ )
372
+ tokenizer_src = gr.Textbox(
373
+ label="Tokenizer Source",
374
+ placeholder="Same as base model (leave blank to auto-fill)",
375
+ )
376
+
377
+ with gr.Row():
378
+ weights_input = gr.Textbox(label="Weights (comma-separated)", placeholder="0.5, 0.5")
379
+ densities_input = gr.Textbox(label="Densities (comma-separated)", placeholder="0.7, 0.6")
380
+
381
+ with gr.Row():
382
+ preset_dd = gr.Dropdown(
383
+ choices=list(PRESETS.keys()),
384
+ label="Apply Preset",
385
+ scale=2,
386
+ )
387
+ preset_btn = gr.Button("Apply", variant="secondary", scale=1)
388
+
389
+ with gr.Row():
390
+ dtype_dd = gr.Dropdown(choices=["bfloat16", "float16", "float32"], value="bfloat16", label="dtype")
391
+ slerp_t = gr.Slider(0, 1, value=0.5, step=0.05, label="SLERP t", visible=False)
392
+
393
+ with gr.Row():
394
+ int8_mask = gr.Checkbox(label="int8_mask", value=True)
395
+ normalize_cb = gr.Checkbox(label="normalize", value=True)
396
+
397
+ gr.Markdown("---")
398
+ gr.Markdown("### Output")
399
+
400
+ with gr.Row():
401
+ with gr.Column(scale=3):
402
+ yaml_output = gr.Code(label="Generated YAML Config", language="yaml", lines=15)
403
+ gen_yaml_btn = gr.Button("📋 Generate YAML", variant="primary", size="lg")
404
+
405
+ with gr.Column(scale=3):
406
+ gr.Markdown("#### Notebook Settings")
407
+ output_name = gr.Textbox(label="Model Name", placeholder="My-Merged-7B")
408
+ hf_username = gr.Textbox(label="HF Username", placeholder="AIencoder")
409
+ with gr.Row():
410
+ inc_quant = gr.Checkbox(label="Include GGUF Quantization", value=True)
411
+ inc_deploy = gr.Checkbox(label="Include HF Deployment", value=True)
412
+ quant_types = gr.Textbox(label="Quant Types", value="Q5_K_M, Q4_K_M")
413
+ gen_nb_btn = gr.Button("🚀 Generate Colab Notebook", variant="primary", size="lg")
414
+ nb_file = gr.File(label="Download Notebook")
415
+
416
+ # === EVENTS ===
417
+ check_btn.click(
418
+ check_models, [models_input, hf_token], [compat_report, compat_status]
419
+ )
420
+ suggest_btn.click(
421
+ suggest_base, [models_input, hf_token], [base_model, tokenizer_src]
422
+ )
423
+ preset_btn.click(
424
+ apply_preset, [preset_dd, models_input], [weights_input, densities_input]
425
+ )
426
+ gen_yaml_btn.click(
427
+ generate_config,
428
+ [models_input, method_dd, base_model, weights_input, densities_input,
429
+ tokenizer_src, dtype_dd, slerp_t, int8_mask, normalize_cb],
430
+ yaml_output,
431
+ )
432
+ gen_nb_btn.click(
433
+ generate_notebook_file,
434
+ [models_input, method_dd, base_model, weights_input, densities_input,
435
+ tokenizer_src, dtype_dd, slerp_t, int8_mask, normalize_cb,
436
+ output_name, hf_username, inc_quant, inc_deploy, quant_types],
437
+ nb_file,
438
+ )
439
+
440
+ # Method change: show/hide SLERP slider + update description
441
+ def on_method_change(m):
442
+ info = MERGE_METHODS.get(m, {})
443
+ desc = f"**{info.get('name', m)}** — {info.get('description', '')}"
444
+ show_slerp = m == "slerp"
445
+ return desc, gr.update(visible=show_slerp)
446
+
447
+ method_dd.change(on_method_change, method_dd, [method_info_md, slerp_t])
448
+
449
+ # =====================================================
450
+ # TAB 2: MODEL EXPLORER
451
+ # =====================================================
452
+ with gr.Tab("🔍 Model Explorer", id="explorer"):
453
+ gr.Markdown("### Search and discover models on HuggingFace Hub")
454
+
455
+ with gr.Row():
456
+ search_query = gr.Textbox(label="Search", placeholder="qwen coder instruct", scale=3)
457
+ arch_filter = gr.Dropdown(
458
+ choices=["Any", "llama", "qwen2", "mistral", "gemma2", "phi3", "starcoder2"],
459
+ value="Any", label="Architecture", scale=1,
460
+ )
461
+ sort_dd = gr.Dropdown(choices=["Downloads", "Likes", "Modified"], value="Downloads", label="Sort", scale=1)
462
+ search_btn = gr.Button("🔍 Search", variant="primary", scale=1)
463
+
464
+ search_results = gr.Markdown(label="Results")
465
+
466
+ gr.Markdown("---")
467
+ gr.Markdown("### Model Details")
468
+ with gr.Row():
469
+ detail_input = gr.Textbox(label="Model ID", placeholder="Qwen/Qwen2.5-Coder-7B-Instruct", scale=3)
470
+ detail_btn = gr.Button("📋 Fetch Details", variant="secondary", scale=1)
471
+ detail_output = gr.Markdown()
472
+
473
+ search_btn.click(search_hf_models, [search_query, arch_filter, sort_dd], search_results)
474
+ detail_btn.click(fetch_model_details, detail_input, detail_output)
475
+
476
+ # =====================================================
477
+ # TAB 3: GGUF QUANTIZER
478
+ # =====================================================
479
+ with gr.Tab("📦 GGUF Quantizer", id="quantizer"):
480
+ gr.Markdown("""### Generate a quantization notebook for any HF model
481
+ Convert any HuggingFace model to GGUF format for use with llama.cpp, Ollama, LM Studio, etc.""")
482
+
483
+ q_model = gr.Textbox(label="Model ID", placeholder="AIencoder/Qwen2.5CMR-7B")
484
+ q_username = gr.Textbox(label="Your HF Username", placeholder="AIencoder")
485
+
486
+ gr.Markdown("#### Quantization Levels")
487
+ gr.Markdown("""
488
+ | Type | Size (7B) | Quality | Best For |
489
+ |------|----------|---------|----------|
490
+ | Q8_0 | ~7.5 GB | Best | Maximum quality |
491
+ | Q6_K | ~5.5 GB | Great | Good balance |
492
+ | **Q5_K_M** | **~5 GB** | **Good** | **Recommended** |
493
+ | Q4_K_M | ~4 GB | Decent | Memory-constrained |
494
+ | IQ4_XS | ~3.5 GB | Fair | Extreme compression |
495
+ """)
496
+ q_types = gr.Textbox(label="Quant Types (comma-separated)", value="Q8_0, Q5_K_M, Q4_K_M")
497
+
498
+ q_btn = gr.Button("📦 Generate Quantization Notebook", variant="primary", size="lg")
499
+ q_file = gr.File(label="Download Notebook")
500
+
501
+ def gen_quant_notebook(model_id, username, qtypes_text):
502
+ if not model_id.strip():
503
+ return None
504
+ qtypes = [q.strip() for q in qtypes_text.split(",") if q.strip()]
505
+ name = model_id.strip().split("/")[-1]
506
+ config = MergeConfig(method="linear", models=[model_id.strip()])
507
+ nb = generate_merge_notebook(
508
+ config,
509
+ output_model_name=name,
510
+ hf_username=username.strip(),
511
+ include_quantize=True,
512
+ include_deploy=False,
513
+ quant_types=qtypes,
514
+ )
515
+ # TODO: strip the merge cells so only setup + quantization remain
516
+ path = os.path.join(tempfile.gettempdir(), f"{name}_quantize.ipynb")
517
+ save_notebook(nb, path)
518
+ return path
519
+
520
+ q_btn.click(gen_quant_notebook, [q_model, q_username, q_types], q_file)
521
+
522
+ # =====================================================
523
+ # TAB 4: DEPLOY
524
+ # =====================================================
525
+ with gr.Tab("🚀 Deploy", id="deploy"):
526
+ gr.Markdown("""### Deploy your merged model to a HuggingFace Space
527
+
528
+ After merging and (optionally) quantizing, deploy a chat interface for your model.""")
529
+
530
+ d_model = gr.Textbox(label="Model Repo ID", placeholder="AIencoder/Qwen2.5CMR-7B")
531
+ d_type = gr.Dropdown(
532
+ choices=["Gradio Chat (transformers)", "Docker + llama.cpp (GGUF)"],
533
+ value="Gradio Chat (transformers)", label="Deployment Type",
534
+ )
535
+ d_btn = gr.Button("📋 Generate Deployment Files", variant="primary")
536
+ d_output = gr.Code(label="app.py", language="python", lines=20)
537
+ d_readme = gr.Code(label="README.md (Space metadata)", language="markdown", lines=8)
538
+
539
+ def gen_deploy(model_id, deploy_type):
540
+ mid = model_id.strip()
541
+ if not mid:
542
+ return "# Enter a model ID first", ""
543
+
544
+ if "Gradio" in deploy_type:
545
+ app = f'''import gradio as gr
546
+ from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer
547
+ import torch
548
+ from threading import Thread
549
+
550
+ MODEL_ID = "{mid}"
551
+
552
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
553
+ model = AutoModelForCausalLM.from_pretrained(
554
+ MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
555
+ )
556
+
557
+ def chat(message, history):
558
+ messages = []
559
+ for h in history:
560
+ messages.append({{"role": "user", "content": h[0]}})
561
+ if h[1]:
562
+ messages.append({{"role": "assistant", "content": h[1]}})
563
+ messages.append({{"role": "user", "content": message}})
564
+
565
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
566
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
567
+ streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
568
+
569
+ thread = Thread(target=model.generate, kwargs={{
570
+ **inputs, "max_new_tokens": 512, "streamer": streamer,
571
+ "do_sample": True, "temperature": 0.7,
572
+ }})
573
+ thread.start()
574
+
575
+ response = ""
576
+ for token in streamer:
577
+ response += token
578
+ yield response
579
+
580
+ demo = gr.ChatInterface(chat, title="{mid.split('/')[-1]}", description="Merged with ForgeKit")
581
+ demo.launch()'''
582
+ readme = f"""---
583
+ title: {mid.split('/')[-1]} Chat
584
+ emoji: 🔥
585
+ colorFrom: amber
586
+ colorTo: orange
587
+ sdk: gradio
588
+ sdk_version: 5.12.0
589
+ app_file: app.py
590
+ pinned: false
591
+ license: apache-2.0
592
+ ---"""
593
+ else:
594
+ app = f'''# Docker deployment with llama.cpp
595
+ # Dockerfile for serving GGUF models
596
+
597
+ FROM ghcr.io/ggerganov/llama.cpp:server
598
+
599
+ # Download the GGUF model (Docker ADD does not expand globs; replace the wildcard with the exact GGUF filename)
600
+ ADD https://huggingface.co/{mid}/resolve/main/*Q5_K_M*.gguf /models/model.gguf
601
+
602
+ EXPOSE 7860
603
+
604
+ CMD ["/llama-server", \\
605
+ "--model", "/models/model.gguf", \\
606
+ "--host", "0.0.0.0", \\
607
+ "--port", "7860", \\
608
+ "--ctx-size", "4096", \\
609
+ "--n-gpu-layers", "99"]'''
610
+ readme = f"""---
611
+ title: {mid.split('/')[-1]}
612
+ emoji: 🔥
613
+ colorFrom: amber
614
+ colorTo: orange
615
+ sdk: docker
616
+ pinned: false
617
+ license: apache-2.0
618
+ ---"""
619
+
620
+ return app, readme
621
+
622
+ d_btn.click(gen_deploy, [d_model, d_type], [d_output, d_readme])
623
+
624
+ # =====================================================
625
+ # TAB 5: LEADERBOARD
626
+ # =====================================================
627
+ with gr.Tab("🏆 Leaderboard", id="leaderboard"):
628
+ gr.Markdown("""### Community Merge Leaderboard
629
+ See what others have built with ForgeKit. Submit your own merge to get featured!""")
630
+
631
+ lb_md = gr.Markdown(value=get_leaderboard())
632
+ lb_refresh = gr.Button("🔄 Refresh", variant="secondary")
633
+ lb_refresh.click(lambda: get_leaderboard(), outputs=lb_md)
634
+
635
+ gr.Markdown("---")
636
+ gr.Markdown("### Submit Your Merge")
637
+ with gr.Row():
638
+ sub_name = gr.Textbox(label="Model Name", placeholder="My-Awesome-Merge-7B")
639
+ sub_author = gr.Textbox(label="Author", placeholder="Your HF username")
640
+ sub_method = gr.Textbox(label="Merge Method", placeholder="DARE-TIES")
641
+ with gr.Row():
642
+ sub_models = gr.Textbox(label="Source Models (short)", placeholder="Coder-7B + Math-7B")
643
+ sub_link = gr.Textbox(label="HF Model Link", placeholder="https://huggingface.co/...")
644
+ sub_btn = gr.Button("📤 Submit", variant="primary")
645
+ sub_status = gr.Markdown()
646
+
647
+ def submit_merge(name, author, method, models, link):
648
+ if not all([name, author, method, models, link]):
649
+ return "⚠️ Please fill in all fields"
650
+ LEADERBOARD.append({
651
+ "name": name, "author": author, "method": method,
652
+ "base": "", "models": models, "likes": 0, "link": link,
653
+ })
654
+ return f"✅ **{name}** submitted! It will appear on the leaderboard."
655
+
656
+ sub_btn.click(submit_merge, [sub_name, sub_author, sub_method, sub_models, sub_link], sub_status)
657
+
658
+ # ===== FOOTER =====
659
+ gr.Markdown("""
660
+ ---
661
+ <center>
662
+
663
+ **ForgeKit** v0.1.0 — Built by [AIencoder](https://huggingface.co/AIencoder) | [Portfolio](https://aiencoder-portfolio.static.hf.space) | [GitHub](https://github.com/Ary5272)
664
+
665
+ </center>
666
+ """)
667
+
668
+
669
+ if __name__ == "__main__":
670
+ demo.launch()  # theme and css are passed to gr.Blocks(), not launch()
compatibility.py ADDED
@@ -0,0 +1,321 @@
+"""Architecture compatibility checker for model merging."""
+
+from dataclasses import dataclass, field
+from typing import Optional
+
+from model_info import ModelInfo, fetch_model_info
+
+
+@dataclass
+class CompatibilityReport:
+    """Result of compatibility checking between models."""
+    compatible: bool = True
+    errors: list[str] = field(default_factory=list)
+    warnings: list[str] = field(default_factory=list)
+    suggestions: list[str] = field(default_factory=list)
+    models_info: list[ModelInfo] = field(default_factory=list)
+    suggested_base: str = ""
+    suggested_tokenizer: str = ""
+    architecture: str = ""
+    merge_methods_available: list[str] = field(default_factory=list)
+    estimated_ram_gb: float = 0.0
+    estimated_merge_time: str = ""
+
+    @property
+    def status_emoji(self) -> str:
+        if not self.compatible:
+            return "❌"
+        elif self.warnings:
+            return "⚠️"
+        return "✅"
+
+    @property
+    def status_text(self) -> str:
+        if not self.compatible:
+            return "Incompatible — cannot merge"
+        elif self.warnings:
+            return "Compatible with warnings"
+        return "Fully compatible"
+
+    def to_markdown(self) -> str:
+        """Generate a formatted markdown report."""
+        lines = []
+
+        # Header
+        lines.append(f"## {self.status_emoji} Compatibility Report")
+        lines.append("")
+
+        if self.architecture:
+            lines.append(f"**Architecture:** `{self.architecture}`")
+            lines.append("")
+
+        # Errors
+        if self.errors:
+            lines.append("### ❌ Errors")
+            for e in self.errors:
+                lines.append(f"- {e}")
+            lines.append("")
+
+        # Warnings
+        if self.warnings:
+            lines.append("### ⚠️ Warnings")
+            for w in self.warnings:
+                lines.append(f"- {w}")
+            lines.append("")
+
+        # Model details table
+        if self.models_info:
+            lines.append("### Model Details")
+            lines.append("| Model | Type | Hidden | Layers | Vocab | Params |")
+            lines.append("|-------|------|--------|--------|-------|--------|")
+            for m in self.models_info:
+                name = m.display_name
+                if len(name) > 35:
+                    name = name[:32] + "..."
+                lines.append(
+                    f"| {name} | `{m.model_type}` | {m.hidden_size} | "
+                    f"{m.num_hidden_layers} | {m.vocab_size} | {m.param_estimate} |"
+                )
+            lines.append("")
+
+        # Suggestions
+        if self.suggestions:
+            lines.append("### 💡 Suggestions")
+            for s in self.suggestions:
+                lines.append(f"- {s}")
+            lines.append("")
+
+        # Merge methods
+        if self.merge_methods_available:
+            methods = ", ".join(f"`{m}`" for m in self.merge_methods_available)
+            lines.append(f"**Available merge methods:** {methods}")
+            lines.append("")
+
+        # Resource estimates
+        if self.estimated_ram_gb > 0:
+            lines.append(f"**Estimated RAM:** {self.estimated_ram_gb} GB")
+            lines.append(f"**Estimated time:** {self.estimated_merge_time}")
+            colab_tier = "Standard" if self.estimated_ram_gb <= 12 else "High-RAM" if self.estimated_ram_gb <= 48 else "A100 (Colab Pro+)"
+            lines.append(f"**Recommended Colab:** {colab_tier}")
+            lines.append("")
+
+        if self.suggested_base:
+            lines.append(f"**Suggested base model:** `{self.suggested_base}`")
+        if self.suggested_tokenizer:
+            lines.append(f"**Suggested tokenizer:** `{self.suggested_tokenizer}`")
+
+        return "\n".join(lines)
+
+
+def check_compatibility(
+    model_ids: list[str],
+    token: Optional[str] = None,
+) -> CompatibilityReport:
+    """Check if a list of models are compatible for merging.
+
+    Args:
+        model_ids: List of HuggingFace model IDs
+        token: Optional HF API token for gated models
+
+    Returns:
+        CompatibilityReport with detailed analysis
+    """
+    report = CompatibilityReport()
+
+    # Validate input
+    if len(model_ids) < 2:
+        report.compatible = False
+        report.errors.append("At least 2 models are required for merging.")
+        return report
+
+    if len(model_ids) > 10:
+        report.warnings.append("Merging more than 10 models is unusual and may produce poor results.")
+
+    # Fetch all model info
+    for mid in model_ids:
+        mid = mid.strip()
+        if not mid:
+            continue
+        info = fetch_model_info(mid, token=token)
+        report.models_info.append(info)
+
+        if info.error:
+            if info.gated:
+                report.warnings.append(f"`{mid}`: Gated model — provide HF token to verify compatibility")
+            else:
+                report.compatible = False
+                report.errors.append(f"`{mid}`: {info.error}")
+
+    # If we couldn't fetch enough models, bail
+    valid_models = [m for m in report.models_info if not m.error]
+    if len(valid_models) < 2:
+        report.compatible = False
+        if not report.errors:
+            report.errors.append("Could not fetch enough model configs to verify compatibility.")
+        return report
+
+    # === ARCHITECTURE CHECKS ===
+
+    # 1. model_type must match
+    types = set(m.model_type for m in valid_models)
+    if len(types) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Architecture mismatch! Found: {', '.join(f'`{t}`' for t in types)}. "
+            f"All models must share the same architecture to merge."
+        )
+        return report
+
+    report.architecture = valid_models[0].model_type
+
+    # 2. hidden_size must match
+    hidden_sizes = set(m.hidden_size for m in valid_models if m.hidden_size > 0)
+    if len(hidden_sizes) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Hidden size mismatch: {', '.join(str(s) for s in hidden_sizes)}. "
+            f"Models must have the same hidden dimension."
+        )
+
+    # 3. intermediate_size must match (for most methods)
+    inter_sizes = set(m.intermediate_size for m in valid_models if m.intermediate_size > 0)
+    if len(inter_sizes) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Intermediate size mismatch: {', '.join(str(s) for s in inter_sizes)}. "
+            f"Required for DARE-TIES, SLERP, and Linear methods."
+        )
+
+    # 4. num_hidden_layers — warn if different
+    layer_counts = set(m.num_hidden_layers for m in valid_models if m.num_hidden_layers > 0)
+    if len(layer_counts) > 1:
+        report.warnings.append(
+            f"Layer count differs: {', '.join(str(c) for c in layer_counts)}. "
+            f"Passthrough/Frankenmerge can handle this, but DARE-TIES/SLERP/Linear require matching layers."
+        )
+
+    # 5. vocab_size — warn if different
+    vocab_sizes = set(m.vocab_size for m in valid_models if m.vocab_size > 0)
+    if len(vocab_sizes) > 1:
+        report.warnings.append(
+            f"Vocabulary size differs: {', '.join(str(v) for v in vocab_sizes)}. "
+            f"Use `tokenizer_source` to specify which tokenizer to keep."
+        )
+
+    # 6. num_attention_heads / num_key_value_heads
+    head_counts = set(m.num_attention_heads for m in valid_models if m.num_attention_heads > 0)
+    kv_head_counts = set(m.num_key_value_heads for m in valid_models if m.num_key_value_heads > 0)
+    if len(head_counts) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Attention head count mismatch: {', '.join(str(h) for h in head_counts)}."
+        )
+    if len(kv_head_counts) > 1:
+        report.warnings.append(
+            f"KV head count differs: {', '.join(str(h) for h in kv_head_counts)}. "
+            f"This may cause issues with GQA models."
+        )
+
+    # 7. trust_remote_code warning
+    needs_trust = [m.model_id for m in valid_models if m.trust_remote_code]
+    if needs_trust:
+        report.warnings.append(
+            f"Models requiring `trust_remote_code=True`: "
+            f"{', '.join(f'`{m}`' for m in needs_trust)}"
+        )
+
+    # === SUGGESTIONS ===
+
+    # Suggest base model (most downloaded or original base if detectable)
+    if valid_models:
+        # Prefer instruct versions (False sorts first), then most downloaded
+        base_candidates = sorted(
+            valid_models,
+            key=lambda m: (
+                not ("instruct" in m.model_id.lower() and "code" not in m.model_id.lower()),
+                -m.downloads,
+            ),
+        )
+        report.suggested_base = base_candidates[0].model_id
+        report.suggestions.append(f"Use `{report.suggested_base}` as the base model")
+
+    # Suggest tokenizer source (largest vocab)
+    if vocab_sizes and len(vocab_sizes) > 1:
+        largest_vocab_model = max(valid_models, key=lambda m: m.vocab_size)
+        report.suggested_tokenizer = largest_vocab_model.model_id
+        report.suggestions.append(
+            f"Use tokenizer from `{report.suggested_tokenizer}` (largest vocab: {largest_vocab_model.vocab_size})"
+        )
+    elif valid_models:
+        report.suggested_tokenizer = report.suggested_base
+
+    # === AVAILABLE MERGE METHODS ===
+    n = len(valid_models)
+    methods = []
+
+    if report.compatible:
+        # Linear always works if architectures match
+        methods.append("linear")
+
+        # DARE-TIES and TIES need matching layer counts
+        if len(layer_counts) <= 1:
+            methods.append("dare_ties")
+            methods.append("ties")
+
+        # SLERP only for 2 models
+        if n == 2 and len(layer_counts) <= 1:
+            methods.append("slerp")
+
+        # Task arithmetic needs a base
+        methods.append("task_arithmetic")
+
+        # Passthrough works even with different layer counts
+        methods.append("passthrough")
+
+    report.merge_methods_available = methods
+
+    # === RESOURCE ESTIMATES ===
+    max_size = max((m.size_bytes for m in valid_models if m.size_bytes > 0), default=0)
+    if max_size > 0:
+        # Merging needs roughly: all models loaded + output
+        total_model_bytes = sum(m.size_bytes for m in valid_models if m.size_bytes > 0)
+        # Rule of thumb: models + output, plus ~30% overhead
+        ram_needed = (total_model_bytes + max_size) * 1.3
+        report.estimated_ram_gb = round(ram_needed / (1024**3), 1)
+
+        # Time estimate based on total size
+        total_gb = total_model_bytes / (1024**3)
+        if total_gb < 10:
+            report.estimated_merge_time = "5-15 minutes"
+        elif total_gb < 30:
+            report.estimated_merge_time = "15-30 minutes"
+        elif total_gb < 60:
+            report.estimated_merge_time = "30-60 minutes"
+        else:
+            report.estimated_merge_time = "1-2+ hours"
+
+    return report
+
+
+def quick_check(model_ids: list[str], token: Optional[str] = None) -> str:
+    """Quick one-line compatibility check.
+
+    Returns a formatted string like:
+    "✅ Fully compatible | Architecture: qwen2 | 3 models | ~32GB RAM | Methods: linear, dare_ties, ties"
+    """
+    report = check_compatibility(model_ids, token=token)
+
+    if not report.compatible:
+        errors = "; ".join(report.errors[:2])
+        return f"❌ {errors}"
+
+    methods = ", ".join(report.merge_methods_available[:3])
+    parts = [
+        f"{report.status_emoji} {report.status_text}",
+        f"Architecture: {report.architecture}",
+        f"{len(report.models_info)} models",
+    ]
+    if report.estimated_ram_gb > 0:
+        parts.append(f"~{report.estimated_ram_gb}GB RAM")
+    parts.append(f"Methods: {methods}")
+
+    return " | ".join(parts)
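The core of `check_compatibility` is grouping models by an architecture signature. As a minimal standalone sketch (the config dicts below are made-up examples, not fetched from the Hub):

```python
# Hypothetical sketch of the signature check: models are mergeable only when
# every config shares one (model_type, hidden_size, intermediate_size) tuple.

def signatures_match(configs: list[dict]) -> bool:
    """Return True when all configs share a single architecture signature."""
    sigs = {
        (c.get("model_type"), c.get("hidden_size"), c.get("intermediate_size"))
        for c in configs
    }
    return len(sigs) == 1

a = {"model_type": "qwen2", "hidden_size": 3584, "intermediate_size": 18944}
b = {"model_type": "qwen2", "hidden_size": 3584, "intermediate_size": 18944}
c = {"model_type": "llama", "hidden_size": 4096, "intermediate_size": 14336}

print(signatures_match([a, b]))  # True
print(signatures_match([a, c]))  # False
```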
config_generator.py ADDED
@@ -0,0 +1,328 @@
+"""Merge configuration YAML generator with presets and validation."""
+
+from dataclasses import dataclass, field
+from typing import Optional
+
+import yaml
+
+
+# ===== MERGE METHOD DEFINITIONS =====
+
+MERGE_METHODS = {
+    "dare_ties": {
+        "name": "DARE-TIES",
+        "description": "Drop And REscale with TIES — trims low-magnitude parameters and resolves sign conflicts. Best for combining 2+ specialist models.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight", "density"],
+        "global_params": ["int8_mask", "normalize"],
+        "supports_slices": True,
+    },
+    "ties": {
+        "name": "TIES",
+        "description": "Trim, Elect Sign, Merge — resolves parameter interference between models. Similar to DARE-TIES but without the drop step.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight", "density"],
+        "global_params": ["int8_mask", "normalize"],
+        "supports_slices": True,
+    },
+    "slerp": {
+        "name": "SLERP",
+        "description": "Spherical Linear Interpolation — smoothly blends two models along a curved path in weight space. Best for two-model merges.",
+        "min_models": 2,
+        "max_models": 2,
+        "needs_base": False,
+        "params": [],
+        "global_params": ["t"],
+        "supports_slices": True,
+    },
+    "linear": {
+        "name": "Linear",
+        "description": "Simple weighted average of model parameters. Fast and predictable baseline.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": False,
+        "params": ["weight"],
+        "global_params": ["normalize"],
+        "supports_slices": True,
+    },
+    "task_arithmetic": {
+        "name": "Task Arithmetic",
+        "description": "Add or subtract task vectors from a base model. Use negative weights to remove capabilities.",
+        "min_models": 1,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight"],
+        "global_params": [],
+        "supports_slices": False,
+    },
+    "passthrough": {
+        "name": "Passthrough (Frankenmerge)",
+        "description": "Stack layers from different models. Can create larger models from smaller ones. Supports different layer counts.",
+        "min_models": 1,
+        "max_models": 10,
+        "needs_base": False,
+        "params": [],
+        "global_params": [],
+        "supports_slices": True,
+        "requires_slices": True,
+    },
+}
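The SLERP entry above describes spherical interpolation between two weight sets. A minimal illustration of the formula on plain Python lists (mergekit's real implementation works on tensors and also supports per-layer `t` schedules; this sketch does not):

```python
import math

def slerp(v0: list[float], v1: list[float], t: float) -> list[float]:
    """Spherically interpolate between two vectors at parameter t in [0, 1]."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    omega = math.acos(max(-1.0, min(1.0, dot / (n0 * n1))))  # angle between
    if omega < 1e-6:
        # vectors nearly parallel: fall back to a linear blend
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))  # midpoint on the arc, ≈ [0.7071, 0.7071]
```

Unlike a linear average, the result stays on the arc between the two vectors, which is why the method is preferred for two-model blends.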
+
+
+# ===== PRESETS =====
+
+@dataclass
+class MergePreset:
+    name: str
+    description: str
+    method: str
+    weight_strategy: str  # "equal", "first_dominant", "last_dominant", "auto_detect"
+
+    def apply(self, model_ids: list[str]) -> tuple[list[float], list[float]]:
+        """Generate weights and densities for given models."""
+        n = len(model_ids)
+        if n == 0:
+            return [], []
+
+        if self.weight_strategy == "equal":
+            weights = [round(1.0 / n, 3)] * n
+            densities = [0.6] * n
+
+        elif self.weight_strategy == "first_dominant":
+            weights = ([0.6] + [round(0.4 / (n - 1), 3)] * (n - 1)) if n > 1 else [1.0]
+            densities = [0.7] + [0.5] * (n - 1)
+
+        elif self.weight_strategy == "last_dominant":
+            weights = ([round(0.4 / (n - 1), 3)] * (n - 1) + [0.6]) if n > 1 else [1.0]
+            densities = [0.5] * (n - 1) + [0.7]
+
+        elif self.weight_strategy == "auto_detect":
+            weights, densities = _auto_detect_weights(model_ids)
+
+        else:
+            weights = [round(1.0 / n, 3)] * n
+            densities = [0.6] * n
+
+        return weights, densities
+
+
+def _auto_detect_weights(model_ids: list[str]) -> tuple[list[float], list[float]]:
+    """Auto-detect optimal weights based on model names/tags."""
+    weights = []
+    densities = []
+
+    for mid in model_ids:
+        name = mid.lower()
+        if "code" in name or "coder" in name:
+            weights.append(0.5)
+            densities.append(0.7)
+        elif "math" in name:
+            weights.append(0.4)
+            densities.append(0.6)
+        elif "instruct" in name and "code" not in name:
+            weights.append(0.3)
+            densities.append(0.5)
+        else:
+            weights.append(0.3)
+            densities.append(0.5)
+
+    # Normalize weights to sum to 1
+    total = sum(weights)
+    if total > 0:
+        weights = [round(w / total, 3) for w in weights]
+
+    return weights, densities
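The normalization step above can be shown with a worked example. The model IDs and raw heuristic weights here are illustrative only:

```python
# Raw heuristic weights (coder 0.5, math 0.4, chat 0.3) rescaled to sum to 1,
# matching the normalization at the end of _auto_detect_weights.
raw = {"x/coder-7b": 0.5, "x/math-7b": 0.4, "x/chat-7b": 0.3}
total = sum(raw.values())  # 1.2
normalized = {mid: round(w / total, 3) for mid, w in raw.items()}
print(normalized)  # {'x/coder-7b': 0.417, 'x/math-7b': 0.333, 'x/chat-7b': 0.25}
```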
+
+
+PRESETS = {
+    "equal": MergePreset("Equal", "Equal weights for all models", "dare_ties", "equal"),
+    "first_dominant": MergePreset("First Model Dominant", "Prioritize the first model", "dare_ties", "first_dominant"),
+    "last_dominant": MergePreset("Last Model Dominant", "Prioritize the last model", "dare_ties", "last_dominant"),
+    "coding_focus": MergePreset("Coding Focus", "Higher weight for code-related models", "dare_ties", "auto_detect"),
+    "balanced_slerp": MergePreset("Balanced SLERP", "50/50 interpolation between two models", "slerp", "equal"),
+}
+
+
+# ===== CONFIG GENERATION =====
+
+@dataclass
+class MergeConfig:
+    """Complete merge configuration."""
+    method: str = "dare_ties"
+    models: list[str] = field(default_factory=list)
+    base_model: str = ""
+    weights: list[float] = field(default_factory=list)
+    densities: list[float] = field(default_factory=list)
+    tokenizer_source: str = ""
+    dtype: str = "bfloat16"
+
+    # Method-specific params
+    slerp_t: float = 0.5
+    int8_mask: bool = True
+    normalize: bool = True
+
+    # Passthrough/slice params
+    slices: list[dict] = field(default_factory=list)
+
+    # Output
+    output_name: str = ""
+
+    def validate(self) -> list[str]:
+        """Validate the configuration. Returns a list of error messages."""
+        errors = []
+        method_info = MERGE_METHODS.get(self.method)
+
+        if not method_info:
+            errors.append(f"Unknown merge method: {self.method}")
+            return errors
+
+        n = len(self.models)
+        if n < method_info["min_models"]:
+            errors.append(f"{method_info['name']} requires at least {method_info['min_models']} models")
+        if n > method_info["max_models"]:
+            errors.append(f"{method_info['name']} supports at most {method_info['max_models']} models")
+
+        if method_info["needs_base"] and not self.base_model:
+            errors.append(f"{method_info['name']} requires a base_model")
+
+        if "weight" in method_info["params"]:
+            if self.weights and len(self.weights) != n:
+                errors.append(f"Expected {n} weights, got {len(self.weights)}")
+            if self.weights and any(w < -1 or w > 2 for w in self.weights):
+                errors.append("Weights should be between -1 and 2")
+
+        if "density" in method_info["params"]:
+            if self.densities and len(self.densities) != n:
+                errors.append(f"Expected {n} densities, got {len(self.densities)}")
+            if self.densities and any(d < 0 or d > 1 for d in self.densities):
+                errors.append("Densities must be between 0 and 1")
+
+        if self.method == "slerp" and (self.slerp_t < 0 or self.slerp_t > 1):
+            errors.append("SLERP t parameter must be between 0 and 1")
+
+        if method_info.get("requires_slices") and not self.slices:
+            errors.append(f"{method_info['name']} requires slice definitions")
+
+        return errors
+
+
+def generate_yaml(config: MergeConfig) -> str:
+    """Generate mergekit-compatible YAML configuration.
+
+    Args:
+        config: MergeConfig with all parameters
+
+    Returns:
+        YAML string ready for mergekit
+    """
+    errors = config.validate()
+    if errors:
+        return "# VALIDATION ERRORS:\n" + "\n".join(f"# - {e}" for e in errors)
+
+    method_info = MERGE_METHODS[config.method]
+    doc = {}
+
+    # Passthrough uses the slices format
+    if config.method == "passthrough":
+        doc["slices"] = config.slices or _default_slices(config)
+        doc["merge_method"] = config.method
+        doc["dtype"] = config.dtype
+        return yaml.dump(doc, default_flow_style=False, sort_keys=False)
+
+    # Standard methods
+    doc["merge_method"] = config.method
+
+    if method_info["needs_base"]:
+        doc["base_model"] = config.base_model
+
+    # Models with parameters
+    if config.method == "slerp":
+        doc["models"] = [{"model": m} for m in config.models]
+        doc["parameters"] = {"t": config.slerp_t}
+    else:
+        models_list = []
+        for i, model_id in enumerate(config.models):
+            entry = {"model": model_id}
+            params = {}
+            if "weight" in method_info["params"] and config.weights:
+                params["weight"] = config.weights[i]
+            if "density" in method_info["params"] and config.densities:
+                params["density"] = config.densities[i]
+            if params:
+                entry["parameters"] = params
+            models_list.append(entry)
+        doc["models"] = models_list
+
+    # Global parameters
+    global_params = {}
+    if "int8_mask" in method_info.get("global_params", []):
+        global_params["int8_mask"] = config.int8_mask
+    if "normalize" in method_info.get("global_params", []):
+        global_params["normalize"] = config.normalize
+
+    if global_params:
+        doc["parameters"] = global_params
+
+    doc["dtype"] = config.dtype
+
+    if config.tokenizer_source:
+        doc["tokenizer_source"] = config.tokenizer_source
+
+    return yaml.dump(doc, default_flow_style=False, sort_keys=False)
+
+
+def _default_slices(config: MergeConfig) -> list[dict]:
+    """Generate default slice config for passthrough merges."""
+    slices = []
+    for model_id in config.models:
+        slices.append({
+            # NOTE: [0, 32] assumes a 32-layer model; adjust layer_range per model
+            "sources": [{"model": model_id, "layer_range": [0, 32]}]
+        })
+    return slices
+
+
+def generate_from_preset(
+    preset_name: str,
+    model_ids: list[str],
+    base_model: str = "",
+    tokenizer_source: str = "",
+    dtype: str = "bfloat16",
+) -> str:
+    """Quick config generation from a preset name.
+
+    Args:
+        preset_name: Key from the PRESETS dict
+        model_ids: List of model IDs to merge
+        base_model: Base model for methods that need one
+        tokenizer_source: Which model's tokenizer to use
+        dtype: Data type for the merge
+
+    Returns:
+        YAML string
+    """
+    preset = PRESETS.get(preset_name)
+    if not preset:
+        return f"# Unknown preset: {preset_name}\n# Available: {', '.join(PRESETS.keys())}"
+
+    weights, densities = preset.apply(model_ids)
+
+    config = MergeConfig(
+        method=preset.method,
+        models=model_ids,
+        base_model=base_model or (model_ids[0] if model_ids else ""),
+        weights=weights,
+        densities=densities,
+        tokenizer_source=tokenizer_source or base_model or (model_ids[0] if model_ids else ""),
+        dtype=dtype,
+    )
+
+    return generate_yaml(config)
+
+
+def get_method_info(method: str) -> dict:
+    """Get human-readable info about a merge method."""
+    return MERGE_METHODS.get(method, {"name": "Unknown", "description": "Unknown method"})
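For reference, calling `generate_from_preset("equal", ...)` with two Qwen models should emit YAML along these lines (key order follows the generator's insertion order; the model IDs are just examples):

```yaml
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B-Instruct
models:
- model: Qwen/Qwen2.5-7B-Instruct
  parameters:
    weight: 0.5
    density: 0.6
- model: Qwen/Qwen2.5-Coder-7B-Instruct
  parameters:
    weight: 0.5
    density: 0.6
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-7B-Instruct
```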
model_info.py ADDED
@@ -0,0 +1,307 @@
+"""HuggingFace Hub API wrapper for model discovery and info retrieval."""
+
+import time
+from dataclasses import dataclass, field
+from typing import Any, Optional
+
+import requests
+
+HF_API = "https://huggingface.co/api"
+_session = requests.Session()
+_session.headers.update({"Accept": "application/json"})
+
+# Simple in-memory cache with TTL
+_cache: dict[str, tuple[float, Any]] = {}
+CACHE_TTL = 300  # 5 minutes
+
+
+def _cached_get(url: str, token: Optional[str] = None, ttl: int = CACHE_TTL) -> dict:
+    """GET with caching and rate-limit handling."""
+    now = time.time()
+    if url in _cache and (now - _cache[url][0]) < ttl:
+        return _cache[url][1]
+
+    headers = {}
+    if token:
+        headers["Authorization"] = f"Bearer {token}"
+
+    resp = _session.get(url, headers=headers, timeout=15)
+
+    if resp.status_code == 429:
+        retry = int(resp.headers.get("Retry-After", 5))
+        time.sleep(retry)
+        resp = _session.get(url, headers=headers, timeout=15)
+
+    resp.raise_for_status()
+    data = resp.json()
+    _cache[url] = (now, data)
+    return data
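The TTL-cache pattern in `_cached_get` can be sketched standalone, with the HTTP call swapped for an injectable fetch function so it runs offline (names below are hypothetical):

```python
import time

_cache: dict[str, tuple[float, dict]] = {}

def cached_fetch(url: str, fetch, ttl: float = 300.0) -> dict:
    """Return a cached response for url if it is younger than ttl seconds."""
    now = time.time()
    hit = _cache.get(url)
    if hit is not None and (now - hit[0]) < ttl:
        return hit[1]          # fresh entry: skip the round-trip
    data = fetch(url)
    _cache[url] = (now, data)  # store alongside its timestamp
    return data

calls = []
def fake_fetch(url):
    calls.append(url)
    return {"url": url}

cached_fetch("https://example.com/a", fake_fetch)
cached_fetch("https://example.com/a", fake_fetch)  # served from cache
print(len(calls))  # 1
```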
+
+
+@dataclass
+class ModelInfo:
+    """Parsed model information from HF Hub."""
+    model_id: str
+    model_type: str = "unknown"
+    architectures: list[str] = field(default_factory=list)
+    vocab_size: int = 0
+    hidden_size: int = 0
+    intermediate_size: int = 0
+    num_hidden_layers: int = 0
+    num_attention_heads: int = 0
+    num_key_value_heads: int = 0
+    max_position_embeddings: int = 0
+    torch_dtype: str = "unknown"
+    pipeline_tag: str = ""
+    tags: list[str] = field(default_factory=list)
+    downloads: int = 0
+    likes: int = 0
+    size_bytes: int = 0
+    gated: bool = False
+    private: bool = False
+    trust_remote_code: bool = False
+    error: Optional[str] = None
+
+    @property
+    def param_estimate(self) -> str:
+        """Rough parameter count estimate based on weight-file sizes."""
+        if self.size_bytes > 0:
+            # Rough: model files in bf16 ≈ 2 bytes per param
+            params = self.size_bytes / 2
+            if params > 1e9:
+                return f"{params/1e9:.1f}B"
+            elif params > 1e6:
+                return f"{params/1e6:.0f}M"
+        return "unknown"
+
+    @property
+    def arch_signature(self) -> str:
+        """Unique signature for architecture matching."""
+        return f"{self.model_type}|{self.hidden_size}|{self.intermediate_size}"
+
+    @property
+    def display_name(self) -> str:
+        """Short display name (without org prefix)."""
+        return self.model_id.split("/")[-1] if "/" in self.model_id else self.model_id
+
+    @property
+    def ram_estimate_gb(self) -> float:
+        """Estimated RAM needed for merging (roughly 2.5x model size for bf16 merge)."""
+        if self.size_bytes > 0:
+            return round(self.size_bytes * 2.5 / (1024**3), 1)
+        return 0.0
+
+    def to_dict(self) -> dict:
+        return {
+            "model_id": self.model_id,
+            "model_type": self.model_type,
+            "architectures": self.architectures,
+            "vocab_size": self.vocab_size,
+            "hidden_size": self.hidden_size,
+            "intermediate_size": self.intermediate_size,
+            "num_hidden_layers": self.num_hidden_layers,
+            "num_attention_heads": self.num_attention_heads,
+            "torch_dtype": self.torch_dtype,
+            "pipeline_tag": self.pipeline_tag,
+            "downloads": self.downloads,
+            "likes": self.likes,
+            "param_estimate": self.param_estimate,
+            "ram_estimate_gb": self.ram_estimate_gb,
+            "gated": self.gated,
+            "private": self.private,
+        }
+
+
+def fetch_model_info(model_id: str, token: Optional[str] = None) -> ModelInfo:
+    """Fetch comprehensive model information from HF Hub.
+
+    Args:
+        model_id: Full model ID (e.g., "Qwen/Qwen2.5-Coder-7B-Instruct")
+        token: Optional HF API token for gated/private models
+
+    Returns:
+        ModelInfo dataclass with all available information
+    """
+    info = ModelInfo(model_id=model_id)
+
+    # Fetch main model info
+    try:
+        data = _cached_get(f"{HF_API}/models/{model_id}", token=token)
+    except requests.exceptions.HTTPError as e:
+        if e.response.status_code == 401:
+            info.error = "Gated or private model — HF token required"
+            info.gated = True
+        elif e.response.status_code == 404:
+            info.error = f"Model not found: {model_id}"
+        else:
+            info.error = f"API error: {e.response.status_code}"
+        return info
+    except Exception as e:
+        info.error = f"Connection error: {str(e)}"
+        return info
+
+    # Parse basic metadata
+    info.pipeline_tag = data.get("pipeline_tag", "")
+    info.tags = data.get("tags", [])
+    info.downloads = data.get("downloads", 0)
+    info.likes = data.get("likes", 0)
+    info.gated = data.get("gated", False) not in (False, None)
+    info.private = data.get("private", False)
+
+    # Parse config (architecture details)
+    config = data.get("config", {})
+    if config:
+        info.model_type = config.get("model_type", "unknown")
+        info.architectures = config.get("architectures", [])
+
+    # Fetch full config.json for detailed architecture info
+    # (the API endpoint only returns basic config fields)
+    try:
+        full_config = _cached_get(
+            f"https://huggingface.co/{model_id}/resolve/main/config.json",
+            token=token,
+        )
+        info.model_type = full_config.get("model_type", info.model_type)
+        info.architectures = full_config.get("architectures", info.architectures)
+        info.vocab_size = full_config.get("vocab_size", 0)
+        info.hidden_size = full_config.get("hidden_size", 0)
+        info.intermediate_size = full_config.get("intermediate_size", 0)
+        info.num_hidden_layers = full_config.get("num_hidden_layers", 0)
+        info.num_attention_heads = full_config.get("num_attention_heads", 0)
+        info.num_key_value_heads = full_config.get("num_key_value_heads", 0)
+        info.max_position_embeddings = full_config.get("max_position_embeddings", 0)
+        info.torch_dtype = full_config.get("torch_dtype", "unknown")
+
+        if "auto_map" in full_config:
+            info.trust_remote_code = True
+    except Exception:
+        # Fall back to basic config from API
+        if config:
+            info.vocab_size = config.get("vocab_size", 0)
+            info.hidden_size = config.get("hidden_size", 0)
+        else:
+            info.error = "Could not fetch config.json — model may need trust_remote_code=True"
+            info.trust_remote_code = True
+
+    # Estimate total model size from siblings (files)
+    siblings = data.get("siblings", [])
+    total_size = 0
+    for f in siblings:
+        fname = f.get("rfilename", "")
+        size = f.get("size", 0) or 0
+        # Count only model weight files
+        if any(fname.endswith(ext) for ext in
+               [".safetensors", ".bin", ".pt", ".pth", ".gguf"]):
+            total_size += size
+    info.size_bytes = total_size
+
+    return info
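The size-estimation loop above can be isolated into a small sketch: sum only weight-file sizes from a repo's file listing. The sibling list below is fabricated for illustration:

```python
# Sum the sizes of model weight files, ignoring docs, configs, etc.
WEIGHT_EXTS = (".safetensors", ".bin", ".pt", ".pth", ".gguf")

def weight_bytes(siblings: list[dict]) -> int:
    total = 0
    for f in siblings:
        name = f.get("rfilename", "")
        if name.endswith(WEIGHT_EXTS):
            total += f.get("size", 0) or 0
    return total

files = [
    {"rfilename": "model-00001-of-00002.safetensors", "size": 9_000_000_000},
    {"rfilename": "model-00002-of-00002.safetensors", "size": 6_000_000_000},
    {"rfilename": "README.md", "size": 4_096},
]
print(weight_bytes(files))  # 15000000000
```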
201
+
202
+
203
+ def search_models(
204
+ query: str = "",
205
+ author: str = "",
206
+ architecture: str = "",
207
+    limit: int = 20,
+    sort: str = "downloads",
+    token: Optional[str] = None,
+) -> list[dict]:
+    """Search HuggingFace Hub for models.
+
+    Args:
+        query: Search query string
+        author: Filter by author/organization
+        architecture: Filter by model_type (e.g., "llama", "qwen2")
+        limit: Max results to return
+        sort: Sort by "downloads", "likes", "created", "modified"
+        token: Optional HF API token
+
+    Returns:
+        List of dicts with basic model info
+    """
+    params = {
+        "limit": min(limit, 100),
+        "sort": sort,
+        "direction": -1,
+        "config": True,
+    }
+    if query:
+        params["search"] = query
+    if author:
+        params["author"] = author
+
+    url = f"{HF_API}/models"
+    try:
+        # urlencode handles spaces and special characters in the query
+        from urllib.parse import urlencode
+        data = _cached_get(
+            f"{url}?{urlencode(params)}",
+            token=token,
+            ttl=60,  # shorter cache for search
+        )
+    except Exception as e:
+        return [{"error": str(e)}]
+
+    results = []
+    for m in data:
+        config = m.get("config", {}) or {}
+        model_type = config.get("model_type", "")
+
+        # Filter by architecture if specified
+        if architecture and model_type.lower() != architecture.lower():
+            continue
+
+        results.append({
+            "model_id": m.get("modelId", ""),
+            "model_type": model_type,
+            "pipeline_tag": m.get("pipeline_tag", ""),
+            "downloads": m.get("downloads", 0),
+            "likes": m.get("likes", 0),
+            "tags": m.get("tags", [])[:5],
+        })
+
+    return results[:limit]
+
+
+def get_popular_base_models(architecture: str = "", token: Optional[str] = None) -> list[dict]:
+    """Get popular base models for a given architecture type.
+
+    Useful for suggesting base_model in merge configs.
+    """
+    # Common base models by architecture
+    known_bases = {
+        "llama": [
+            "meta-llama/Llama-3.1-8B-Instruct",
+            "meta-llama/Llama-3.1-70B-Instruct",
+            "meta-llama/Llama-2-7b-hf",
+        ],
+        "mistral": [
+            "mistralai/Mistral-7B-Instruct-v0.3",
+            "mistralai/Mixtral-8x7B-Instruct-v0.1",
+        ],
+        "qwen2": [
+            "Qwen/Qwen2.5-7B-Instruct",
+            "Qwen/Qwen2.5-14B-Instruct",
+            "Qwen/Qwen2.5-3B-Instruct",
+            "Qwen/Qwen2.5-72B-Instruct",
+        ],
+        "gemma2": [
+            "google/gemma-2-9b-it",
+            "google/gemma-2-27b-it",
+        ],
+        "phi3": [
+            "microsoft/Phi-3-mini-4k-instruct",
+            "microsoft/Phi-3-medium-4k-instruct",
+        ],
+    }
+
+    if architecture.lower() in known_bases:
+        return [{"model_id": m} for m in known_bases[architecture.lower()]]
+
+    # Fallback: search for popular instruct models
+    return search_models(
+        query=f"{architecture} instruct",
+        limit=5,
+        sort="downloads",
+        token=token,
+    )
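Under the hood, `search_models` just assembles a query string against the Hub's `/api/models` endpoint. A minimal self-contained sketch of that URL construction (the endpoint path is assumed from this module's `HF_API` constant; `urlencode` keeps multi-word queries like the fallback `"{architecture} instruct"` well-formed):

```python
from urllib.parse import urlencode

HUB_API = "https://huggingface.co/api"  # assumed value of this module's HF_API

def build_search_url(query: str = "", author: str = "", limit: int = 20,
                     sort: str = "downloads") -> str:
    """Mirror search_models' parameter handling, returning the request URL."""
    params = {"limit": min(limit, 100), "sort": sort, "direction": -1, "config": True}
    if query:
        params["search"] = query
    if author:
        params["author"] = author
    return f"{HUB_API}/models?{urlencode(params)}"

# The space in the query is percent/plus-encoded, so the request stays valid
print(build_search_url(query="qwen2 instruct", limit=5))
```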
notebook_generator.py ADDED
@@ -0,0 +1,484 @@
+"""Google Colab notebook generator for model merging, quantization, and deployment."""
+
+import json
+from typing import Optional
+
+from config_generator import MergeConfig, generate_yaml, MERGE_METHODS
+
+
+def _cell(source: str, cell_type: str = "code") -> dict:
+    """Create a notebook cell."""
+    cell = {
+        "cell_type": cell_type,
+        "metadata": {},
+        # keepends=True preserves the trailing newlines nbformat expects
+        "source": source.splitlines(keepends=True),
+    }
+    # Only code cells carry outputs / execution_count in nbformat v4
+    if cell_type == "code":
+        cell["outputs"] = []
+        cell["execution_count"] = None
+    return cell
+
+
+def _md(text: str) -> dict:
+    return _cell(text, "markdown")
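For reference, the cell dicts these helpers emit follow the nbformat v4 schema: only code cells carry `outputs` and `execution_count`, and `source` is a list of lines with their newlines kept. A standalone sketch of that contract (`make_cell` is a hypothetical stand-in, not the module's `_cell`):

```python
def make_cell(source: str, cell_type: str = "code") -> dict:
    """Build a minimal nbformat-v4-style cell dict."""
    cell = {
        "cell_type": cell_type,
        "metadata": {},
        # keepends=True preserves the trailing "\n" nbformat expects per line
        "source": source.splitlines(keepends=True),
    }
    if cell_type == "code":  # markdown cells must not carry these keys
        cell["outputs"] = []
        cell["execution_count"] = None
    return cell

code_cell = make_cell("print('a')\nprint('b')")
md_cell = make_cell("# Title", "markdown")
```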
+
+
+def generate_merge_notebook(
+    config: MergeConfig,
+    output_model_name: str = "",
+    hf_username: str = "",
+    include_quantize: bool = True,
+    include_deploy: bool = True,
+    quant_types: Optional[list[str]] = None,
+) -> dict:
+    """Generate a complete Colab notebook for merging models.
+
+    Args:
+        config: MergeConfig with all merge parameters
+        output_model_name: Name for the merged model (e.g., "My-Merged-7B")
+        hf_username: HF username for upload (e.g., "AIencoder")
+        include_quantize: Include GGUF quantization cells
+        include_deploy: Include HF Space deployment cells
+        quant_types: List of quantization types (default: ["Q5_K_M", "Q4_K_M"])
+
+    Returns:
+        Complete notebook dict (nbformat v4)
+    """
+    if quant_types is None:
+        quant_types = ["Q5_K_M", "Q4_K_M"]
+
+    if not output_model_name:
+        output_model_name = "ForgeKit-Merged-Model"
+
+    yaml_config = generate_yaml(config)
+    method_info = MERGE_METHODS.get(config.method, {})
+
+    # Estimate RAM for Colab runtime recommendation
+    ram_note = ""
+    if config.models:
+        models_lower = [m.lower() for m in config.models]
+        # Rough heuristic based on parameter counts in model names;
+        # check the largest sizes first so the strongest warning wins
+        if any("70b" in m for m in models_lower):
+            ram_note = "⚠️ 70B models need **A100 GPU** (Colab Pro+). This won't work on free tier."
+        elif any("14b" in m or "13b" in m for m in models_lower):
+            ram_note = "⚠️ 13-14B models need **High-RAM runtime** (48GB). Go to Runtime β†’ Change runtime β†’ High-RAM."
+        elif any("7b" in m or "8b" in m for m in models_lower):
+            ram_note = "πŸ’‘ 7-8B models work on **High-RAM CPU** runtime (free tier). No GPU needed."
+
+    cells = []
+
+    # ===== HEADER =====
+    cells.append(_md(f"""# πŸ”₯ ForgeKit β€” Model Merge Notebook
+
+**Generated by [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)**
+
+This notebook will:
+1. βœ… Install mergekit and dependencies
+2. βœ… Merge your selected models using **{method_info.get('name', config.method)}**
+3. {'βœ…' if include_quantize else '⬜'} Quantize to GGUF format
+4. {'βœ…' if include_deploy else '⬜'} Upload to HuggingFace Hub
+
+**Models being merged:**
+{chr(10).join(f'- `{m}`' for m in config.models)}
+
+**Method:** {method_info.get('name', config.method)} β€” {method_info.get('description', '')}
+
+{ram_note}
+
+---
+⚑ **Quick Start:** Click **Runtime β†’ Run all** to execute everything."""))
+
+    # ===== CELL 1: INSTALL =====
+    cells.append(_md("## 1️⃣ Install Dependencies"))
+    cells.append(_cell("""# Install mergekit and dependencies
+!pip install -q mergekit[all] huggingface_hub transformers accelerate
+!pip install -q pyyaml sentencepiece protobuf
+
+print("βœ… All dependencies installed!")"""))
+
+    # ===== CELL 2: HF LOGIN =====
+    cells.append(_md("## 2️⃣ HuggingFace Login\nRequired for downloading gated models and uploading your merge."))
+    cells.append(_cell("""from huggingface_hub import notebook_login
+notebook_login()"""))
+
+    # ===== CELL 3: CONFIG =====
+    cells.append(_md("""## 3️⃣ Merge Configuration
+
+Your merge config (auto-generated by ForgeKit). Edit the YAML below if you want to tweak weights or parameters."""))
+
+    # Escape double quotes so the YAML cannot terminate the triple-quoted string
+    escaped_yaml = yaml_config.replace('"', '\\"')
+    cells.append(_cell(f"""# === CONFIGURATION ===
+MODEL_NAME = "{output_model_name}"
+USERNAME = "{hf_username}"  # Change to your HF username
+
+YAML_CONFIG = \"\"\"
+{escaped_yaml}\"\"\"
+
+# Display the config
+print("πŸ“‹ Merge Configuration:")
+print("=" * 50)
+print(YAML_CONFIG)
+print("=" * 50)
+print(f"\\nπŸ“¦ Output: {{USERNAME}}/{{MODEL_NAME}}" if USERNAME else f"\\nπŸ“¦ Output: {{MODEL_NAME}}")"""))
+
+    # ===== CELL 4: MERGE =====
+    cells.append(_md("""## 4️⃣ Execute Merge
+
+This is the main merge step. Time depends on model sizes:
+
+| Size | Estimated Time |
+|------|---------------|
+| 1-3B | 5-15 min |
+| 7B | 15-30 min |
+| 14B | 30-60 min |"""))
+
+    cells.append(_cell("""import yaml
+import os
+import time
+
+# Write config to file
+with open("merge_config.yaml", "w") as f:
+    f.write(YAML_CONFIG)
+
+# Create output directory
+os.makedirs("merged_model", exist_ok=True)
+
+# Parse once instead of re-loading for every print
+cfg = yaml.safe_load(YAML_CONFIG)
+print("πŸ”₯ Starting merge...")
+print(f"   Method: {cfg.get('merge_method', 'unknown')}")
+print(f"   Models: {len(cfg.get('models', []))}")
+print()
+
+start = time.time()
+
+# Run mergekit
+!mergekit-yaml merge_config.yaml merged_model --copy-tokenizer --allow-crimes --lazy-unpickle
+
+elapsed = time.time() - start
+print(f"\\nβœ… Merge complete in {elapsed/60:.1f} minutes!")
+print("πŸ“ Output: ./merged_model/")
+
+# Show output size
+total = sum(
+    os.path.getsize(os.path.join("merged_model", f))
+    for f in os.listdir("merged_model")
+    if os.path.isfile(os.path.join("merged_model", f))
+)
+print(f"πŸ’Ύ Total size: {total / (1024**3):.2f} GB")"""))
+
+    # ===== CELL 5: TEST =====
+    cells.append(_md("## 5️⃣ Quick Test\nVerify the merged model loads and generates text."))
+    cells.append(_cell("""from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+print("πŸ§ͺ Loading merged model for testing...")
+
+tokenizer = AutoTokenizer.from_pretrained("merged_model", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    "merged_model",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+
+# Test prompts
+test_prompts = [
+    "Write a Python function to calculate fibonacci numbers:",
+    "Explain what machine learning is in simple terms:",
+    "What is 15 * 23 + 7?",
+]
+
+print("\\n" + "=" * 60)
+for prompt in test_prompts:
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    with torch.no_grad():
+        output = model.generate(
+            **inputs,
+            max_new_tokens=100,
+            do_sample=False,  # greedy decoding; temperature would be ignored
+        )
+    response = tokenizer.decode(output[0], skip_special_tokens=True)
+    print(f"\\nπŸ“ Prompt: {prompt}")
+    print(f"πŸ€– Response: {response[len(prompt):].strip()[:200]}...")
+    print("-" * 60)
+
+print("\\nβœ… Model test complete!")
+
+# Clean up GPU memory
+del model
+if torch.cuda.is_available():
+    torch.cuda.empty_cache()"""))
+
+    # ===== CELL 6: UPLOAD =====
+    cells.append(_md("## 6️⃣ Upload to HuggingFace Hub"))
+
+    model_card = _generate_model_card(config, output_model_name, hf_username)
+    # Escape triple quotes so the card cannot terminate the string below
+    escaped_card = model_card.replace('"""', '\\"\\"\\"')
+
+    cells.append(_cell(f"""from huggingface_hub import HfApi, create_repo
+
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+# Create repo
+try:
+    create_repo(REPO_ID, exist_ok=True, repo_type="model")
+    print(f"πŸ“¦ Repo ready: https://huggingface.co/{{REPO_ID}}")
+except Exception as e:
+    print(f"⚠️ Repo creation: {{e}}")
+
+# Write model card
+MODEL_CARD = \"\"\"{escaped_card}\"\"\"
+
+with open("merged_model/README.md", "w") as f:
+    f.write(MODEL_CARD)
+
+# Upload
+api = HfApi()
+print("⬆️ Uploading merged model (this may take a while)...")
+api.upload_folder(
+    repo_id=REPO_ID,
+    folder_path="merged_model",
+    commit_message=f"Upload {{MODEL_NAME}} merged with ForgeKit",
+)
+print(f"\\nβœ… Model uploaded!")
+print(f"πŸ”— https://huggingface.co/{{REPO_ID}}")"""))
+
+    # ===== CELL 7: QUANTIZE (optional) =====
+    if include_quantize:
+        cells.append(_md(f"""## 7️⃣ Quantize to GGUF
+
+Convert to GGUF format for use with llama.cpp, Ollama, LM Studio, etc.
+
+**Quantization types:** {', '.join(quant_types)}"""))
+
+        # Each generated line carries its own 4-space indent so the block
+        # nests correctly under the `if` in the cell template below
+        quant_cmds = "\n".join(
+            f'    !./llama.cpp/llama-quantize model-f16.gguf {output_model_name}-{q}.gguf {q}\n'
+            f'    print("βœ… {q} done: {output_model_name}-{q}.gguf")'
+            for q in quant_types
+        )
+
+        cells.append(_cell(f"""import os
+
+print("πŸ“¦ Setting up llama.cpp for GGUF conversion...")
+
+# Clone and build llama.cpp
+if not os.path.exists("llama.cpp"):
+    !git clone --depth 1 https://github.com/ggerganov/llama.cpp
+    !cd llama.cpp && make -j$(nproc) llama-quantize
+
+# Install conversion deps
+!pip install -q gguf
+
+# Convert to f16 GGUF first
+print("\\nπŸ”„ Converting to GGUF (f16)...")
+!python llama.cpp/convert_hf_to_gguf.py merged_model --outfile model-f16.gguf --outtype f16
+
+# Quantize to each target
+print("\\nπŸ—œοΈ Quantizing...")
+if os.path.exists("model-f16.gguf"):
+{quant_cmds}
+
+    # Show file sizes
+    print("\\nπŸ“Š Output sizes:")
+    for f in os.listdir("."):
+        if f.endswith(".gguf"):
+            size_gb = os.path.getsize(f) / (1024**3)
+            print(f"    {{f}}: {{size_gb:.2f}} GB")
+else:
+    print("❌ f16 conversion failed. Check errors above.")"""))
+
+        # Upload GGUFs
+        cells.append(_cell(f"""# Upload GGUF files to the same repo
+import os
+from huggingface_hub import HfApi
+
+api = HfApi()
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+gguf_files = [f for f in os.listdir(".") if f.endswith(".gguf") and f != "model-f16.gguf"]
+
+for gf in gguf_files:
+    print(f"⬆️ Uploading {{gf}}...")
+    api.upload_file(
+        path_or_fileobj=gf,
+        path_in_repo=gf,
+        repo_id=REPO_ID,
+    )
+    print("   βœ… Done")
+
+print(f"\\nπŸŽ‰ All GGUF files uploaded to https://huggingface.co/{{REPO_ID}}")"""))
+
+    # ===== CELL 8: DEPLOY (optional) =====
+    if include_deploy:
+        cells.append(_md("""## 8️⃣ Deploy to HuggingFace Space
+
+Create a Gradio chat Space running your merged model."""))
+
+        cells.append(_cell(f"""from huggingface_hub import HfApi, create_repo
+
+SPACE_ID = f"{{USERNAME}}/{{MODEL_NAME}}-chat" if USERNAME else f"{{MODEL_NAME}}-chat"
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+# Create Space
+try:
+    create_repo(SPACE_ID, repo_type="space", space_sdk="gradio", exist_ok=True)
+    print(f"πŸš€ Space created: https://huggingface.co/spaces/{{SPACE_ID}}")
+except Exception as e:
+    print(f"⚠️ {{e}}")
+
+# Generate app.py
+APP_CODE = '''import gradio as gr
+from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer
+import torch
+from threading import Thread
+
+MODEL_ID = "{(hf_username + '/' + output_model_name) if hf_username else output_model_name}"
+
+tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
+)
+
+def chat(message, history):
+    messages = []
+    for h in history:
+        messages.append({{"role": "user", "content": h[0]}})
+        if h[1]:
+            messages.append({{"role": "assistant", "content": h[1]}})
+    messages.append({{"role": "user", "content": message}})
+
+    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    inputs = tokenizer(text, return_tensors="pt").to(model.device)
+    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+
+    thread = Thread(target=model.generate, kwargs={{
+        **inputs, "max_new_tokens": 512, "streamer": streamer, "do_sample": True, "temperature": 0.7
+    }})
+    thread.start()
+
+    response = ""
+    for token in streamer:
+        response += token
+        yield response
+
+demo = gr.ChatInterface(chat, title="πŸ”₯ {output_model_name}", description="Merged with ForgeKit")
+demo.launch()
+'''
+
+api = HfApi()
+
+# Upload app.py
+api.upload_file(
+    path_or_fileobj=APP_CODE.encode(),
+    path_in_repo="app.py",
+    repo_id=SPACE_ID,
+    repo_type="space",
+)
+
+# Upload requirements.txt
+reqs = "transformers\\ntorch\\naccelerate\\nsentencepiece\\nprotobuf"
+api.upload_file(
+    path_or_fileobj=reqs.encode(),
+    path_in_repo="requirements.txt",
+    repo_id=SPACE_ID,
+    repo_type="space",
+)
+
+print(f"\\nπŸŽ‰ Space deployed!")
+print(f"πŸ”— https://huggingface.co/spaces/{{SPACE_ID}}")
+print(f"\\n⏳ It may take a few minutes to build and start.")"""))
+
+    # ===== DONE =====
+    cells.append(_md(f"""## πŸŽ‰ All Done!
+
+Your merged model **{output_model_name}** is ready. Here's what was created:
+
+| Output | Link |
+|--------|------|
+| Model | `https://huggingface.co/{hf_username or 'YOUR_USERNAME'}/{output_model_name}` |
+{'| GGUF Files | Same repo (quantized versions) |' if include_quantize else ''}
+{'| Chat Space | `https://huggingface.co/spaces/' + (hf_username or 'YOUR_USERNAME') + '/' + output_model_name + '-chat` |' if include_deploy else ''}
+
+---
+
+**Made with [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)** β€” Forge your perfect AI model πŸ”₯"""))
+
+    # ===== BUILD NOTEBOOK =====
+    notebook = {
+        "nbformat": 4,
+        "nbformat_minor": 5,
+        "metadata": {
+            "kernelspec": {
+                "display_name": "Python 3",
+                "language": "python",
+                "name": "python3",
+            },
+            "language_info": {"name": "python", "version": "3.10.0"},
+            "colab": {
+                "provenance": [],
+                "gpuType": "T4",
+            },
+            "accelerator": "GPU",
+        },
+        "cells": cells,
+    }
+
+    return notebook
+
+
+def _generate_model_card(config: MergeConfig, name: str, username: str) -> str:
+    """Generate a model card README.md for the merged model."""
+    method_info = MERGE_METHODS.get(config.method, {})
+    models_list = "\n".join(f"- [{m}](https://huggingface.co/{m})" for m in config.models)
+    base_link = f"[{config.base_model}](https://huggingface.co/{config.base_model})" if config.base_model else "N/A"
+    # Parenthesized so an explicit base_model wins even when models is empty
+    card_base = config.base_model or (config.models[0] if config.models else "")
+    repo = f"{username}/{name}" if username else name
+
+    return f"""---
+tags:
+- merge
+- mergekit
+- forgekit
+base_model: {card_base}
+license: apache-2.0
+---
+
+# {name}
+
+This model was created using **[ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)** β€” an open-source model merging platform.
+
+## Merge Details
+
+| Parameter | Value |
+|-----------|-------|
+| **Method** | {method_info.get('name', config.method)} |
+| **Base Model** | {base_link} |
+| **dtype** | {config.dtype} |
+
+### Source Models
+
+{models_list}
+
+### Configuration
+
+```yaml
+{generate_yaml(config)}
+```
+
+## Usage
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("{repo}")
+model = AutoModelForCausalLM.from_pretrained("{repo}")
+```
+
+---
+
+*Made with [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)* πŸ”₯
+"""
+
+
+def notebook_to_json(notebook: dict) -> str:
+    """Serialize notebook to JSON string."""
+    return json.dumps(notebook, indent=2, ensure_ascii=False)
+
+
+def save_notebook(notebook: dict, path: str):
+    """Save notebook to .ipynb file."""
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(notebook, f, indent=2, ensure_ascii=False)
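The two serializers above amount to a plain JSON round trip, so a generated notebook can be sanity-checked without Jupyter installed. A small self-contained sketch (`to_json` / `save` are stand-ins mirroring `notebook_to_json` / `save_notebook`; `ensure_ascii=False` keeps the emoji headings readable in the raw `.ipynb`):

```python
import json
import tempfile
from pathlib import Path

def to_json(nb: dict) -> str:
    # Same serialization settings as notebook_to_json
    return json.dumps(nb, indent=2, ensure_ascii=False)

def save(nb: dict, path: str) -> None:
    Path(path).write_text(to_json(nb), encoding="utf-8")

nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"language_info": {"name": "python"}},
    "cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# πŸ”₯ ForgeKit"]}],
}
with tempfile.TemporaryDirectory() as d:
    path = f"{d}/merge.ipynb"
    save(nb, path)
    loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    assert loaded == nb  # lossless round trip
```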
requirements.txt ADDED
@@ -0,0 +1,5 @@
+gradio>=4.0.0
+huggingface_hub>=0.20.0
+requests>=2.28.0
+pyyaml>=6.0
+nbformat>=5.7.0