AIencoder committed on
Commit 0249933 · verified
1 Parent(s): 589e50e

Upload folder using huggingface_hub

Files changed (7)
  1. README.md +74 -7
  2. app.py +670 -0
  3. compatibility.py +321 -0
  4. config_generator.py +328 -0
  5. model_info.py +307 -0
  6. notebook_generator.py +484 -0
  7. requirements.txt +5 -0
README.md CHANGED
@@ -1,12 +1,79 @@
1
  ---
2
  title: Forgekit
3
- emoji: 🐠
4
- colorFrom: gray
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 6.5.1
8
  app_file: app.py
9
- pinned: false
 
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
  title: Forgekit
3
  app_file: app.py
4
+ sdk: gradio
5
+ sdk_version: 5.42.0
6
  ---
7
 
8
+ # 🔥 ForgeKit
9
+
10
+ **Forge your perfect AI model — no code required.**
11
+
12
+ ForgeKit is an open-source platform that lets anyone create custom AI models by merging existing ones. No coding, no complex setup — just pick your models, configure the merge, and get a ready-to-run Colab notebook.
13
+
14
+ ## ✨ Features
15
+
16
+ ### ⚒️ Merge Builder
17
+ - Add models by ID and instantly check architecture compatibility
18
+ - Choose from 6 merge methods: DARE-TIES, TIES, SLERP, Linear, Task Arithmetic, Passthrough
19
+ - Adjust weights and densities with smart presets
20
+ - Auto-suggest base model and tokenizer
21
+ - Generate ready-to-run Colab notebooks with one click
22
+
23
+ ### 🔍 Model Explorer
24
+ - Search HuggingFace Hub for models
25
+ - Filter by architecture type
26
+ - View detailed model specs (hidden size, layers, vocab, etc.)
27
+
28
+ ### 📦 GGUF Quantizer
29
+ - Convert any HF model to GGUF format
30
+ - Multiple quantization levels (Q8_0, Q5_K_M, Q4_K_M, etc.)
31
+ - Ready-to-run Colab notebook generation
32
+
33
+ ### 🚀 Deploy
34
+ - Generate deployment files for HuggingFace Spaces
35
+ - Gradio chat interface or Docker + llama.cpp options
36
+ - Auto-generated app.py and README
37
+
38
+ ### 🏆 Community Leaderboard
39
+ - Browse community-created merges
40
+ - Submit your own merged models
41
+ - Discover popular merge recipes
42
+
43
+ ## 🛠️ Supported Merge Methods
44
+
45
+ | Method | Models | Best For |
46
+ |--------|--------|----------|
47
+ | **DARE-TIES** | 2-10 | Combining specialists (coding + math) |
48
+ | **TIES** | 2-10 | Resolving parameter interference |
49
+ | **SLERP** | 2 | Smooth two-model interpolation |
50
+ | **Linear** | 2-10 | Simple weighted averaging |
51
+ | **Task Arithmetic** | 1-10 | Adding/removing capabilities |
52
+ | **Passthrough** | 1-10 | Layer stacking (Frankenmerge) |
53
+
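The simplest entry in the table above, Linear, is plain weighted averaging of parameter tensors. A minimal sketch of the idea on flat parameter lists (a hypothetical illustration only; the actual merging is performed by mergekit inside the generated notebook):

```python
def linear_merge(params_per_model, weights):
    """Weighted average of flat parameter lists, one list per model."""
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so the weights sum to 1
    n = len(params_per_model[0])
    return [sum(w * p[i] for w, p in zip(norm, params_per_model)) for i in range(n)]

# Two models with equal weight: the result is the element-wise mean.
merged = linear_merge([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])  # → [2.0, 3.0]
```

The other methods refine this same averaging step: TIES resolves sign conflicts between parameter deltas, and DARE additionally drops a random fraction of each delta (the "density") before rescaling.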
54
+ ## 🚀 How It Works
55
+
56
+ 1. **Add Models** — Enter HuggingFace model IDs
57
+ 2. **Check Compatibility** — ForgeKit verifies architectures match
58
+ 3. **Configure** — Choose method, adjust weights, pick presets
59
+ 4. **Generate** — Get a Colab notebook with everything pre-filled
60
+ 5. **Run** — Open in Colab, click Run All, wait for your model
61
+ 6. **Ship** — Auto-upload to HF Hub + optional GGUF + Space deployment
62
+
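Under the hood, the "Generate" step writes a mergekit YAML config into the notebook. A representative sketch for a two-model DARE-TIES merge (model IDs and values are illustrative, not necessarily what ForgeKit emits verbatim):

```yaml
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B-Instruct
models:
  - model: Qwen/Qwen2.5-Coder-7B-Instruct
    parameters:
      weight: 0.5
      density: 0.7
  - model: Qwen/Qwen2.5-Math-7B-Instruct
    parameters:
      weight: 0.5
      density: 0.6
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
```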
63
+ ## 📋 Requirements
64
+
65
+ The generated Colab notebooks handle all dependencies. You just need:
66
+ - A Google account (for Colab)
67
+ - A HuggingFace account (for model access and upload)
68
+ - An HF token (for gated models and uploading)
69
+
70
+ ## 🧑‍💻 Built By
71
+
72
+ **[AIencoder](https://huggingface.co/AIencoder)** — AI/ML Engineer
73
+
74
+ - [Portfolio](https://aiencoder-portfolio.static.hf.space)
75
+ - [GitHub](https://github.com/Ary5272)
76
+
77
+ ## 📄 License
78
+
79
+ MIT — use it, fork it, improve it.
app.py ADDED
@@ -0,0 +1,670 @@
1
+ """ForgeKit — Forge your perfect AI model, no code required.
2
+
3
+ Main Gradio application with 5 tabs:
4
+ 1. Merge Builder — Visual merge configuration + notebook generation
5
+ 2. Model Explorer — Search and discover HF models
6
+ 3. GGUF Quantizer — Generate quantization notebooks
7
+ 4. Deploy — Generate deployment files for HF Spaces
8
+ 5. Leaderboard — Community merge rankings
9
+ """
10
+
11
+ import gradio as gr
12
+ import json
13
+ import tempfile
14
+ import os
15
+
16
+ from forgekit.model_info import fetch_model_info, search_models
17
+ from forgekit.compatibility import check_compatibility, quick_check
18
+ from forgekit.config_generator import (
19
+ MergeConfig, generate_yaml, generate_from_preset,
20
+ MERGE_METHODS, PRESETS,
21
+ )
22
+ from forgekit.notebook_generator import generate_merge_notebook, save_notebook
23
+
24
+ # ===== THEME =====
25
+ theme = gr.themes.Base(
26
+ primary_hue=gr.themes.colors.amber,
27
+ secondary_hue=gr.themes.colors.purple,
28
+ neutral_hue=gr.themes.colors.gray,
29
+ font=gr.themes.GoogleFont("Inter"),
30
+ font_mono=gr.themes.GoogleFont("JetBrains Mono"),
31
+ ).set(
32
+ body_background_fill="#0a0a0f",
33
+ body_background_fill_dark="#0a0a0f",
34
+ body_text_color="#e5e5e5",
35
+ body_text_color_dark="#e5e5e5",
36
+ block_background_fill="#111118",
37
+ block_background_fill_dark="#111118",
38
+ block_border_color="#1f1f2e",
39
+ block_border_color_dark="#1f1f2e",
40
+ block_label_text_color="#9ca3af",
41
+ block_label_text_color_dark="#9ca3af",
42
+ block_title_text_color="#e5e5e5",
43
+ block_title_text_color_dark="#e5e5e5",
44
+ input_background_fill="#16161f",
45
+ input_background_fill_dark="#16161f",
46
+ input_border_color="#2a2a3a",
47
+ input_border_color_dark="#2a2a3a",
48
+ button_primary_background_fill="linear-gradient(to right, #f59e0b, #f97316)",
49
+ button_primary_background_fill_dark="linear-gradient(to right, #f59e0b, #f97316)",
50
+ button_primary_text_color="#ffffff",
51
+ button_primary_text_color_dark="#ffffff",
52
+ button_secondary_background_fill="#1f1f2e",
53
+ button_secondary_background_fill_dark="#1f1f2e",
54
+ button_secondary_text_color="#e5e5e5",
55
+ button_secondary_text_color_dark="#e5e5e5",
56
+ )
57
+
58
+ CSS = """
59
+ .forgekit-header { text-align: center; padding: 1.5rem 0 1rem; }
60
+ .forgekit-header h1 { font-size: 2.5rem; font-weight: 800; margin: 0;
61
+ background: linear-gradient(135deg, #a855f7, #ec4899, #f59e0b);
62
+ -webkit-background-clip: text; -webkit-text-fill-color: transparent; }
63
+ .forgekit-header p { color: #9ca3af; font-size: 1rem; margin-top: 0.25rem; }
64
+ .status-ok { color: #4ade80; font-weight: 600; }
65
+ .status-warn { color: #fbbf24; font-weight: 600; }
66
+ .status-err { color: #f87171; font-weight: 600; }
67
+ .method-card { border: 1px solid #2a2a3a; border-radius: 12px; padding: 1rem; margin: 0.25rem 0; }
68
+ footer { display: none !important; }
69
+ """
70
+
71
+ # ===== CALLBACKS =====
72
+
73
+ def check_models(models_text: str, token: str) -> tuple[str, str]:
74
+ """Check model compatibility and return report + quick status."""
75
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
76
+ if len(models) < 2:
77
+ return "⚠️ Add at least 2 models (one per line)", ""
78
+
79
+ tok = token.strip() if token else None
80
+ report = check_compatibility(models, token=tok)
81
+ quick = quick_check(models, token=tok)
82
+ return report.to_markdown(), quick
83
+
84
+
85
+ def generate_config(
86
+ models_text: str, method: str, base_model: str,
87
+ weights_text: str, densities_text: str,
88
+ tokenizer_src: str, dtype: str,
89
+ slerp_t: float, int8_mask: bool, normalize: bool,
90
+ ) -> str:
91
+ """Generate YAML config from UI inputs."""
92
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
93
+ if not models:
94
+ return "# Add models first"
95
+
96
+ # Parse weights
97
+ weights = []
98
+ if weights_text.strip():
99
+ try:
100
+ weights = [float(w.strip()) for w in weights_text.split(",")]
101
+ except ValueError:
102
+ return "# Invalid weights — use comma-separated numbers"
103
+
104
+ densities = []
105
+ if densities_text.strip():
106
+ try:
107
+ densities = [float(d.strip()) for d in densities_text.split(",")]
108
+ except ValueError:
109
+ return "# Invalid densities — use comma-separated numbers"
110
+
111
+ config = MergeConfig(
112
+ method=method,
113
+ models=models,
114
+ base_model=base_model.strip(),
115
+ weights=weights,
116
+ densities=densities,
117
+ tokenizer_source=tokenizer_src.strip(),
118
+ dtype=dtype,
119
+ slerp_t=slerp_t,
120
+ int8_mask=int8_mask,
121
+ normalize=normalize,
122
+ )
123
+
124
+ return generate_yaml(config)
125
+
126
+
127
+ def apply_preset(preset_name: str, models_text: str) -> tuple[str, str]:
128
+ """Apply a preset and return weights + densities strings."""
129
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
130
+ if not models:
131
+ return "", ""
132
+
133
+ preset = PRESETS.get(preset_name)
134
+ if not preset:
135
+ return "", ""
136
+
137
+ weights, densities = preset.apply(models)
138
+ return ", ".join(str(w) for w in weights), ", ".join(str(d) for d in densities)
139
+
140
+
141
+ def generate_notebook_file(
142
+ models_text: str, method: str, base_model: str,
143
+ weights_text: str, densities_text: str,
144
+ tokenizer_src: str, dtype: str,
145
+ slerp_t: float, int8_mask: bool, normalize: bool,
146
+ output_name: str, hf_user: str,
147
+ inc_quantize: bool, inc_deploy: bool,
148
+ quant_types_text: str,
149
+ ) -> str | None:
150
+ """Generate and save a Colab notebook, return file path."""
151
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
152
+ if not models:
153
+ return None
154
+
155
+ weights = []
156
+ if weights_text.strip():
157
+ try:
158
+ weights = [float(w.strip()) for w in weights_text.split(",")]
159
+ except ValueError:
160
+ pass
161
+
162
+ densities = []
163
+ if densities_text.strip():
164
+ try:
165
+ densities = [float(d.strip()) for d in densities_text.split(",")]
166
+ except ValueError:
167
+ pass
168
+
169
+ quant_types = [q.strip() for q in quant_types_text.split(",") if q.strip()]
170
+ if not quant_types:
171
+ quant_types = ["Q5_K_M", "Q4_K_M"]
172
+
173
+ config = MergeConfig(
174
+ method=method,
175
+ models=models,
176
+ base_model=base_model.strip(),
177
+ weights=weights,
178
+ densities=densities,
179
+ tokenizer_source=tokenizer_src.strip(),
180
+ dtype=dtype,
181
+ slerp_t=slerp_t,
182
+ int8_mask=int8_mask,
183
+ normalize=normalize,
184
+ )
185
+
186
+ name = output_name.strip() or "ForgeKit-Merged-Model"
187
+ user = hf_user.strip()
188
+
189
+ nb = generate_merge_notebook(
190
+ config,
191
+ output_model_name=name,
192
+ hf_username=user,
193
+ include_quantize=inc_quantize,
194
+ include_deploy=inc_deploy,
195
+ quant_types=quant_types,
196
+ )
197
+
198
+ path = os.path.join(tempfile.gettempdir(), f"{name}_merge.ipynb")
199
+ save_notebook(nb, path)
200
+ return path
201
+
202
+
203
+ def search_hf_models(query: str, arch_filter: str, sort_by: str) -> str:
204
+ """Search HF Hub and return formatted results."""
205
+ if not query.strip():
206
+ return "Enter a search query"
207
+
208
+ results = search_models(
209
+ query=query.strip(),
210
+ architecture=arch_filter if arch_filter != "Any" else "",
211
+ limit=15,
212
+ sort=sort_by.lower(),
213
+ )
214
+
215
+ if not results:
216
+ return "No models found"
217
+
218
+ lines = ["| Model | Architecture | Downloads |", "|-------|-------------|-----------|"]
219
+ for r in results:
220
+ mid = r.get("model_id", "")
221
+ mtype = r.get("model_type", "—")
222
+ dl = r.get("downloads", 0)
223
+ dl_str = f"{dl:,}" if dl else "—"
224
+ lines.append(f"| `{mid}` | {mtype} | {dl_str} |")
225
+
226
+ return "\n".join(lines)
227
+
228
+
229
+ def fetch_model_details(model_id: str) -> str:
230
+ """Fetch and display detailed model info."""
231
+ if not model_id.strip():
232
+ return "Enter a model ID"
233
+
234
+ info = fetch_model_info(model_id.strip())
235
+ if info.error:
236
+ return f"❌ {info.error}"
237
+
238
+ return f"""### {info.model_id}
239
+
240
+ | Property | Value |
241
+ |----------|-------|
242
+ | **Architecture** | `{info.model_type}` |
243
+ | **Hidden Size** | {info.hidden_size} |
244
+ | **Layers** | {info.num_hidden_layers} |
245
+ | **Vocab Size** | {info.vocab_size:,} |
246
+ | **Intermediate** | {info.intermediate_size} |
247
+ | **Attention Heads** | {info.num_attention_heads} |
248
+ | **KV Heads** | {info.num_key_value_heads} |
249
+ | **Max Position** | {info.max_position_embeddings:,} |
250
+ | **dtype** | {info.torch_dtype} |
251
+ | **Downloads** | {info.downloads:,} |
252
+ | **Likes** | {info.likes} |
253
+ | **Params (est.)** | {info.param_estimate} |
254
+ | **RAM for merge** | {info.ram_estimate_gb} GB |
255
+ | **Gated** | {'Yes' if info.gated else 'No'} |
256
+ | **trust_remote_code** | {'Required' if info.trust_remote_code else 'No'} |"""
257
+
258
+
259
+ def suggest_base(models_text: str, token: str) -> tuple[str, str]:
260
+ """Auto-suggest base model and tokenizer from compatibility check."""
261
+ models = [m.strip() for m in models_text.strip().split("\n") if m.strip()]
262
+ if len(models) < 2:
263
+ return "", ""
264
+ tok = token.strip() if token else None
265
+ report = check_compatibility(models, token=tok)
266
+ return report.suggested_base, report.suggested_tokenizer
267
+
268
+
269
+ # ===== LEADERBOARD DATA =====
270
+ # Seeded with your existing merges
271
+ LEADERBOARD = [
272
+ {
273
+ "name": "Qwen2.5CMR-7B", "author": "AIencoder",
274
+ "method": "DARE-TIES", "base": "Qwen2.5-7B-Instruct",
275
+ "models": "Coder-7B + Math-7B", "likes": 0,
276
+ "link": "https://huggingface.co/AIencoder/Qwen2.5CMR",
277
+ },
278
+ {
279
+ "name": "Logic-Coder-7B", "author": "AIencoder",
280
+ "method": "DARE-TIES", "base": "Mistral-7B",
281
+ "models": "OpenHermes + CodeInstruct", "likes": 1,
282
+ "link": "https://huggingface.co/AIencoder/Logic-Coder-7B",
283
+ },
284
+ {
285
+ "name": "HermesMath-7B-TIES", "author": "AIencoder",
286
+ "method": "TIES", "base": "Mistral-7B",
287
+ "models": "Hermes + MetaMath", "likes": 1,
288
+ "link": "https://huggingface.co/AIencoder/HermesMath-7B-TIES",
289
+ },
290
+ {
291
+ "name": "Hermes-2-Pro-GodCoder", "author": "AIencoder",
292
+ "method": "DARE-TIES", "base": "Mistral-7B",
293
+ "models": "Hermes-2-Pro + CodeModels", "likes": 1,
294
+ "link": "https://huggingface.co/AIencoder/Hermes-2-Pro-Mistral-7B-GodCoder",
295
+ },
296
+ ]
297
+
298
+
299
+ def get_leaderboard() -> str:
300
+ """Return leaderboard as markdown table."""
301
+ lines = [
302
+ "| # | Model | Author | Method | Source Models | Likes |",
303
+ "|---|-------|--------|--------|---------------|-------|",
304
+ ]
305
+ sorted_lb = sorted(LEADERBOARD, key=lambda x: -x["likes"])
306
+ for i, entry in enumerate(sorted_lb, 1):
307
+ name = f"[{entry['name']}]({entry['link']})"
308
+ lines.append(
309
+ f"| {i} | {name} | {entry['author']} | {entry['method']} | "
310
+ f"{entry['models']} | {entry['likes']} |"
311
+ )
312
+ return "\n".join(lines)
313
+
314
+
315
+ # ============================================================
316
+ # GRADIO APP
317
+ # ============================================================
318
+
319
+ with gr.Blocks(title="ForgeKit — Model Merging Platform", theme=theme, css=CSS) as demo:
320
+
321
+ # ===== HEADER =====
322
+ gr.HTML("""
323
+ <div class="forgekit-header">
324
+ <h1>🔥 ForgeKit</h1>
325
+ <p>Forge your perfect AI model — no code required</p>
326
+ </div>
327
+ """)
328
+
329
+ with gr.Tabs():
330
+
331
+ # =====================================================
332
+ # TAB 1: MERGE BUILDER
333
+ # =====================================================
334
+ with gr.Tab("⚒️ Merge Builder", id="builder"):
335
+ gr.Markdown("### Build your merge configuration and generate a ready-to-run Colab notebook")
336
+
337
+ with gr.Row():
338
+ # LEFT COLUMN: Inputs
339
+ with gr.Column(scale=3):
340
+ models_input = gr.Textbox(
341
+ label="Models to Merge (one per line)",
342
+ placeholder="Qwen/Qwen2.5-Coder-7B-Instruct\nQwen/Qwen2.5-Math-7B-Instruct",
343
+ lines=5,
344
+ )
345
+ hf_token = gr.Textbox(
346
+ label="HF Token (optional β€” for gated models)",
347
+ type="password",
348
+ placeholder="hf_...",
349
+ )
350
+
351
+ with gr.Row():
352
+ check_btn = gr.Button("🔍 Check Compatibility", variant="secondary")
353
+ suggest_btn = gr.Button("💡 Auto-Suggest Base", variant="secondary")
354
+
355
+ compat_status = gr.Textbox(label="Quick Status", interactive=False, max_lines=2)
356
+ compat_report = gr.Markdown(label="Compatibility Report")
357
+
358
+ # RIGHT COLUMN: Configuration
359
+ with gr.Column(scale=3):
360
+ method_dd = gr.Dropdown(
361
+ choices=list(MERGE_METHODS.keys()),
362
+ value="dare_ties",
363
+ label="Merge Method",
364
+ )
365
+ method_info_md = gr.Markdown(
366
+ value=f"**DARE-TIES** — {MERGE_METHODS['dare_ties']['description']}"
367
+ )
368
+ base_model = gr.Textbox(
369
+ label="Base Model",
370
+ placeholder="Qwen/Qwen2.5-7B-Instruct",
371
+ )
372
+ tokenizer_src = gr.Textbox(
373
+ label="Tokenizer Source",
374
+ placeholder="Same as base model (leave blank to auto-fill)",
375
+ )
376
+
377
+ with gr.Row():
378
+ weights_input = gr.Textbox(label="Weights (comma-separated)", placeholder="0.5, 0.5")
379
+ densities_input = gr.Textbox(label="Densities (comma-separated)", placeholder="0.7, 0.6")
380
+
381
+ with gr.Row():
382
+ preset_dd = gr.Dropdown(
383
+ choices=list(PRESETS.keys()),
384
+ label="Apply Preset",
385
+ scale=2,
386
+ )
387
+ preset_btn = gr.Button("Apply", variant="secondary", scale=1)
388
+
389
+ with gr.Row():
390
+ dtype_dd = gr.Dropdown(choices=["bfloat16", "float16", "float32"], value="bfloat16", label="dtype")
391
+ slerp_t = gr.Slider(0, 1, value=0.5, step=0.05, label="SLERP t", visible=False)
392
+
393
+ with gr.Row():
394
+ int8_mask = gr.Checkbox(label="int8_mask", value=True)
395
+ normalize_cb = gr.Checkbox(label="normalize", value=True)
396
+
397
+ gr.Markdown("---")
398
+ gr.Markdown("### Output")
399
+
400
+ with gr.Row():
401
+ with gr.Column(scale=3):
402
+ yaml_output = gr.Code(label="Generated YAML Config", language="yaml", lines=15)
403
+ gen_yaml_btn = gr.Button("📋 Generate YAML", variant="primary", size="lg")
404
+
405
+ with gr.Column(scale=3):
406
+ gr.Markdown("#### Notebook Settings")
407
+ output_name = gr.Textbox(label="Model Name", placeholder="My-Merged-7B")
408
+ hf_username = gr.Textbox(label="HF Username", placeholder="AIencoder")
409
+ with gr.Row():
410
+ inc_quant = gr.Checkbox(label="Include GGUF Quantization", value=True)
411
+ inc_deploy = gr.Checkbox(label="Include HF Deployment", value=True)
412
+ quant_types = gr.Textbox(label="Quant Types", value="Q5_K_M, Q4_K_M")
413
+ gen_nb_btn = gr.Button("🚀 Generate Colab Notebook", variant="primary", size="lg")
414
+ nb_file = gr.File(label="Download Notebook")
415
+
416
+ # === EVENTS ===
417
+ check_btn.click(
418
+ check_models, [models_input, hf_token], [compat_report, compat_status]
419
+ )
420
+ suggest_btn.click(
421
+ suggest_base, [models_input, hf_token], [base_model, tokenizer_src]
422
+ )
423
+ preset_btn.click(
424
+ apply_preset, [preset_dd, models_input], [weights_input, densities_input]
425
+ )
426
+ gen_yaml_btn.click(
427
+ generate_config,
428
+ [models_input, method_dd, base_model, weights_input, densities_input,
429
+ tokenizer_src, dtype_dd, slerp_t, int8_mask, normalize_cb],
430
+ yaml_output,
431
+ )
432
+ gen_nb_btn.click(
433
+ generate_notebook_file,
434
+ [models_input, method_dd, base_model, weights_input, densities_input,
435
+ tokenizer_src, dtype_dd, slerp_t, int8_mask, normalize_cb,
436
+ output_name, hf_username, inc_quant, inc_deploy, quant_types],
437
+ nb_file,
438
+ )
439
+
440
+ # Method change: show/hide SLERP slider + update description
441
+ def on_method_change(m):
442
+ info = MERGE_METHODS.get(m, {})
443
+ desc = f"**{info.get('name', m)}** — {info.get('description', '')}"
444
+ show_slerp = m == "slerp"
445
+ return desc, gr.update(visible=show_slerp)
446
+
447
+ method_dd.change(on_method_change, method_dd, [method_info_md, slerp_t])
448
+
449
+ # =====================================================
450
+ # TAB 2: MODEL EXPLORER
451
+ # =====================================================
452
+ with gr.Tab("🔍 Model Explorer", id="explorer"):
453
+ gr.Markdown("### Search and discover models on HuggingFace Hub")
454
+
455
+ with gr.Row():
456
+ search_query = gr.Textbox(label="Search", placeholder="qwen coder instruct", scale=3)
457
+ arch_filter = gr.Dropdown(
458
+ choices=["Any", "llama", "qwen2", "mistral", "gemma2", "phi3", "starcoder2"],
459
+ value="Any", label="Architecture", scale=1,
460
+ )
461
+ sort_dd = gr.Dropdown(choices=["Downloads", "Likes", "Modified"], value="Downloads", label="Sort", scale=1)
462
+ search_btn = gr.Button("🔍 Search", variant="primary", scale=1)
463
+
464
+ search_results = gr.Markdown(label="Results")
465
+
466
+ gr.Markdown("---")
467
+ gr.Markdown("### Model Details")
468
+ with gr.Row():
469
+ detail_input = gr.Textbox(label="Model ID", placeholder="Qwen/Qwen2.5-Coder-7B-Instruct", scale=3)
470
+ detail_btn = gr.Button("📋 Fetch Details", variant="secondary", scale=1)
471
+ detail_output = gr.Markdown()
472
+
473
+ search_btn.click(search_hf_models, [search_query, arch_filter, sort_dd], search_results)
474
+ detail_btn.click(fetch_model_details, detail_input, detail_output)
475
+
476
+ # =====================================================
477
+ # TAB 3: GGUF QUANTIZER
478
+ # =====================================================
479
+ with gr.Tab("📦 GGUF Quantizer", id="quantizer"):
480
+ gr.Markdown("""### Generate a quantization notebook for any HF model
481
+ Convert any HuggingFace model to GGUF format for use with llama.cpp, Ollama, LM Studio, etc.""")
482
+
483
+ q_model = gr.Textbox(label="Model ID", placeholder="AIencoder/Qwen2.5CMR-7B")
484
+ q_username = gr.Textbox(label="Your HF Username", placeholder="AIencoder")
485
+
486
+ gr.Markdown("#### Quantization Levels")
487
+ gr.Markdown("""
488
+ | Type | Size (7B) | Quality | Best For |
489
+ |------|----------|---------|----------|
490
+ | Q8_0 | ~7.5 GB | Best | Maximum quality |
491
+ | Q6_K | ~5.5 GB | Great | Good balance |
492
+ | **Q5_K_M** | **~5 GB** | **Good** | **Recommended** |
493
+ | Q4_K_M | ~4 GB | Decent | Memory-constrained |
494
+ | IQ4_XS | ~3.5 GB | Fair | Extreme compression |
495
+ """)
496
+ q_types = gr.Textbox(label="Quant Types (comma-separated)", value="Q8_0, Q5_K_M, Q4_K_M")
497
+
498
+ q_btn = gr.Button("📦 Generate Quantization Notebook", variant="primary", size="lg")
499
+ q_file = gr.File(label="Download Notebook")
500
+
501
+ def gen_quant_notebook(model_id, username, qtypes_text):
502
+ if not model_id.strip():
503
+ return None
504
+ qtypes = [q.strip() for q in qtypes_text.split(",") if q.strip()]
505
+ name = model_id.strip().split("/")[-1]
506
+ config = MergeConfig(method="linear", models=[model_id.strip()])
507
+ nb = generate_merge_notebook(
508
+ config,
509
+ output_model_name=name,
510
+ hf_username=username.strip(),
511
+ include_quantize=True,
512
+ include_deploy=False,
513
+ quant_types=qtypes,
514
+ )
515
+ # TODO: strip the merge cells so only setup + quantization remain
516
+ path = os.path.join(tempfile.gettempdir(), f"{name}_quantize.ipynb")
517
+ save_notebook(nb, path)
518
+ return path
519
+
520
+ q_btn.click(gen_quant_notebook, [q_model, q_username, q_types], q_file)
521
+
522
+ # =====================================================
523
+ # TAB 4: DEPLOY
524
+ # =====================================================
525
+ with gr.Tab("🚀 Deploy", id="deploy"):
526
+ gr.Markdown("""### Deploy your merged model to a HuggingFace Space
527
+
528
+ After merging and (optionally) quantizing, deploy a chat interface for your model.""")
529
+
530
+ d_model = gr.Textbox(label="Model Repo ID", placeholder="AIencoder/Qwen2.5CMR-7B")
531
+ d_type = gr.Dropdown(
532
+ choices=["Gradio Chat (transformers)", "Docker + llama.cpp (GGUF)"],
533
+ value="Gradio Chat (transformers)", label="Deployment Type",
534
+ )
535
+ d_btn = gr.Button("📋 Generate Deployment Files", variant="primary")
536
+ d_output = gr.Code(label="app.py", language="python", lines=20)
537
+ d_readme = gr.Code(label="README.md (Space metadata)", language="markdown", lines=8)
538
+
539
+ def gen_deploy(model_id, deploy_type):
540
+ mid = model_id.strip()
541
+ if not mid:
542
+ return "# Enter a model ID first", ""
543
+
544
+ if "Gradio" in deploy_type:
545
+ app = f'''import gradio as gr
546
+ from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer
547
+ import torch
548
+ from threading import Thread
549
+
550
+ MODEL_ID = "{mid}"
551
+
552
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
553
+ model = AutoModelForCausalLM.from_pretrained(
554
+ MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
555
+ )
556
+
557
+ def chat(message, history):
558
+ messages = []
559
+ for h in history:
560
+ messages.append({{"role": "user", "content": h[0]}})
561
+ if h[1]:
562
+ messages.append({{"role": "assistant", "content": h[1]}})
563
+ messages.append({{"role": "user", "content": message}})
564
+
565
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
566
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
567
+ streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
568
+
569
+ thread = Thread(target=model.generate, kwargs={{
570
+ **inputs, "max_new_tokens": 512, "streamer": streamer,
571
+ "do_sample": True, "temperature": 0.7,
572
+ }})
573
+ thread.start()
574
+
575
+ response = ""
576
+ for token in streamer:
577
+ response += token
578
+ yield response
579
+
580
+ demo = gr.ChatInterface(chat, title="{mid.split('/')[-1]}", description="Merged with ForgeKit")
581
+ demo.launch()'''
582
+ readme = f"""---
583
+ title: {mid.split('/')[-1]} Chat
584
+ emoji: 🔥
585
+ colorFrom: amber
586
+ colorTo: orange
587
+ sdk: gradio
588
+ sdk_version: 5.12.0
589
+ app_file: app.py
590
+ pinned: false
591
+ license: apache-2.0
592
+ ---"""
593
+ else:
594
+ app = f'''# Docker deployment with llama.cpp
595
+ # Dockerfile for serving GGUF models
596
+
597
+ FROM ghcr.io/ggerganov/llama.cpp:server
598
+
599
+ # Download the GGUF model (Docker ADD does not expand globs; replace the wildcard with the exact GGUF filename)
600
+ ADD https://huggingface.co/{mid}/resolve/main/*Q5_K_M*.gguf /models/model.gguf
601
+
602
+ EXPOSE 7860
603
+
604
+ CMD ["/llama-server", \\
605
+ "--model", "/models/model.gguf", \\
606
+ "--host", "0.0.0.0", \\
607
+ "--port", "7860", \\
608
+ "--ctx-size", "4096", \\
609
+ "--n-gpu-layers", "99"]'''
610
+ readme = f"""---
611
+ title: {mid.split('/')[-1]}
612
+ emoji: 🔥
613
+ colorFrom: amber
614
+ colorTo: orange
615
+ sdk: docker
616
+ pinned: false
617
+ license: apache-2.0
618
+ ---"""
619
+
620
+ return app, readme
621
+
622
+ d_btn.click(gen_deploy, [d_model, d_type], [d_output, d_readme])
623
+
624
+ # =====================================================
625
+ # TAB 5: LEADERBOARD
626
+ # =====================================================
627
+ with gr.Tab("🏆 Leaderboard", id="leaderboard"):
628
+ gr.Markdown("""### Community Merge Leaderboard
629
+ See what others have built with ForgeKit. Submit your own merge to get featured!""")
630
+
631
+ lb_md = gr.Markdown(value=get_leaderboard())
632
+ lb_refresh = gr.Button("🔄 Refresh", variant="secondary")
633
+ lb_refresh.click(lambda: get_leaderboard(), outputs=lb_md)
634
+
635
+ gr.Markdown("---")
636
+ gr.Markdown("### Submit Your Merge")
637
+ with gr.Row():
638
+ sub_name = gr.Textbox(label="Model Name", placeholder="My-Awesome-Merge-7B")
639
+ sub_author = gr.Textbox(label="Author", placeholder="Your HF username")
640
+ sub_method = gr.Textbox(label="Merge Method", placeholder="DARE-TIES")
641
+ with gr.Row():
642
+ sub_models = gr.Textbox(label="Source Models (short)", placeholder="Coder-7B + Math-7B")
643
+ sub_link = gr.Textbox(label="HF Model Link", placeholder="https://huggingface.co/...")
644
+ sub_btn = gr.Button("📤 Submit", variant="primary")
645
+ sub_status = gr.Markdown()
646
+
647
+ def submit_merge(name, author, method, models, link):
648
+ if not all([name, author, method, models, link]):
649
+ return "⚠️ Please fill in all fields"
650
+ LEADERBOARD.append({
651
+ "name": name, "author": author, "method": method,
652
+ "base": "", "models": models, "likes": 0, "link": link,
653
+ })
654
+ return f"✅ **{name}** submitted! It will appear on the leaderboard."
655
+
656
+ sub_btn.click(submit_merge, [sub_name, sub_author, sub_method, sub_models, sub_link], sub_status)
657
+
658
+ # ===== FOOTER =====
659
+ gr.Markdown("""
660
+ ---
661
+ <center>
662
+
663
+ **ForgeKit** v0.1.0 — Built by [AIencoder](https://huggingface.co/AIencoder) | [Portfolio](https://aiencoder-portfolio.static.hf.space) | [GitHub](https://github.com/Ary5272)
664
+
665
+ </center>
666
+ """)
667
+
668
+
669
+ if __name__ == "__main__":
670
+ demo.launch()  # theme and css are passed to gr.Blocks(), not launch()
compatibility.py ADDED
@@ -0,0 +1,321 @@
+"""Architecture compatibility checker for model merging."""
+
+from dataclasses import dataclass, field
+from typing import Optional
+
+from model_info import ModelInfo, fetch_model_info
+
+
+@dataclass
+class CompatibilityReport:
+    """Result of compatibility checking between models."""
+    compatible: bool = True
+    errors: list[str] = field(default_factory=list)
+    warnings: list[str] = field(default_factory=list)
+    suggestions: list[str] = field(default_factory=list)
+    models_info: list[ModelInfo] = field(default_factory=list)
+    suggested_base: str = ""
+    suggested_tokenizer: str = ""
+    architecture: str = ""
+    merge_methods_available: list[str] = field(default_factory=list)
+    estimated_ram_gb: float = 0.0
+    estimated_merge_time: str = ""
+
+    @property
+    def status_emoji(self) -> str:
+        if not self.compatible:
+            return "❌"
+        elif self.warnings:
+            return "⚠️"
+        return "✅"
+
+    @property
+    def status_text(self) -> str:
+        if not self.compatible:
+            return "Incompatible — cannot merge"
+        elif self.warnings:
+            return "Compatible with warnings"
+        return "Fully compatible"
+
+    def to_markdown(self) -> str:
+        """Generate a formatted markdown report."""
+        lines = []
+
+        # Header
+        lines.append(f"## {self.status_emoji} Compatibility Report")
+        lines.append("")
+
+        if self.architecture:
+            lines.append(f"**Architecture:** `{self.architecture}`")
+            lines.append("")
+
+        # Errors
+        if self.errors:
+            lines.append("### ❌ Errors")
+            for e in self.errors:
+                lines.append(f"- {e}")
+            lines.append("")
+
+        # Warnings
+        if self.warnings:
+            lines.append("### ⚠️ Warnings")
+            for w in self.warnings:
+                lines.append(f"- {w}")
+            lines.append("")
+
+        # Model details table
+        if self.models_info:
+            lines.append("### Model Details")
+            lines.append("| Model | Type | Hidden | Layers | Vocab | Params |")
+            lines.append("|-------|------|--------|--------|-------|--------|")
+            for m in self.models_info:
+                name = m.display_name
+                if len(name) > 35:
+                    name = name[:32] + "..."
+                lines.append(
+                    f"| {name} | `{m.model_type}` | {m.hidden_size} | "
+                    f"{m.num_hidden_layers} | {m.vocab_size} | {m.param_estimate} |"
+                )
+            lines.append("")
+
+        # Suggestions
+        if self.suggestions:
+            lines.append("### 💡 Suggestions")
+            for s in self.suggestions:
+                lines.append(f"- {s}")
+            lines.append("")
+
+        # Merge methods
+        if self.merge_methods_available:
+            methods = ", ".join(f"`{m}`" for m in self.merge_methods_available)
+            lines.append(f"**Available merge methods:** {methods}")
+            lines.append("")
+
+        # Resource estimates
+        if self.estimated_ram_gb > 0:
+            lines.append(f"**Estimated RAM:** {self.estimated_ram_gb} GB")
+            lines.append(f"**Estimated time:** {self.estimated_merge_time}")
+            colab_tier = "Standard" if self.estimated_ram_gb <= 12 else "High-RAM" if self.estimated_ram_gb <= 48 else "A100 (Colab Pro+)"
+            lines.append(f"**Recommended Colab:** {colab_tier}")
+            lines.append("")
+
+        if self.suggested_base:
+            lines.append(f"**Suggested base model:** `{self.suggested_base}`")
+        if self.suggested_tokenizer:
+            lines.append(f"**Suggested tokenizer:** `{self.suggested_tokenizer}`")
+
+        return "\n".join(lines)
+
+
+def check_compatibility(
+    model_ids: list[str],
+    token: Optional[str] = None,
+) -> CompatibilityReport:
+    """Check if a list of models are compatible for merging.
+
+    Args:
+        model_ids: List of HuggingFace model IDs
+        token: Optional HF API token for gated models
+
+    Returns:
+        CompatibilityReport with detailed analysis
+    """
+    report = CompatibilityReport()
+
+    # Validate input
+    if len(model_ids) < 2:
+        report.compatible = False
+        report.errors.append("At least 2 models are required for merging.")
+        return report
+
+    if len(model_ids) > 10:
+        report.warnings.append("Merging more than 10 models is unusual and may produce poor results.")
+
+    # Fetch all model info
+    for mid in model_ids:
+        mid = mid.strip()
+        if not mid:
+            continue
+        info = fetch_model_info(mid, token=token)
+        report.models_info.append(info)
+
+        if info.error:
+            if info.gated:
+                report.warnings.append(f"`{mid}`: Gated model — provide HF token to verify compatibility")
+            else:
+                report.compatible = False
+                report.errors.append(f"`{mid}`: {info.error}")
+
+    # If we couldn't fetch enough models, bail
+    valid_models = [m for m in report.models_info if not m.error]
+    if len(valid_models) < 2:
+        report.compatible = False
+        if not report.errors:
+            report.errors.append("Could not fetch enough model configs to verify compatibility.")
+        return report
+
+    # === ARCHITECTURE CHECKS ===
+
+    # 1. model_type must match
+    types = set(m.model_type for m in valid_models)
+    if len(types) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Architecture mismatch! Found: {', '.join(f'`{t}`' for t in types)}. "
+            f"All models must share the same architecture to merge."
+        )
+        return report
+
+    report.architecture = valid_models[0].model_type
+
+    # 2. hidden_size must match
+    hidden_sizes = set(m.hidden_size for m in valid_models if m.hidden_size > 0)
+    if len(hidden_sizes) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Hidden size mismatch: {', '.join(str(s) for s in hidden_sizes)}. "
+            f"Models must have the same hidden dimension."
+        )
+
+    # 3. intermediate_size must match (for most methods)
+    inter_sizes = set(m.intermediate_size for m in valid_models if m.intermediate_size > 0)
+    if len(inter_sizes) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Intermediate size mismatch: {', '.join(str(s) for s in inter_sizes)}. "
+            f"Required for DARE-TIES, SLERP, and Linear methods."
+        )
+
+    # 4. num_hidden_layers — warn if different
+    layer_counts = set(m.num_hidden_layers for m in valid_models if m.num_hidden_layers > 0)
+    if len(layer_counts) > 1:
+        report.warnings.append(
+            f"Layer count differs: {', '.join(str(c) for c in layer_counts)}. "
+            f"Passthrough/Frankenmerge can handle this, but DARE-TIES/SLERP/Linear require matching layers."
+        )
+
+    # 5. vocab_size — warn if different
+    vocab_sizes = set(m.vocab_size for m in valid_models if m.vocab_size > 0)
+    if len(vocab_sizes) > 1:
+        report.warnings.append(
+            f"Vocabulary size differs: {', '.join(str(v) for v in vocab_sizes)}. "
+            f"Use `tokenizer_source` to specify which tokenizer to keep."
+        )
+
+    # 6. num_attention_heads / num_key_value_heads
+    head_counts = set(m.num_attention_heads for m in valid_models if m.num_attention_heads > 0)
+    kv_head_counts = set(m.num_key_value_heads for m in valid_models if m.num_key_value_heads > 0)
+    if len(head_counts) > 1:
+        report.compatible = False
+        report.errors.append(
+            f"Attention head count mismatch: {', '.join(str(h) for h in head_counts)}."
+        )
+    if len(kv_head_counts) > 1:
+        report.warnings.append(
+            f"KV head count differs: {', '.join(str(h) for h in kv_head_counts)}. "
+            f"This may cause issues with GQA models."
+        )
+
+    # 7. trust_remote_code warning
+    needs_trust = [m.model_id for m in valid_models if m.trust_remote_code]
+    if needs_trust:
+        report.warnings.append(
+            f"Models requiring `trust_remote_code=True`: "
+            f"{', '.join(f'`{m}`' for m in needs_trust)}"
+        )
+
+    # === SUGGESTIONS ===
+
+    # Suggest base model (most downloaded or original base if detectable)
+    if valid_models:
+        # Prefer instruct versions (False sorts first), then most downloaded
+        base_candidates = sorted(
+            valid_models,
+            key=lambda m: (
+                not ("instruct" in m.model_id.lower() and "code" not in m.model_id.lower()),
+                -m.downloads,
+            ),
+        )
+        report.suggested_base = base_candidates[0].model_id
+        report.suggestions.append(f"Use `{report.suggested_base}` as the base model")
+
+    # Suggest tokenizer source (largest vocab)
+    if vocab_sizes and len(vocab_sizes) > 1:
+        largest_vocab_model = max(valid_models, key=lambda m: m.vocab_size)
+        report.suggested_tokenizer = largest_vocab_model.model_id
+        report.suggestions.append(
+            f"Use tokenizer from `{report.suggested_tokenizer}` (largest vocab: {largest_vocab_model.vocab_size})"
+        )
+    elif valid_models:
+        report.suggested_tokenizer = report.suggested_base
+
+    # === AVAILABLE MERGE METHODS ===
+    n = len(valid_models)
+    methods = []
+
+    if report.compatible:
+        # Linear always works if architectures match
+        methods.append("linear")
+
+        # DARE-TIES and TIES need matching layer counts
+        if len(layer_counts) <= 1:
+            methods.append("dare_ties")
+            methods.append("ties")
+
+        # SLERP only for 2 models
+        if n == 2 and len(layer_counts) <= 1:
+            methods.append("slerp")
+
+        # Task arithmetic needs a base
+        methods.append("task_arithmetic")
+
+        # Passthrough works even with different layer counts
+        methods.append("passthrough")
+
+    report.merge_methods_available = methods
+
+    # === RESOURCE ESTIMATES ===
+    max_size = max((m.size_bytes for m in valid_models if m.size_bytes > 0), default=0)
+    if max_size > 0:
+        # Merging needs roughly: all models loaded + output
+        total_model_bytes = sum(m.size_bytes for m in valid_models if m.size_bytes > 0)
+        # Rule of thumb: models + output, plus ~30% overhead
+        ram_needed = (total_model_bytes + max_size) * 1.3
+        report.estimated_ram_gb = round(ram_needed / (1024**3), 1)
+
+        # Time estimate based on total size
+        total_gb = total_model_bytes / (1024**3)
+        if total_gb < 10:
+            report.estimated_merge_time = "5-15 minutes"
+        elif total_gb < 30:
+            report.estimated_merge_time = "15-30 minutes"
+        elif total_gb < 60:
+            report.estimated_merge_time = "30-60 minutes"
+        else:
+            report.estimated_merge_time = "1-2+ hours"
+
+    return report
+
+
+def quick_check(model_ids: list[str], token: Optional[str] = None) -> str:
+    """Quick one-line compatibility check.
+
+    Returns a formatted string like:
+    "✅ Fully compatible | Architecture: qwen2 | 3 models | ~32GB RAM | Methods: linear, dare_ties, ties"
+    """
+    report = check_compatibility(model_ids, token=token)
+
+    if not report.compatible:
+        errors = "; ".join(report.errors[:2])
+        return f"❌ {errors}"
+
+    methods = ", ".join(report.merge_methods_available[:3])
+    parts = [
+        f"{report.status_emoji} {report.status_text}",
+        f"Architecture: {report.architecture}",
+        f"{len(report.models_info)} models",
+    ]
+    if report.estimated_ram_gb > 0:
+        parts.append(f"~{report.estimated_ram_gb}GB RAM")
+    parts.append(f"Methods: {methods}")
+
+    return " | ".join(parts)
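The core of `check_compatibility` is grouping models by an architecture signature. As a minimal standalone sketch (the config dicts below are made-up examples, not fetched from the Hub):

```python
# Hypothetical sketch of the signature check: models are mergeable only when
# every config shares one (model_type, hidden_size, intermediate_size) tuple.

def signatures_match(configs: list[dict]) -> bool:
    """Return True when all configs share a single architecture signature."""
    sigs = {
        (c.get("model_type"), c.get("hidden_size"), c.get("intermediate_size"))
        for c in configs
    }
    return len(sigs) == 1

a = {"model_type": "qwen2", "hidden_size": 3584, "intermediate_size": 18944}
b = {"model_type": "qwen2", "hidden_size": 3584, "intermediate_size": 18944}
c = {"model_type": "llama", "hidden_size": 4096, "intermediate_size": 14336}

print(signatures_match([a, b]))  # True
print(signatures_match([a, c]))  # False
```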
config_generator.py ADDED
@@ -0,0 +1,328 @@
+"""Merge configuration YAML generator with presets and validation."""
+
+from dataclasses import dataclass, field
+from typing import Optional
+
+import yaml
+
+
+# ===== MERGE METHOD DEFINITIONS =====
+
+MERGE_METHODS = {
+    "dare_ties": {
+        "name": "DARE-TIES",
+        "description": "Drop And REscale with TIES — trims low-magnitude parameters and resolves sign conflicts. Best for combining 2+ specialist models.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight", "density"],
+        "global_params": ["int8_mask", "normalize"],
+        "supports_slices": True,
+    },
+    "ties": {
+        "name": "TIES",
+        "description": "Trim, Elect Sign, Merge — resolves parameter interference between models. Similar to DARE-TIES but without the drop step.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight", "density"],
+        "global_params": ["int8_mask", "normalize"],
+        "supports_slices": True,
+    },
+    "slerp": {
+        "name": "SLERP",
+        "description": "Spherical Linear Interpolation — smoothly blends two models along a curved path in weight space. Best for two-model merges.",
+        "min_models": 2,
+        "max_models": 2,
+        "needs_base": False,
+        "params": [],
+        "global_params": ["t"],
+        "supports_slices": True,
+    },
+    "linear": {
+        "name": "Linear",
+        "description": "Simple weighted average of model parameters. Fast and predictable baseline.",
+        "min_models": 2,
+        "max_models": 10,
+        "needs_base": False,
+        "params": ["weight"],
+        "global_params": ["normalize"],
+        "supports_slices": True,
+    },
+    "task_arithmetic": {
+        "name": "Task Arithmetic",
+        "description": "Add or subtract task vectors from a base model. Use negative weights to remove capabilities.",
+        "min_models": 1,
+        "max_models": 10,
+        "needs_base": True,
+        "params": ["weight"],
+        "global_params": [],
+        "supports_slices": False,
+    },
+    "passthrough": {
+        "name": "Passthrough (Frankenmerge)",
+        "description": "Stack layers from different models. Can create larger models from smaller ones. Supports different layer counts.",
+        "min_models": 1,
+        "max_models": 10,
+        "needs_base": False,
+        "params": [],
+        "global_params": [],
+        "supports_slices": True,
+        "requires_slices": True,
+    },
+}
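The SLERP entry above describes spherical interpolation between two weight sets. A minimal illustration of the formula on plain Python lists (mergekit's real implementation works on tensors and also supports per-layer `t` schedules; this sketch does not):

```python
import math

def slerp(v0: list[float], v1: list[float], t: float) -> list[float]:
    """Spherically interpolate between two vectors at parameter t in [0, 1]."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    omega = math.acos(max(-1.0, min(1.0, dot / (n0 * n1))))  # angle between
    if omega < 1e-6:
        # vectors nearly parallel: fall back to a linear blend
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))  # midpoint on the arc, ≈ [0.7071, 0.7071]
```

Unlike a linear average, the result stays on the arc between the two vectors, which is why the method is preferred for two-model blends.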
+
+
+# ===== PRESETS =====
+
+@dataclass
+class MergePreset:
+    name: str
+    description: str
+    method: str
+    weight_strategy: str  # "equal", "first_dominant", "last_dominant", "auto_detect"
+
+    def apply(self, model_ids: list[str]) -> tuple[list[float], list[float]]:
+        """Generate weights and densities for given models."""
+        n = len(model_ids)
+        if n == 0:
+            return [], []
+
+        if self.weight_strategy == "equal":
+            weights = [round(1.0 / n, 3)] * n
+            densities = [0.6] * n
+
+        elif self.weight_strategy == "first_dominant":
+            weights = ([0.6] + [round(0.4 / (n - 1), 3)] * (n - 1)) if n > 1 else [1.0]
+            densities = [0.7] + [0.5] * (n - 1)
+
+        elif self.weight_strategy == "last_dominant":
+            weights = ([round(0.4 / (n - 1), 3)] * (n - 1) + [0.6]) if n > 1 else [1.0]
+            densities = [0.5] * (n - 1) + [0.7]
+
+        elif self.weight_strategy == "auto_detect":
+            weights, densities = _auto_detect_weights(model_ids)
+
+        else:
+            weights = [round(1.0 / n, 3)] * n
+            densities = [0.6] * n
+
+        return weights, densities
+
+
+def _auto_detect_weights(model_ids: list[str]) -> tuple[list[float], list[float]]:
+    """Auto-detect optimal weights based on model names/tags."""
+    weights = []
+    densities = []
+
+    for mid in model_ids:
+        name = mid.lower()
+        if "code" in name or "coder" in name:
+            weights.append(0.5)
+            densities.append(0.7)
+        elif "math" in name:
+            weights.append(0.4)
+            densities.append(0.6)
+        elif "instruct" in name and "code" not in name:
+            weights.append(0.3)
+            densities.append(0.5)
+        else:
+            weights.append(0.3)
+            densities.append(0.5)
+
+    # Normalize weights to sum to 1
+    total = sum(weights)
+    if total > 0:
+        weights = [round(w / total, 3) for w in weights]
+
+    return weights, densities
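The normalization step above can be shown with a worked example. The model IDs and raw heuristic weights here are illustrative only:

```python
# Raw heuristic weights (coder 0.5, math 0.4, chat 0.3) rescaled to sum to 1,
# matching the normalization at the end of _auto_detect_weights.
raw = {"x/coder-7b": 0.5, "x/math-7b": 0.4, "x/chat-7b": 0.3}
total = sum(raw.values())  # 1.2
normalized = {mid: round(w / total, 3) for mid, w in raw.items()}
print(normalized)  # {'x/coder-7b': 0.417, 'x/math-7b': 0.333, 'x/chat-7b': 0.25}
```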
+
+
+PRESETS = {
+    "equal": MergePreset("Equal", "Equal weights for all models", "dare_ties", "equal"),
+    "first_dominant": MergePreset("First Model Dominant", "Prioritize the first model", "dare_ties", "first_dominant"),
+    "last_dominant": MergePreset("Last Model Dominant", "Prioritize the last model", "dare_ties", "last_dominant"),
+    "coding_focus": MergePreset("Coding Focus", "Higher weight for code-related models", "dare_ties", "auto_detect"),
+    "balanced_slerp": MergePreset("Balanced SLERP", "50/50 interpolation between two models", "slerp", "equal"),
+}
+
+
+# ===== CONFIG GENERATION =====
+
+@dataclass
+class MergeConfig:
+    """Complete merge configuration."""
+    method: str = "dare_ties"
+    models: list[str] = field(default_factory=list)
+    base_model: str = ""
+    weights: list[float] = field(default_factory=list)
+    densities: list[float] = field(default_factory=list)
+    tokenizer_source: str = ""
+    dtype: str = "bfloat16"
+
+    # Method-specific params
+    slerp_t: float = 0.5
+    int8_mask: bool = True
+    normalize: bool = True
+
+    # Passthrough/slice params
+    slices: list[dict] = field(default_factory=list)
+
+    # Output
+    output_name: str = ""
+
+    def validate(self) -> list[str]:
+        """Validate the configuration. Returns a list of error messages."""
+        errors = []
+        method_info = MERGE_METHODS.get(self.method)
+
+        if not method_info:
+            errors.append(f"Unknown merge method: {self.method}")
+            return errors
+
+        n = len(self.models)
+        if n < method_info["min_models"]:
+            errors.append(f"{method_info['name']} requires at least {method_info['min_models']} models")
+        if n > method_info["max_models"]:
+            errors.append(f"{method_info['name']} supports at most {method_info['max_models']} models")
+
+        if method_info["needs_base"] and not self.base_model:
+            errors.append(f"{method_info['name']} requires a base_model")
+
+        if "weight" in method_info["params"]:
+            if self.weights and len(self.weights) != n:
+                errors.append(f"Expected {n} weights, got {len(self.weights)}")
+            if self.weights and any(w < -1 or w > 2 for w in self.weights):
+                errors.append("Weights should be between -1 and 2")
+
+        if "density" in method_info["params"]:
+            if self.densities and len(self.densities) != n:
+                errors.append(f"Expected {n} densities, got {len(self.densities)}")
+            if self.densities and any(d < 0 or d > 1 for d in self.densities):
+                errors.append("Densities must be between 0 and 1")
+
+        if self.method == "slerp" and (self.slerp_t < 0 or self.slerp_t > 1):
+            errors.append("SLERP t parameter must be between 0 and 1")
+
+        if method_info.get("requires_slices") and not self.slices:
+            errors.append(f"{method_info['name']} requires slice definitions")
+
+        return errors
+
+
+def generate_yaml(config: MergeConfig) -> str:
+    """Generate mergekit-compatible YAML configuration.
+
+    Args:
+        config: MergeConfig with all parameters
+
+    Returns:
+        YAML string ready for mergekit
+    """
+    errors = config.validate()
+    if errors:
+        return "# VALIDATION ERRORS:\n" + "\n".join(f"# - {e}" for e in errors)
+
+    method_info = MERGE_METHODS[config.method]
+    doc = {}
+
+    # Passthrough uses the slices format
+    if config.method == "passthrough":
+        doc["slices"] = config.slices or _default_slices(config)
+        doc["merge_method"] = config.method
+        doc["dtype"] = config.dtype
+        return yaml.dump(doc, default_flow_style=False, sort_keys=False)
+
+    # Standard methods
+    doc["merge_method"] = config.method
+
+    if method_info["needs_base"]:
+        doc["base_model"] = config.base_model
+
+    # Models with parameters
+    if config.method == "slerp":
+        doc["models"] = [{"model": m} for m in config.models]
+        doc["parameters"] = {"t": config.slerp_t}
+    else:
+        models_list = []
+        for i, model_id in enumerate(config.models):
+            entry = {"model": model_id}
+            params = {}
+            if "weight" in method_info["params"] and config.weights:
+                params["weight"] = config.weights[i]
+            if "density" in method_info["params"] and config.densities:
+                params["density"] = config.densities[i]
+            if params:
+                entry["parameters"] = params
+            models_list.append(entry)
+        doc["models"] = models_list
+
+    # Global parameters
+    global_params = {}
+    if "int8_mask" in method_info.get("global_params", []):
+        global_params["int8_mask"] = config.int8_mask
+    if "normalize" in method_info.get("global_params", []):
+        global_params["normalize"] = config.normalize
+
+    if global_params:
+        doc["parameters"] = global_params
+
+    doc["dtype"] = config.dtype
+
+    if config.tokenizer_source:
+        doc["tokenizer_source"] = config.tokenizer_source
+
+    return yaml.dump(doc, default_flow_style=False, sort_keys=False)
+
+
+def _default_slices(config: MergeConfig) -> list[dict]:
+    """Generate default slice config for passthrough merges."""
+    slices = []
+    for model_id in config.models:
+        slices.append({
+            # NOTE: [0, 32] assumes a 32-layer model; adjust layer_range per model
+            "sources": [{"model": model_id, "layer_range": [0, 32]}]
+        })
+    return slices
+
+
+def generate_from_preset(
+    preset_name: str,
+    model_ids: list[str],
+    base_model: str = "",
+    tokenizer_source: str = "",
+    dtype: str = "bfloat16",
+) -> str:
+    """Quick config generation from a preset name.
+
+    Args:
+        preset_name: Key from the PRESETS dict
+        model_ids: List of model IDs to merge
+        base_model: Base model for methods that need one
+        tokenizer_source: Which model's tokenizer to use
+        dtype: Data type for the merge
+
+    Returns:
+        YAML string
+    """
+    preset = PRESETS.get(preset_name)
+    if not preset:
+        return f"# Unknown preset: {preset_name}\n# Available: {', '.join(PRESETS.keys())}"
+
+    weights, densities = preset.apply(model_ids)
+
+    config = MergeConfig(
+        method=preset.method,
+        models=model_ids,
+        base_model=base_model or (model_ids[0] if model_ids else ""),
+        weights=weights,
+        densities=densities,
+        tokenizer_source=tokenizer_source or base_model or (model_ids[0] if model_ids else ""),
+        dtype=dtype,
+    )
+
+    return generate_yaml(config)
+
+
+def get_method_info(method: str) -> dict:
+    """Get human-readable info about a merge method."""
+    return MERGE_METHODS.get(method, {"name": "Unknown", "description": "Unknown method"})
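For reference, calling `generate_from_preset("equal", ...)` with two Qwen models should emit YAML along these lines (key order follows the generator's insertion order; the model IDs are just examples):

```yaml
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B-Instruct
models:
- model: Qwen/Qwen2.5-7B-Instruct
  parameters:
    weight: 0.5
    density: 0.6
- model: Qwen/Qwen2.5-Coder-7B-Instruct
  parameters:
    weight: 0.5
    density: 0.6
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-7B-Instruct
```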
model_info.py ADDED
@@ -0,0 +1,307 @@
+"""HuggingFace Hub API wrapper for model discovery and info retrieval."""
+
+import time
+from dataclasses import dataclass, field
+from typing import Any, Optional
+
+import requests
+
+HF_API = "https://huggingface.co/api"
+_session = requests.Session()
+_session.headers.update({"Accept": "application/json"})
+
+# Simple in-memory cache with TTL
+_cache: dict[str, tuple[float, Any]] = {}
+CACHE_TTL = 300  # 5 minutes
+
+
+def _cached_get(url: str, token: Optional[str] = None, ttl: int = CACHE_TTL) -> dict:
+    """GET with caching and rate-limit handling."""
+    now = time.time()
+    if url in _cache and (now - _cache[url][0]) < ttl:
+        return _cache[url][1]
+
+    headers = {}
+    if token:
+        headers["Authorization"] = f"Bearer {token}"
+
+    resp = _session.get(url, headers=headers, timeout=15)
+
+    if resp.status_code == 429:
+        retry = int(resp.headers.get("Retry-After", 5))
+        time.sleep(retry)
+        resp = _session.get(url, headers=headers, timeout=15)
+
+    resp.raise_for_status()
+    data = resp.json()
+    _cache[url] = (now, data)
+    return data
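The TTL-cache pattern in `_cached_get` can be sketched standalone, with the HTTP call swapped for an injectable fetch function so it runs offline (names below are hypothetical):

```python
import time

_cache: dict[str, tuple[float, dict]] = {}

def cached_fetch(url: str, fetch, ttl: float = 300.0) -> dict:
    """Return a cached response for url if it is younger than ttl seconds."""
    now = time.time()
    hit = _cache.get(url)
    if hit is not None and (now - hit[0]) < ttl:
        return hit[1]          # fresh entry: skip the round-trip
    data = fetch(url)
    _cache[url] = (now, data)  # store alongside its timestamp
    return data

calls = []
def fake_fetch(url):
    calls.append(url)
    return {"url": url}

cached_fetch("https://example.com/a", fake_fetch)
cached_fetch("https://example.com/a", fake_fetch)  # served from cache
print(len(calls))  # 1
```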
+
+
+@dataclass
+class ModelInfo:
+    """Parsed model information from HF Hub."""
+    model_id: str
+    model_type: str = "unknown"
+    architectures: list[str] = field(default_factory=list)
+    vocab_size: int = 0
+    hidden_size: int = 0
+    intermediate_size: int = 0
+    num_hidden_layers: int = 0
+    num_attention_heads: int = 0
+    num_key_value_heads: int = 0
+    max_position_embeddings: int = 0
+    torch_dtype: str = "unknown"
+    pipeline_tag: str = ""
+    tags: list[str] = field(default_factory=list)
+    downloads: int = 0
+    likes: int = 0
+    size_bytes: int = 0
+    gated: bool = False
+    private: bool = False
+    trust_remote_code: bool = False
+    error: Optional[str] = None
+
+    @property
+    def param_estimate(self) -> str:
+        """Rough parameter count estimate based on weight-file sizes."""
+        if self.size_bytes > 0:
+            # Rough: model files in bf16 ≈ 2 bytes per param
+            params = self.size_bytes / 2
+            if params > 1e9:
+                return f"{params/1e9:.1f}B"
+            elif params > 1e6:
+                return f"{params/1e6:.0f}M"
+        return "unknown"
+
+    @property
+    def arch_signature(self) -> str:
+        """Unique signature for architecture matching."""
+        return f"{self.model_type}|{self.hidden_size}|{self.intermediate_size}"
+
+    @property
+    def display_name(self) -> str:
+        """Short display name (without org prefix)."""
+        return self.model_id.split("/")[-1] if "/" in self.model_id else self.model_id
+
+    @property
+    def ram_estimate_gb(self) -> float:
+        """Estimated RAM needed for merging (roughly 2.5x model size for bf16 merge)."""
+        if self.size_bytes > 0:
+            return round(self.size_bytes * 2.5 / (1024**3), 1)
+        return 0.0
+
+    def to_dict(self) -> dict:
+        return {
+            "model_id": self.model_id,
+            "model_type": self.model_type,
+            "architectures": self.architectures,
+            "vocab_size": self.vocab_size,
+            "hidden_size": self.hidden_size,
+            "intermediate_size": self.intermediate_size,
+            "num_hidden_layers": self.num_hidden_layers,
+            "num_attention_heads": self.num_attention_heads,
+            "torch_dtype": self.torch_dtype,
+            "pipeline_tag": self.pipeline_tag,
+            "downloads": self.downloads,
+            "likes": self.likes,
+            "param_estimate": self.param_estimate,
+            "ram_estimate_gb": self.ram_estimate_gb,
+            "gated": self.gated,
+            "private": self.private,
+        }
+
+
+def fetch_model_info(model_id: str, token: Optional[str] = None) -> ModelInfo:
+    """Fetch comprehensive model information from HF Hub.
+
+    Args:
+        model_id: Full model ID (e.g., "Qwen/Qwen2.5-Coder-7B-Instruct")
+        token: Optional HF API token for gated/private models
+
+    Returns:
+        ModelInfo dataclass with all available information
+    """
+    info = ModelInfo(model_id=model_id)
+
+    # Fetch main model info
+    try:
+        data = _cached_get(f"{HF_API}/models/{model_id}", token=token)
+    except requests.exceptions.HTTPError as e:
+        if e.response.status_code == 401:
+            info.error = "Gated or private model — HF token required"
+            info.gated = True
+        elif e.response.status_code == 404:
+            info.error = f"Model not found: {model_id}"
+        else:
+            info.error = f"API error: {e.response.status_code}"
+        return info
+    except Exception as e:
+        info.error = f"Connection error: {str(e)}"
+        return info
+
+    # Parse basic metadata
+    info.pipeline_tag = data.get("pipeline_tag", "")
+    info.tags = data.get("tags", [])
+    info.downloads = data.get("downloads", 0)
+    info.likes = data.get("likes", 0)
+    info.gated = data.get("gated", False) not in (False, None)
+    info.private = data.get("private", False)
+
+    # Parse config (architecture details)
+    config = data.get("config", {})
+    if config:
+        info.model_type = config.get("model_type", "unknown")
+        info.architectures = config.get("architectures", [])
+
+    # Fetch full config.json for detailed architecture info
+    # (the API endpoint only returns basic config fields)
+    try:
+        full_config = _cached_get(
+            f"https://huggingface.co/{model_id}/resolve/main/config.json",
+            token=token,
+        )
+        info.model_type = full_config.get("model_type", info.model_type)
+        info.architectures = full_config.get("architectures", info.architectures)
+        info.vocab_size = full_config.get("vocab_size", 0)
+        info.hidden_size = full_config.get("hidden_size", 0)
+        info.intermediate_size = full_config.get("intermediate_size", 0)
+        info.num_hidden_layers = full_config.get("num_hidden_layers", 0)
+        info.num_attention_heads = full_config.get("num_attention_heads", 0)
+        info.num_key_value_heads = full_config.get("num_key_value_heads", 0)
+        info.max_position_embeddings = full_config.get("max_position_embeddings", 0)
+        info.torch_dtype = full_config.get("torch_dtype", "unknown")
+
+        if "auto_map" in full_config:
+            info.trust_remote_code = True
+    except Exception:
+        # Fall back to basic config from API
+        if config:
+            info.vocab_size = config.get("vocab_size", 0)
+            info.hidden_size = config.get("hidden_size", 0)
+        else:
+            info.error = "Could not fetch config.json — model may need trust_remote_code=True"
+            info.trust_remote_code = True
+
+    # Estimate total model size from siblings (files)
+    siblings = data.get("siblings", [])
+    total_size = 0
+    for f in siblings:
+        fname = f.get("rfilename", "")
+        size = f.get("size", 0) or 0
+        # Count only model weight files
+        if any(fname.endswith(ext) for ext in
+               [".safetensors", ".bin", ".pt", ".pth", ".gguf"]):
+            total_size += size
+    info.size_bytes = total_size
+
+    return info
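The size-estimation loop above can be isolated into a small sketch: sum only weight-file sizes from a repo's file listing. The sibling list below is fabricated for illustration:

```python
# Sum the sizes of model weight files, ignoring docs, configs, etc.
WEIGHT_EXTS = (".safetensors", ".bin", ".pt", ".pth", ".gguf")

def weight_bytes(siblings: list[dict]) -> int:
    total = 0
    for f in siblings:
        name = f.get("rfilename", "")
        if name.endswith(WEIGHT_EXTS):
            total += f.get("size", 0) or 0
    return total

files = [
    {"rfilename": "model-00001-of-00002.safetensors", "size": 9_000_000_000},
    {"rfilename": "model-00002-of-00002.safetensors", "size": 6_000_000_000},
    {"rfilename": "README.md", "size": 4_096},
]
print(weight_bytes(files))  # 15000000000
```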
201
+
202
+
203
+ def search_models(
204
+ query: str = "",
205
+ author: str = "",
206
+ architecture: str = "",
207
+    limit: int = 20,
+    sort: str = "downloads",
+    token: Optional[str] = None,
+) -> list[dict]:
+    """Search HuggingFace Hub for models.
+
+    Args:
+        query: Search query string
+        author: Filter by author/organization
+        architecture: Filter by model_type (e.g., "llama", "qwen2")
+        limit: Max results to return
+        sort: Sort by "downloads", "likes", "created", "modified"
+        token: Optional HF API token
+
+    Returns:
+        List of dicts with basic model info
+    """
+    params = {
+        "limit": min(limit, 100),
+        "sort": sort,
+        "direction": -1,
+        "config": True,
+    }
+    if query:
+        params["search"] = query
+    if author:
+        params["author"] = author
+
+    url = f"{HF_API}/models"
+    try:
+        # urlencode handles spaces and special characters in the query
+        from urllib.parse import urlencode
+        data = _cached_get(
+            f"{url}?{urlencode(params)}",
+            token=token,
+            ttl=60,  # shorter cache for search
+        )
+    except Exception as e:
+        return [{"error": str(e)}]
+
+    results = []
+    for m in data:
+        config = m.get("config", {}) or {}
+        model_type = config.get("model_type", "")
+
+        # Filter by architecture if specified
+        if architecture and model_type.lower() != architecture.lower():
+            continue
+
+        results.append({
+            "model_id": m.get("modelId", ""),
+            "model_type": model_type,
+            "pipeline_tag": m.get("pipeline_tag", ""),
+            "downloads": m.get("downloads", 0),
+            "likes": m.get("likes", 0),
+            "tags": m.get("tags", [])[:5],
+        })
+
+    return results[:limit]
+
+
+def get_popular_base_models(architecture: str = "", token: Optional[str] = None) -> list[dict]:
+    """Get popular base models for a given architecture type.
+
+    Useful for suggesting base_model in merge configs.
+    """
+    # Common base models by architecture
+    known_bases = {
+        "llama": [
+            "meta-llama/Llama-3.1-8B-Instruct",
+            "meta-llama/Llama-3.1-70B-Instruct",
+            "meta-llama/Llama-2-7b-hf",
+        ],
+        "mistral": [
+            "mistralai/Mistral-7B-Instruct-v0.3",
+            "mistralai/Mixtral-8x7B-Instruct-v0.1",
+        ],
+        "qwen2": [
+            "Qwen/Qwen2.5-7B-Instruct",
+            "Qwen/Qwen2.5-14B-Instruct",
+            "Qwen/Qwen2.5-3B-Instruct",
+            "Qwen/Qwen2.5-72B-Instruct",
+        ],
+        "gemma2": [
+            "google/gemma-2-9b-it",
+            "google/gemma-2-27b-it",
+        ],
+        "phi3": [
+            "microsoft/Phi-3-mini-4k-instruct",
+            "microsoft/Phi-3-medium-4k-instruct",
+        ],
+    }
+
+    if architecture.lower() in known_bases:
+        return [{"model_id": m} for m in known_bases[architecture.lower()]]
+
+    # Fallback: search for popular instruct models
+    return search_models(
+        query=f"{architecture} instruct",
+        limit=5,
+        sort="downloads",
+        token=token,
+    )
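Under the hood, `search_models` just assembles a query string against the Hub's `/api/models` endpoint. A minimal self-contained sketch of that URL construction (the endpoint path is assumed from this module's `HF_API` constant; `urlencode` keeps multi-word queries like the fallback `"{architecture} instruct"` well-formed):

```python
from urllib.parse import urlencode

HUB_API = "https://huggingface.co/api"  # assumed value of this module's HF_API

def build_search_url(query: str = "", author: str = "", limit: int = 20,
                     sort: str = "downloads") -> str:
    """Mirror search_models' parameter handling, returning the request URL."""
    params = {"limit": min(limit, 100), "sort": sort, "direction": -1, "config": True}
    if query:
        params["search"] = query
    if author:
        params["author"] = author
    return f"{HUB_API}/models?{urlencode(params)}"

# The space in the query is percent/plus-encoded, so the request stays valid
print(build_search_url(query="qwen2 instruct", limit=5))
```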
notebook_generator.py ADDED
@@ -0,0 +1,484 @@
+"""Google Colab notebook generator for model merging, quantization, and deployment."""
+
+import json
+from typing import Optional
+
+from config_generator import MergeConfig, generate_yaml, MERGE_METHODS
+
+
+def _cell(source: str, cell_type: str = "code") -> dict:
+    """Create a notebook cell."""
+    cell = {
+        "cell_type": cell_type,
+        "metadata": {},
+        # keepends=True preserves the trailing newlines nbformat expects
+        "source": source.splitlines(keepends=True),
+    }
+    # Only code cells carry outputs / execution_count in nbformat v4
+    if cell_type == "code":
+        cell["outputs"] = []
+        cell["execution_count"] = None
+    return cell
+
+
+def _md(text: str) -> dict:
+    return _cell(text, "markdown")
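For reference, the cell dicts these helpers emit follow the nbformat v4 schema: only code cells carry `outputs` and `execution_count`, and `source` is a list of lines with their newlines kept. A standalone sketch of that contract (`make_cell` is a hypothetical stand-in, not the module's `_cell`):

```python
def make_cell(source: str, cell_type: str = "code") -> dict:
    """Build a minimal nbformat-v4-style cell dict."""
    cell = {
        "cell_type": cell_type,
        "metadata": {},
        # keepends=True preserves the trailing "\n" nbformat expects per line
        "source": source.splitlines(keepends=True),
    }
    if cell_type == "code":  # markdown cells must not carry these keys
        cell["outputs"] = []
        cell["execution_count"] = None
    return cell

code_cell = make_cell("print('a')\nprint('b')")
md_cell = make_cell("# Title", "markdown")
```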
+
+
+def generate_merge_notebook(
+    config: MergeConfig,
+    output_model_name: str = "",
+    hf_username: str = "",
+    include_quantize: bool = True,
+    include_deploy: bool = True,
+    quant_types: Optional[list[str]] = None,
+) -> dict:
+    """Generate a complete Colab notebook for merging models.
+
+    Args:
+        config: MergeConfig with all merge parameters
+        output_model_name: Name for the merged model (e.g., "My-Merged-7B")
+        hf_username: HF username for upload (e.g., "AIencoder")
+        include_quantize: Include GGUF quantization cells
+        include_deploy: Include HF Space deployment cells
+        quant_types: List of quantization types (default: ["Q5_K_M", "Q4_K_M"])
+
+    Returns:
+        Complete notebook dict (nbformat v4)
+    """
+    if quant_types is None:
+        quant_types = ["Q5_K_M", "Q4_K_M"]
+
+    if not output_model_name:
+        output_model_name = "ForgeKit-Merged-Model"
+
+    yaml_config = generate_yaml(config)
+    method_info = MERGE_METHODS.get(config.method, {})
+
+    # Estimate RAM for Colab runtime recommendation
+    ram_note = ""
+    if config.models:
+        models_lower = [m.lower() for m in config.models]
+        # Rough heuristic based on parameter counts in model names;
+        # check the largest sizes first so the strongest warning wins
+        if any("70b" in m for m in models_lower):
+            ram_note = "⚠️ 70B models need **A100 GPU** (Colab Pro+). This won't work on free tier."
+        elif any("14b" in m or "13b" in m for m in models_lower):
+            ram_note = "⚠️ 13-14B models need **High-RAM runtime** (48GB). Go to Runtime β†’ Change runtime β†’ High-RAM."
+        elif any("7b" in m or "8b" in m for m in models_lower):
+            ram_note = "πŸ’‘ 7-8B models work on **High-RAM CPU** runtime (free tier). No GPU needed."
+
+    cells = []
+
+    # ===== HEADER =====
+    cells.append(_md(f"""# πŸ”₯ ForgeKit β€” Model Merge Notebook
+
+**Generated by [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)**
+
+This notebook will:
+1. βœ… Install mergekit and dependencies
+2. βœ… Merge your selected models using **{method_info.get('name', config.method)}**
+3. {'βœ…' if include_quantize else '⬜'} Quantize to GGUF format
+4. {'βœ…' if include_deploy else '⬜'} Upload to HuggingFace Hub
+
+**Models being merged:**
+{chr(10).join(f'- `{m}`' for m in config.models)}
+
+**Method:** {method_info.get('name', config.method)} β€” {method_info.get('description', '')}
+
+{ram_note}
+
+---
+⚑ **Quick Start:** Click **Runtime β†’ Run all** to execute everything."""))
+
+    # ===== CELL 1: INSTALL =====
+    cells.append(_md("## 1️⃣ Install Dependencies"))
+    cells.append(_cell("""# Install mergekit and dependencies
+!pip install -q mergekit[all] huggingface_hub transformers accelerate
+!pip install -q pyyaml sentencepiece protobuf
+
+print("βœ… All dependencies installed!")"""))
+
+    # ===== CELL 2: HF LOGIN =====
+    cells.append(_md("## 2️⃣ HuggingFace Login\nRequired for downloading gated models and uploading your merge."))
+    cells.append(_cell("""from huggingface_hub import notebook_login
+notebook_login()"""))
+
+    # ===== CELL 3: CONFIG =====
+    cells.append(_md("""## 3️⃣ Merge Configuration
+
+Your merge config (auto-generated by ForgeKit). Edit the YAML below if you want to tweak weights or parameters."""))
+
+    # Escape double quotes so the YAML cannot terminate the triple-quoted string
+    escaped_yaml = yaml_config.replace('"', '\\"')
+    cells.append(_cell(f"""# === CONFIGURATION ===
+MODEL_NAME = "{output_model_name}"
+USERNAME = "{hf_username}"  # Change to your HF username
+
+YAML_CONFIG = \"\"\"
+{escaped_yaml}\"\"\"
+
+# Display the config
+print("πŸ“‹ Merge Configuration:")
+print("=" * 50)
+print(YAML_CONFIG)
+print("=" * 50)
+print(f"\\nπŸ“¦ Output: {{USERNAME}}/{{MODEL_NAME}}" if USERNAME else f"\\nπŸ“¦ Output: {{MODEL_NAME}}")"""))
+
+    # ===== CELL 4: MERGE =====
+    cells.append(_md("""## 4️⃣ Execute Merge
+
+This is the main merge step. Time depends on model sizes:
+
+| Size | Estimated Time |
+|------|---------------|
+| 1-3B | 5-15 min |
+| 7B | 15-30 min |
+| 14B | 30-60 min |"""))
+
+    cells.append(_cell("""import yaml
+import os
+import time
+
+# Write config to file
+with open("merge_config.yaml", "w") as f:
+    f.write(YAML_CONFIG)
+
+# Create output directory
+os.makedirs("merged_model", exist_ok=True)
+
+# Parse once instead of re-loading for every print
+cfg = yaml.safe_load(YAML_CONFIG)
+print("πŸ”₯ Starting merge...")
+print(f"   Method: {cfg.get('merge_method', 'unknown')}")
+print(f"   Models: {len(cfg.get('models', []))}")
+print()
+
+start = time.time()
+
+# Run mergekit
+!mergekit-yaml merge_config.yaml merged_model --copy-tokenizer --allow-crimes --lazy-unpickle
+
+elapsed = time.time() - start
+print(f"\\nβœ… Merge complete in {elapsed/60:.1f} minutes!")
+print("πŸ“ Output: ./merged_model/")
+
+# Show output size
+total = sum(
+    os.path.getsize(os.path.join("merged_model", f))
+    for f in os.listdir("merged_model")
+    if os.path.isfile(os.path.join("merged_model", f))
+)
+print(f"πŸ’Ύ Total size: {total / (1024**3):.2f} GB")"""))
+
+    # ===== CELL 5: TEST =====
+    cells.append(_md("## 5️⃣ Quick Test\nVerify the merged model loads and generates text."))
+    cells.append(_cell("""from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+print("πŸ§ͺ Loading merged model for testing...")
+
+tokenizer = AutoTokenizer.from_pretrained("merged_model", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    "merged_model",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+
+# Test prompts
+test_prompts = [
+    "Write a Python function to calculate fibonacci numbers:",
+    "Explain what machine learning is in simple terms:",
+    "What is 15 * 23 + 7?",
+]
+
+print("\\n" + "=" * 60)
+for prompt in test_prompts:
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    with torch.no_grad():
+        output = model.generate(
+            **inputs,
+            max_new_tokens=100,
+            do_sample=False,  # greedy decoding; temperature would be ignored
+        )
+    response = tokenizer.decode(output[0], skip_special_tokens=True)
+    print(f"\\nπŸ“ Prompt: {prompt}")
+    print(f"πŸ€– Response: {response[len(prompt):].strip()[:200]}...")
+    print("-" * 60)
+
+print("\\nβœ… Model test complete!")
+
+# Clean up GPU memory
+del model
+if torch.cuda.is_available():
+    torch.cuda.empty_cache()"""))
+
+    # ===== CELL 6: UPLOAD =====
+    cells.append(_md("## 6️⃣ Upload to HuggingFace Hub"))
+
+    model_card = _generate_model_card(config, output_model_name, hf_username)
+    # Escape triple quotes so the card cannot terminate the string below
+    escaped_card = model_card.replace('"""', '\\"\\"\\"')
+
+    cells.append(_cell(f"""from huggingface_hub import HfApi, create_repo
+
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+# Create repo
+try:
+    create_repo(REPO_ID, exist_ok=True, repo_type="model")
+    print(f"πŸ“¦ Repo ready: https://huggingface.co/{{REPO_ID}}")
+except Exception as e:
+    print(f"⚠️ Repo creation: {{e}}")
+
+# Write model card
+MODEL_CARD = \"\"\"{escaped_card}\"\"\"
+
+with open("merged_model/README.md", "w") as f:
+    f.write(MODEL_CARD)
+
+# Upload
+api = HfApi()
+print("⬆️ Uploading merged model (this may take a while)...")
+api.upload_folder(
+    repo_id=REPO_ID,
+    folder_path="merged_model",
+    commit_message=f"Upload {{MODEL_NAME}} merged with ForgeKit",
+)
+print(f"\\nβœ… Model uploaded!")
+print(f"πŸ”— https://huggingface.co/{{REPO_ID}}")"""))
+
+    # ===== CELL 7: QUANTIZE (optional) =====
+    if include_quantize:
+        cells.append(_md(f"""## 7️⃣ Quantize to GGUF
+
+Convert to GGUF format for use with llama.cpp, Ollama, LM Studio, etc.
+
+**Quantization types:** {', '.join(quant_types)}"""))
+
+        # Each generated line carries its own 4-space indent so the block
+        # nests correctly under the `if` in the cell template below
+        quant_cmds = "\n".join(
+            f'    !./llama.cpp/llama-quantize model-f16.gguf {output_model_name}-{q}.gguf {q}\n'
+            f'    print("βœ… {q} done: {output_model_name}-{q}.gguf")'
+            for q in quant_types
+        )
+
+        cells.append(_cell(f"""import os
+
+print("πŸ“¦ Setting up llama.cpp for GGUF conversion...")
+
+# Clone and build llama.cpp
+if not os.path.exists("llama.cpp"):
+    !git clone --depth 1 https://github.com/ggerganov/llama.cpp
+    !cd llama.cpp && make -j$(nproc) llama-quantize
+
+# Install conversion deps
+!pip install -q gguf
+
+# Convert to f16 GGUF first
+print("\\nπŸ”„ Converting to GGUF (f16)...")
+!python llama.cpp/convert_hf_to_gguf.py merged_model --outfile model-f16.gguf --outtype f16
+
+# Quantize to each target
+print("\\nπŸ—œοΈ Quantizing...")
+if os.path.exists("model-f16.gguf"):
+{quant_cmds}
+
+    # Show file sizes
+    print("\\nπŸ“Š Output sizes:")
+    for f in os.listdir("."):
+        if f.endswith(".gguf"):
+            size_gb = os.path.getsize(f) / (1024**3)
+            print(f"    {{f}}: {{size_gb:.2f}} GB")
+else:
+    print("❌ f16 conversion failed. Check errors above.")"""))
+
+        # Upload GGUFs
+        cells.append(_cell(f"""# Upload GGUF files to the same repo
+import os
+from huggingface_hub import HfApi
+
+api = HfApi()
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+gguf_files = [f for f in os.listdir(".") if f.endswith(".gguf") and f != "model-f16.gguf"]
+
+for gf in gguf_files:
+    print(f"⬆️ Uploading {{gf}}...")
+    api.upload_file(
+        path_or_fileobj=gf,
+        path_in_repo=gf,
+        repo_id=REPO_ID,
+    )
+    print("   βœ… Done")
+
+print(f"\\nπŸŽ‰ All GGUF files uploaded to https://huggingface.co/{{REPO_ID}}")"""))
+
+    # ===== CELL 8: DEPLOY (optional) =====
+    if include_deploy:
+        cells.append(_md("""## 8️⃣ Deploy to HuggingFace Space
+
+Create a Gradio chat Space running your merged model."""))
+
+        cells.append(_cell(f"""from huggingface_hub import HfApi, create_repo
+
+SPACE_ID = f"{{USERNAME}}/{{MODEL_NAME}}-chat" if USERNAME else f"{{MODEL_NAME}}-chat"
+REPO_ID = f"{{USERNAME}}/{{MODEL_NAME}}" if USERNAME else MODEL_NAME
+
+# Create Space
+try:
+    create_repo(SPACE_ID, repo_type="space", space_sdk="gradio", exist_ok=True)
+    print(f"πŸš€ Space created: https://huggingface.co/spaces/{{SPACE_ID}}")
+except Exception as e:
+    print(f"⚠️ {{e}}")
+
+# Generate app.py
+APP_CODE = '''import gradio as gr
+from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer
+import torch
+from threading import Thread
+
+MODEL_ID = "{(hf_username + '/' + output_model_name) if hf_username else output_model_name}"
+
+tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
+)
+
+def chat(message, history):
+    messages = []
+    for h in history:
+        messages.append({{"role": "user", "content": h[0]}})
+        if h[1]:
+            messages.append({{"role": "assistant", "content": h[1]}})
+    messages.append({{"role": "user", "content": message}})
+
+    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+    inputs = tokenizer(text, return_tensors="pt").to(model.device)
+    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+
+    thread = Thread(target=model.generate, kwargs={{
+        **inputs, "max_new_tokens": 512, "streamer": streamer, "do_sample": True, "temperature": 0.7
+    }})
+    thread.start()
+
+    response = ""
+    for token in streamer:
+        response += token
+        yield response
+
+demo = gr.ChatInterface(chat, title="πŸ”₯ {output_model_name}", description="Merged with ForgeKit")
+demo.launch()
+'''
+
+api = HfApi()
+
+# Upload app.py
+api.upload_file(
+    path_or_fileobj=APP_CODE.encode(),
+    path_in_repo="app.py",
+    repo_id=SPACE_ID,
+    repo_type="space",
+)
+
+# Upload requirements.txt
+reqs = "transformers\\ntorch\\naccelerate\\nsentencepiece\\nprotobuf"
+api.upload_file(
+    path_or_fileobj=reqs.encode(),
+    path_in_repo="requirements.txt",
+    repo_id=SPACE_ID,
+    repo_type="space",
+)
+
+print(f"\\nπŸŽ‰ Space deployed!")
+print(f"πŸ”— https://huggingface.co/spaces/{{SPACE_ID}}")
+print(f"\\n⏳ It may take a few minutes to build and start.")"""))
+
+    # ===== DONE =====
+    cells.append(_md(f"""## πŸŽ‰ All Done!
+
+Your merged model **{output_model_name}** is ready. Here's what was created:
+
+| Output | Link |
+|--------|------|
+| Model | `https://huggingface.co/{hf_username or 'YOUR_USERNAME'}/{output_model_name}` |
+{'| GGUF Files | Same repo (quantized versions) |' if include_quantize else ''}
+{'| Chat Space | `https://huggingface.co/spaces/' + (hf_username or 'YOUR_USERNAME') + '/' + output_model_name + '-chat` |' if include_deploy else ''}
+
+---
+
+**Made with [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)** β€” Forge your perfect AI model πŸ”₯"""))
+
+    # ===== BUILD NOTEBOOK =====
+    notebook = {
+        "nbformat": 4,
+        "nbformat_minor": 5,
+        "metadata": {
+            "kernelspec": {
+                "display_name": "Python 3",
+                "language": "python",
+                "name": "python3",
+            },
+            "language_info": {"name": "python", "version": "3.10.0"},
+            "colab": {
+                "provenance": [],
+                "gpuType": "T4",
+            },
+            "accelerator": "GPU",
+        },
+        "cells": cells,
+    }
+
+    return notebook
+
+
+def _generate_model_card(config: MergeConfig, name: str, username: str) -> str:
+    """Generate a model card README.md for the merged model."""
+    method_info = MERGE_METHODS.get(config.method, {})
+    models_list = "\n".join(f"- [{m}](https://huggingface.co/{m})" for m in config.models)
+    base_link = f"[{config.base_model}](https://huggingface.co/{config.base_model})" if config.base_model else "N/A"
+    # Parenthesized so an explicit base_model wins even when models is empty
+    card_base = config.base_model or (config.models[0] if config.models else "")
+    repo = f"{username}/{name}" if username else name
+
+    return f"""---
+tags:
+- merge
+- mergekit
+- forgekit
+base_model: {card_base}
+license: apache-2.0
+---
+
+# {name}
+
+This model was created using **[ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)** β€” an open-source model merging platform.
+
+## Merge Details
+
+| Parameter | Value |
+|-----------|-------|
+| **Method** | {method_info.get('name', config.method)} |
+| **Base Model** | {base_link} |
+| **dtype** | {config.dtype} |
+
+### Source Models
+
+{models_list}
+
+### Configuration
+
+```yaml
+{generate_yaml(config)}
+```
+
+## Usage
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("{repo}")
+model = AutoModelForCausalLM.from_pretrained("{repo}")
+```
+
+---
+
+*Made with [ForgeKit](https://huggingface.co/spaces/AIencoder/ForgeKit)* πŸ”₯
+"""
+
+
+def notebook_to_json(notebook: dict) -> str:
+    """Serialize notebook to JSON string."""
+    return json.dumps(notebook, indent=2, ensure_ascii=False)
+
+
+def save_notebook(notebook: dict, path: str):
+    """Save notebook to .ipynb file."""
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(notebook, f, indent=2, ensure_ascii=False)
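The two serializers above amount to a plain JSON round trip, so a generated notebook can be sanity-checked without Jupyter installed. A small self-contained sketch (`to_json` / `save` are stand-ins mirroring `notebook_to_json` / `save_notebook`; `ensure_ascii=False` keeps the emoji headings readable in the raw `.ipynb`):

```python
import json
import tempfile
from pathlib import Path

def to_json(nb: dict) -> str:
    # Same serialization settings as notebook_to_json
    return json.dumps(nb, indent=2, ensure_ascii=False)

def save(nb: dict, path: str) -> None:
    Path(path).write_text(to_json(nb), encoding="utf-8")

nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"language_info": {"name": "python"}},
    "cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# πŸ”₯ ForgeKit"]}],
}
with tempfile.TemporaryDirectory() as d:
    path = f"{d}/merge.ipynb"
    save(nb, path)
    loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    assert loaded == nb  # lossless round trip
```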
requirements.txt ADDED
@@ -0,0 +1,5 @@
+gradio>=4.0.0
+huggingface_hub>=0.20.0
+requests>=2.28.0
+pyyaml>=6.0
+nbformat>=5.7.0