Commit `68ada46` by Starred09 (parent `05e5a00`): Clean model card for public Graphite 1.0 4B release. Changed file: `README.md`.
- logic and factual precision
- bilingual Russian / English instruction following

## What This Repository Contains

This repo contains a **LoRA adapter**, not merged base weights.
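Since only the adapter is shipped, it has to be attached to the base model at load time. A minimal sketch of the standard `peft` attach pattern (the function name, the `device_map` choice, and the deferred imports are illustrative assumptions, not code from this repo; pass this repo's id as `adapter_id`):

```python
def load_graphite(adapter_id: str, base_id: str = "Qwen/Qwen3.5-4B-Base"):
    """Attach the LoRA adapter in this repo to the frozen base model.

    Imports are deferred so the function can be defined without
    transformers/peft installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)  # LoRA weights applied on top
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling this downloads the base weights, so it is left uninvoked here; use `model.merge_and_unload()` afterwards only if you need standalone merged weights.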
## Training Lineage

This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.

- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**

Notebook lineage used for this stream:
The training data for this first public stream comes from the mixed dataset:

- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
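The duplicate count above comes from an exact-match filter on `(user, assistant)` pairs. A minimal sketch of that kind of pass (the record schema is a hypothetical illustration, not the mix builder's actual format):

```python
def dedup_pairs(records):
    """Keep only the first occurrence of each exact (user, assistant) pair."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec["user"], rec["assistant"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept

mix = [
    {"user": "Fix the bug", "assistant": "Patch A"},
    {"user": "Fix the bug", "assistant": "Patch A"},  # exact duplicate pair: dropped
    {"user": "Fix the bug", "assistant": "Patch B"},  # same prompt, new answer: kept
]
clean = dedup_pairs(mix)
print(len(mix) - len(clean))  # number of exact duplicates removed -> 1
```

Note that only byte-identical pairs are dropped; near-duplicates with a different answer survive the filter.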
### Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |
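The per-dataset figures in the source dataset table are internally consistent with the mix-level totals quoted earlier (37,008 examples, approximately 6,885,960 tokens). A quick check:

```python
# Example and approximate-token counts, row by row, from the source dataset table.
examples = [3000, 3200, 2960, 2500, 2500, 2400, 1600, 1582, 1200, 1200,
            1193, 1131, 1100, 1000, 996, 955, 904, 893, 867, 700,
            700, 705, 606, 600, 500, 440, 434, 400, 388, 230, 100, 24]
tokens = [804280, 363139, 142787, 555785, 440355, 208335, 206794, 179009,
          345084, 238907, 218845, 139188, 308380, 310540, 69260, 145544,
          89364, 159086, 119740, 89641, 68256, 179146, 121591, 125125,
          55840, 149852, 183921, 34260, 254951, 218922, 9079, 350954]

print(sum(examples))  # 37008: matches the mixed-dataset example count
print(sum(tokens))    # 6885960: matches the approximate token volume
```

Both column sums match the totals exactly, so no source dataset is missing from the table.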
## Representative Training Examples
## Training Recipe

The public run used:

- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
|
@@ -260,11 +264,6 @@ The public run in this repo used:
|
|
| 260 |
- public run total steps: **2256**
|
| 261 |
- logging / eval / save cadence: **50 / 125 / 250**
|
| 262 |
|
| 263 |
-
Best public checkpoint recorded in `trainer_state.json`:
|
| 264 |
-
|
| 265 |
-
- best checkpoint: `checkpoint-2250`
|
| 266 |
-
- best metric: `0.18876151740550995`
|
| 267 |
-
|
| 268 |
## Prompt Style
|
| 269 |
|
| 270 |
This adapter was trained on a simple, explicit prompt layout:
|
|
Graphite 1.0 4B is intended for:

- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

## License
- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers