# qwen14b-code-trainer-v6-aggressive-full3
LoRA adapter for Qwen/Qwen2.5-Coder-14B-Instruct, trained for 3 epochs over an 8K-row slice of the code-trainer-offsec-dataset as the Phase 4B follow-up to the Phase 4A aggressive sweep winner.

Status: kept on the Hub for transparency, but not the canonical Phase 5 conversion target. The 1-epoch / full-data sibling qwen14b-code-trainer-v6-aggressive outperformed this adapter on the full validation split (eval_loss 0.4724 vs 0.5126); see the comparison below.
## Intended use
Same as the Phase 4A sibling: instruction-following code generation across the 8 dataset languages. Use this adapter only if you specifically want to study the 3-epoch / sliced-data behaviour.
## Training data
- Dataset: cmndcntrlcyber/code-trainer-offsec-dataset (revision `main`, text-only chat format).
- Splits seen: 8,000 train rows (`--train-limit 8000` slice of 26,126), 500 val rows (`--val-limit 500` slice of 3,265).
- Total samples seen: 24,000 (3 epochs × 8K). The Phase 4A run saw 26,126 samples in 1 epoch.
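The compute-budget comparison above can be sketched as simple arithmetic; the numbers come from this card, and the optimizer-step count is derived from them rather than reported:

```python
# Samples-seen arithmetic for the two runs (derived from the figures above).
train_slice = 8_000                     # --train-limit 8000
epochs = 3
samples_seen_4b = train_slice * epochs  # Phase 4B (this adapter)
samples_seen_4a = 26_126 * 1            # Phase 4A: full split, 1 epoch

effective_batch = 16                    # 4 x 4 per the training-procedure table
optimizer_steps_4b = samples_seen_4b // effective_batch

print(samples_seen_4b, samples_seen_4a, optimizer_steps_4b)  # 24000 26126 1500
```

So the two runs saw nearly the same number of samples; the difference is uniqueness, not volume.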
## Training procedure
| Knob | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-14B-Instruct |
| Adapter | LoRA (PEFT), r = 64, alpha = 128, dropout = 0.05 |
| Learning rate | 3e-4 (cosine decay, warmup ratio 0.03) |
| Batch size × grad accum | 4 × 4 (effective batch = 16) |
| Epochs | 3 |
| Sequence length | 2,048 |
| Precision | bfloat16 + gradient checkpointing |
| Hardware | HF Skills a100-large |
| Frameworks | transformers, peft, trl (SFTTrainer) |
| Job runtime | 4 h 53 m (COMPLETED) |
| HF Job | 69f8188e9d85bec4d76f1c5e |
## Evaluation: Phase 4A vs Phase 4B (full val, 3,265 rows)
| Metric | Phase 4A aggressive | Phase 4B aggressive-full3 (this) |
|---|---|---|
| eval_loss (full val) | 0.4724 | 0.5126 |
| eval_loss (500-row slice during training) | n/a | 0.5102 |
| Total samples seen | 26,126 | 24,000 |
| Epochs | 1 | 3 |
The 500-row training-time eval already pointed in this direction; the full-val eval confirmed it. Conclusion: for this dataset and this base model, more unique examples beat more passes over a slice. The 3-epoch run produced a measurably weaker adapter despite a roughly comparable compute budget.
See docs/sweep/phase4b-summary.md
for the apples-to-apples writeup.
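Assuming `eval_loss` is the usual mean per-token cross-entropy in nats (the SFTTrainer convention), the gap corresponds to a per-token perplexity of roughly 1.60 vs 1.67:

```python
import math

# Per-token perplexity implied by the reported eval losses, assuming the
# loss is mean cross-entropy in nats.
ppl_4a = math.exp(0.4724)  # Phase 4A aggressive
ppl_4b = math.exp(0.5126)  # Phase 4B aggressive-full3 (this adapter)
print(f"{ppl_4a:.3f} {ppl_4b:.3f}")  # 1.604 1.670
```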
## Limitations
- Worse than qwen14b-code-trainer-v6-aggressive on the same val split: use that adapter unless you need to reproduce this experiment.
- All other limitations from the Phase 4A card carry over (no safety tuning, multilingual eval approximate, adapter-only).
## How to use
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-Coder-14B-Instruct"
adapter_id = "cmndcntrlcyber/qwen14b-code-trainer-v6-aggressive-full3"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```
## Reproducibility
- Code: github.com/cmndcntrlcyber/code-trainer-offsec-pipeline
- Launch command:

```shell
python -m src.phase4_qwen_finetuning.scripts.launch_full_training \
  --config src/config/v6_config.yaml --best-config aggressive \
  --train-limit 8000 --val-limit 500 --wait
```

- Cost: ~$15.60 on a100-large (4 h 53 m).
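For budgeting similar runs, the reported cost and runtime imply an hourly rate of roughly $3.19 for the a100-large flavor. This is derived from this card's figures, not an official price:

```python
# Implied hourly rate, derived from ~$15.60 over 4 h 53 m (an estimate,
# not a published price).
runtime_hours = 4 + 53 / 60
hourly_rate = 15.60 / runtime_hours
print(f"${hourly_rate:.2f}/h")  # $3.19/h
```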