Qwen3 4B Thinking 2507 Heretic CodeFeedback — OpenCodeInstruct Learning LoRA
This repository contains an experimental LoRA adapter trained on top of:
JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback
This adapter was trained as an additional OpenCodeInstruct continuation experiment.
It is not recommended as the main agentic coding version.
The main observation from testing is that this adapter appears to be more useful for learning, explanation, reasoning through code problems, and understanding programming tasks, but it became worse at strict “return only executable code” benchmark tasks.
Intended purpose
This LoRA is kept and published as an experimental branch for:
- code explanation
- learning-oriented coding assistance
- understanding programming problems
- step-by-step reasoning around code
- comparing OpenCodeInstruct-style behavior against a stricter code-output model
It is not ideal for:
- agentic coding
- test-driven code generation
- benchmark-style exact function output
- tools that require the model to return only executable code
- coding agents that must avoid prose/explanation unless asked
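If this adapter is still used with a tool that expects bare code, one possible workaround is to post-process the response. The sketch below is a hypothetical helper (`extract_code` is not part of this repository). It assumes that any reasoning ends with a closing `</think>` tag and that the final answer sits in the last fenced code block. It does not make the adapter better at strict code output; it only strips surrounding prose.

```python
import re

# Build the fence marker programmatically so this snippet can itself
# live inside a fenced block.
FENCE = "`" * 3

# Captures the body of a fenced block, with or without a language tag.
_FENCE_RE = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Return only the code portion of a model response (hypothetical helper)."""
    # Assumption: reasoning is terminated by a closing </think> tag,
    # so everything after the first such tag is the visible answer.
    if "</think>" in response:
        response = response.split("</think>", 1)[1]

    blocks = _FENCE_RE.findall(response)
    if blocks:
        # Assumption: the last fenced block holds the final answer.
        return blocks[-1].strip()

    # No fence found: fall back to the remaining raw text.
    return response.strip()
```

A response with no fenced block is returned unchanged (apart from whitespace), so the helper is safe to apply to output that is already bare code.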
Why this is not the main version
A small local before/after Python code benchmark showed that this OpenCodeInstruct continuation reduced exact-code benchmark performance.
| Model | Adapter | Passed | Pass rate | Avg tokens/s |
|---|---|---|---|---|
| Before | heretic_F_lora_python5000_codefeedback5000 | 9/10 | 90.00% | 8.38 |
| After | SAFE_OPENCODE_5000_1024_20260607_153327 | 6/10 | 60.00% | 8.41 |
Delta:
| Metric | Value |
|---|---|
| Passes | -3 |
| Pass rate | -30.00 percentage points |
| Avg tokens/s | +0.03 |
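The delta figures can be reproduced from the before/after rows with a few lines of arithmetic. This is only a worked check of the reported numbers; the variable names are illustrative.

```python
def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage."""
    return 100.0 * passed / total


# Figures taken from the before/after benchmark rows.
before_passed, after_passed, total = 9, 6, 10
before_tps, after_tps = 8.38, 8.41

before_rate = pass_rate(before_passed, total)
after_rate = pass_rate(after_passed, total)

delta_passes = after_passed - before_passed
delta_points = after_rate - before_rate  # difference in percentage points
relative_change = 100.0 * delta_points / before_rate  # relative to the "before" rate
delta_tps = round(after_tps - before_tps, 2)

print(f"Passes: {delta_passes:+d}")
print(f"Pass rate: {delta_points:+.2f} points ({relative_change:+.1f}% relative)")
print(f"Avg tokens/s: {delta_tps:+.2f}")
```

With only 10 tasks, each task is worth 10 percentage points, so the 30-point drop corresponds to three tasks and should be read as a small-sample result.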
The post-training adapter was worse on strict executable-code tasks, especially when the expected output was a compact Python function or class.
However, this does not mean the adapter is useless. It likely shifted behavior toward a more explanatory, learning-oriented style. That can be useful for users who want to understand code, reason through tasks, or receive more guided programming explanations.
Training configuration
| Item | Value |
|---|---|
| Base model | JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback |
| Adapter input | heretic_F_lora_python5000_codefeedback5000 |
| Dataset | nvidia/OpenCodeInstruct |
| Samples used | 5,000 |
| Sequence length | 1024 |
| Epochs | 1 |
| Learning rate | 5e-6 |
| Training method | QLoRA / LoRA |
| Quantized loading during training | 4-bit NF4 |
Training result
| Metric | Value |
|---|---|
| Train runtime | 6258 seconds |
| Runtime | 1h 44m 18s |
| Samples/second | 0.799 |
| Steps/second | 0.1 |
| Final train loss | 0.3913 |
| First logged loss | 0.6957 |
| Last logged loss | 0.3623 |
| Minimum logged loss | 0.3441 |
The training run completed successfully after reducing sequence length and using a more conservative GPU configuration.
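As a consistency check on the table, the runtime and throughput rows can be derived from the raw figures. The sketch assumes that one epoch over 5,000 samples means 5,000 samples were processed. The samples-per-step value is only a rough estimate, because the reported steps/second (0.1) is a rounded number; it is not a confirmed batch size.

```python
# Figures taken from the training configuration and result tables.
runtime_s = 6258
samples = 5000          # assumption: 1 epoch over 5,000 samples
steps_per_s = 0.1       # rounded value as reported

# Convert seconds to hours, minutes, seconds.
hours, remainder = divmod(runtime_s, 3600)
minutes, seconds = divmod(remainder, 60)

samples_per_s = samples / runtime_s

# Rough estimate only: ratio of sample throughput to step throughput.
est_samples_per_step = round(samples_per_s / steps_per_s)

print(f"Runtime: {hours}h {minutes}m {seconds}s")
print(f"Samples/second: {samples_per_s:.3f}")
print(f"Estimated samples per optimizer step: ~{est_samples_per_step}")
```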
Benchmark files
The local benchmark artifacts are included in this repository under:
benchmark/
Files:
benchmark/before_summary.md
benchmark/after_summary.md
benchmark/COMPARISON.md
benchmark/comparison.json
benchmark/before_results.jsonl
benchmark/after_results.jsonl
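The `.jsonl` result files can be summarized with a short script. The record schema is not documented here, so the sketch below assumes each line is a JSON object with a boolean `passed` field; that field name is hypothetical and should be adjusted to whatever the files actually contain.

```python
import json
from pathlib import Path


def summarize_lines(lines, passed_key="passed"):
    """Count passed and total records from an iterable of JSONL lines.

    Assumption: each record is a JSON object and `passed_key` holds a
    truthy value when the task passed. The key name is hypothetical.
    """
    passed = total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        total += 1
        passed += bool(record.get(passed_key, False))
    return passed, total


def summarize_file(path, passed_key="passed"):
    """Same as summarize_lines, reading from a JSONL file on disk."""
    with Path(path).open(encoding="utf-8") as fh:
        return summarize_lines(fh, passed_key)
```

For example, `summarize_file("benchmark/before_results.jsonl")` would return a `(passed, total)` tuple if the files follow the assumed schema.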
Recommended usage
Use this adapter when you want a model that may be more comfortable explaining code and reasoning through programming tasks.
For stricter agentic coding or benchmark-style executable output, prefer the original merged CodeFeedback model:
JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback
Loading example
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback"
adapter = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-OpenCodeInstruct-Learning-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# Load the base model in 4-bit NF4, matching the quantization used during training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter)
model.eval()
Important notes
This is an experimental LoRA adapter.
It should not be treated as a universal improvement over the previous CodeFeedback model. It is published for transparency, comparison, and reproducibility.
The benchmark results suggest that it is worse for strict agentic coding, but potentially useful for learning-oriented coding assistance.
Model tree for JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-OpenCodeInstruct-Learning-LoRA
Base model
Qwen/Qwen3-4B-Thinking-2507