Qwen3-4B Code SFT - No-Think Baseline (LoRA Merged)

LoRA supervised fine-tuning of Qwen/Qwen3-4B-Base, with adapters merged into full weights for direct inference.

This repo is LoRA-based SFT (rank 64), not full-parameter fine-tuning.

Model Details

  • Base model: Qwen/Qwen3-4B-Base
  • Fine-tuning: LoRA SFT (rank 64, alpha 128), merged into full weights
  • Mode: No-think (enable_thinking=false)
  • Training cutoff length: 8192 tokens

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "modrill/qwen3-4b-nothink-baseline-lora-sft"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

Inference Tips

  • Set enable_thinking=false in chat template
  • Recommended max_tokens: 8192

License

Apache 2.0, consistent with the Qwen3 base model license.

Downloads last month
35
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for modrill/qwen3-4b-nothink-baseline-lora-sft

Adapter
(61)
this model