Assignment 3 Forward LoRA Adapter

This repository contains the Part 4 final instruction-tuning adapter for Assignment 3, built on top of Qwen/Qwen3-1.7B with LoRA. The model is intended as a compact, submission-ready artifact for the classroom pipeline based on self-generated and self-curated instruction-response pairs.

Model Details

  • Developed by: sunming-giegie
  • Model type: Causal language model with LoRA adapter
  • Base model: Qwen/Qwen3-1.7B
  • Language: English
  • License: Apache-2.0 for the base model; adapter release follows course-project use
  • Finetuning method: PEFT LoRA

Intended Use

This adapter is meant for:

  • course assignment demonstration
  • lightweight instruction-following experiments
  • reproducing the final SFT stage of the assignment pipeline

It is not intended as a production-ready general assistant.

Limitations

This model was trained on a small curated dataset and remains noticeably sensitive to prompt style and topic domain. In internal inspection during the assignment, the model was more reliable on concise factual or technical questions than on creative, open-ended, or multi-step reasoning prompts.

Known limitations:

  • may generate generic or over-explanatory answers
  • may fail on broad open-ended prompts
  • may still underperform the base model on difficult reasoning tasks
  • evaluation here is qualitative and assignment-oriented, not benchmark-complete

How to Use

Load the base model first, then attach this adapter with PEFT.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen3-1.7B"
adapter_path = "sunming-giegie/assignment3-part4-qwen3-1.7b-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_path)

Training Data

The training data comes from the Part 3 curated dataset:

  1. Sample 150 single-turn LIMA responses.
  2. Use a backward model to infer instructions from responses.
  3. Score each generated (instruction, response) pair with Qwen/Qwen3-1.7B.
  4. Keep higher-quality pairs for forward supervised fine-tuning.
  5. Build a compact subset emphasizing cleaner, shorter, and more technical examples.

The Part 3 curated dataset repo is expected to live alongside this model release.

Training Procedure

This adapter corresponds to the compact forward model variant selected as the most submission-ready version after multiple remediation rounds.

Hyperparameters

  • Precision: bf16
  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj
  • Learning rate: 5e-5
  • Epochs: 4
  • Per-device batch size: 2
  • Gradient accumulation: 8
  • Max sequence length: 1536

Prompting

Training and inference used a direct-answer prompt style with a /no_think control token and explicit constraints to avoid chain-of-thought style output. Additional response cleaning and token suppression were added during the assignment to reduce stray special-token leakage.

Evaluation

Evaluation for the assignment was primarily example-based and qualitative:

  • generate held-out sample responses
  • compare fluency, relevance, and format cleanliness
  • prefer the checkpoint that minimizes prompt leakage and malformed outputs

Among the explored variants, the compact adapter was selected for submission because it produced the most stable direct answers on short factual and technical prompts.

Files

  • adapter_model.safetensors: LoRA adapter weights
  • adapter_config.json: LoRA configuration
  • tokenizer files copied for easier loading

Framework Versions

  • PEFT 0.18.1
Downloads last month
1
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for sunming-giegie/assignment3-part4-qwen3-1.7b-lora

Finetuned
Qwen/Qwen3-1.7B
Adapter
(511)
this model