NNEngine/qwen2-0.5b-python-lora


NNEngine committed 9 days ago

Commit 7229188 · verified · 1 Parent(s): 6b7c84e

Update README.md

Files changed (1)

README.md +153 -3


@@ -1,3 +1,153 @@
----
-license: mit
----

+---
+license: mit
+---
+# Model Card
+# Qwen2-0.5B-Python-SFT (LoRA)
+## Overview
+This model is a supervised fine-tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.
+Fine-tuning used QLoRA (4-bit quantization plus LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.
+This repository contains **LoRA adapter weights** only, not the full base model.
+## Base Model
+* Base: `Qwen/Qwen2-0.5B`
+* Architecture: Decoder-only Transformer
+* Parameters: 0.5B
+* License: Refer to the original Qwen license
+The base model must be loaded separately.
+## Training Dataset
+* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
+* Size: ~18,000 instruction-output pairs
+* Format: Alpaca-style instruction → response
+* Domain: Python programming tasks
+Each training sample followed this template:
+```
+Below is an instruction that describes a task.
+Write a response that appropriately completes the request.
+### Instruction:
+...
+### Response:
+...
+```
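Reviewer sketch (not part of the commit): the template above could be rendered by a small helper like the one below. The name `format_alpaca_sample` is hypothetical, and the exact blank-line spacing is an assumption, because the diff view does not show how the training script joined the sections.

```python
# ASSUMPTION: single newlines between sections; the original training
# script may have used different whitespace.
PREAMBLE = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
)


def format_alpaca_sample(instruction: str, response: str = "") -> str:
    """Render one sample in the Alpaca-style template shown above.

    With an empty response the string ends right after the
    "### Response:" header, which is the shape used for prompting.
    """
    return (
        f"{PREAMBLE}"
        f"### Instruction:\n{instruction.strip()}\n"
        f"### Response:\n{response.strip()}"
    )


if __name__ == "__main__":
    print(format_alpaca_sample("Reverse a string.", "def rev(s):\n    return s[::-1]"))
```

Calling the helper with only an instruction yields a prompt for generation; passing both fields yields a full training sample.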
+## Training Details
+* Method: QLoRA (4-bit)
+* Quantization: NF4
+* Compute dtype: FP16
+* Optimizer: paged_adamw_8bit
+* Sequence length: 384–512
+* Epochs: 1
+* Final training loss: ~0.2–0.3
+* Hardware: Tesla P100 (16GB)
+* Frameworks:
+  * transformers
+  * peft
+  * trl
+  * bitsandbytes
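Reviewer sketch (not part of the commit): the settings listed above map roughly onto the keyword arguments below. The dictionaries are plain Python so the snippet runs without any libraries; the comments name the class each one would be unpacked into. The LoRA rank, alpha, dropout, and target modules are not stated in the card, so those values are placeholders, and `fp16=True` is an assumption inferred from the FP16 compute dtype.

```python
# Intended for transformers.BitsAndBytesConfig(**QUANT_KWARGS).
# Values taken from the card: 4-bit, NF4, FP16 compute dtype.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}

# Intended for transformers.TrainingArguments(output_dir=..., **TRAIN_KWARGS).
# Optimizer and epoch count come from the card; fp16 is an ASSUMPTION.
TRAIN_KWARGS = {
    "optim": "paged_adamw_8bit",
    "num_train_epochs": 1,
    "fp16": True,
}

# Intended for peft.LoraConfig(**LORA_KWARGS).
# ASSUMPTION: every value except task_type is illustrative, not from the card.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
    "task_type": "CAUSAL_LM",
}

if __name__ == "__main__":
    for name, cfg in [("quant", QUANT_KWARGS), ("train", TRAIN_KWARGS), ("lora", LORA_KWARGS)]:
        print(name, cfg)
```

The sequence length (384–512) is left out because the card does not say how it was applied.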
+## Intended Use
+This model is designed for:
+* Python code generation
+* Simple algorithm implementation
+* Educational coding tasks
+* Instruction-following code responses
+It performs best when prompted in the Alpaca-style format:
+```
+Below is an instruction that describes a task.
+### Instruction:
+Write a Python function to reverse a linked list.
+### Response:
+```
+## How to Use
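+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+# Load the base model first; this repository holds only the LoRA adapter.
+base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
+# Replace "your-username" with the account that hosts the adapter.
+tokenizer = AutoTokenizer.from_pretrained("your-username/qwen2-0.5b-python-lora")
+# Attach the adapter weights on top of the base model.
+model = PeftModel.from_pretrained(base_model, "your-username/qwen2-0.5b-python-lora")
+model.eval()  # inference mode: disables dropout
+```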
+Example generation:
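+```python
+prompt = """Below is an instruction that describes a task.
+### Instruction:
+Write a Python function to check if a number is prime.
+### Response:
+"""
+# Reuses `tokenizer`, `model`, and `torch` from the loading snippet above.
+inputs = tokenizer(prompt, return_tensors="pt")
+with torch.no_grad():
+    output_ids = model.generate(**inputs, max_new_tokens=256)  # 256 is an illustrative cap
+print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
+```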
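Reviewer sketch (not part of the commit): text decoded from a causal language model normally repeats the prompt before the answer, so the response usually has to be cut out afterwards. The helper below is one heuristic way to do that for this prompt format; `extract_response` is a hypothetical name, and trimming at a follow-on `### Instruction:` header is an assumption about how a small model might run on.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker.

    If the model keeps going and starts a new "### Instruction:" section,
    everything from that header onward is dropped. When no response marker
    is present, the input is returned stripped but otherwise unchanged.
    """
    _, sep, tail = generated_text.partition(RESPONSE_MARKER)
    if not sep:
        return generated_text.strip()
    return tail.split(INSTRUCTION_MARKER, 1)[0].strip()


if __name__ == "__main__":
    sample = (
        "Below is an instruction that describes a task.\n"
        "### Instruction:\nWrite a Python function to add two numbers.\n"
        "### Response:\ndef add(a, b):\n    return a + b\n"
    )
    print(extract_response(sample))
```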
+## Observed Behavior
+The model demonstrates:
+* Improved Python code structuring
+* Better adherence to instruction-response formatting
+* Faster convergence for common programming tasks
+Observed limitations:
+* Small model size (0.5B) limits reasoning depth
+* May hallucinate under high-temperature decoding
+* Works best with explicit language specification ("Write a Python function")
+## Limitations
+* Not suitable for production-critical systems
+* Limited mathematical and multi-step reasoning capability
+* Sensitive to prompt formatting
+* Performance depends heavily on decoding strategy
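Reviewer sketch (not part of the commit): the sensitivity to decoding strategy noted above can be illustrated with temperature scaling. Dividing logits by a temperature above 1 flattens the next-token distribution, so unlikely tokens are sampled more often. The logits here are made-up numbers, not model outputs.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [4.0, 2.0, 1.0]  # hypothetical scores for three candidate tokens
    for t in (0.2, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

With `transformers`, greedy decoding (`do_sample=False`) avoids sampling entirely, and a low `temperature` with `do_sample=True` keeps sampling close to the most likely tokens.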
+## Future Improvements
+Potential enhancements:
+* Mask instruction tokens during SFT
+* Increase model size (1.5B+)
+* Train on more diverse programming datasets
+* Evaluate with pass@k benchmarks
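Reviewer sketch (not part of the commit): two of the items above are small enough to illustrate directly. `mask_prompt_labels` shows the usual way to exclude instruction tokens from the loss by setting their labels to -100, the default ignore index of PyTorch's cross-entropy loss. `pass_at_k` is the unbiased estimator commonly used with code benchmarks, 1 - C(n-c, k) / C(n, k). Both function names are hypothetical and the token IDs are made up.

```python
from math import comb

IGNORE_INDEX = -100  # labels with this value are skipped by the loss


def mask_prompt_labels(prompt_ids, response_ids):
    """Build (input_ids, labels) so only response tokens contribute to loss."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def pass_at_k(n, c, k):
    """Estimate pass@k from n generated samples of which c passed the tests."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when fewer than k samples failed.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    ids, labels = mask_prompt_labels([11, 12, 13], [21, 22])  # made-up token IDs
    print(ids, labels)
    print(pass_at_k(n=10, c=3, k=1))
```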
+## Acknowledgements
+* Base model by Qwen team
+* Dataset by `iamtarun`