NNEngine committed · Commit 7229188 (verified) · Parent: 6b7c84e

Update README.md

Files changed (1): README.md (+153 −3)
---
license: mit
---

# Qwen2-0.5B-Python-SFT (LoRA)

## Overview

This model is a Supervised Fine-Tuned (SFT) version of **Qwen/Qwen2-0.5B**, adapted for Python instruction-following tasks.

Fine-tuning was performed with QLoRA (4-bit quantization plus LoRA adapters) on a curated Python instruction dataset to improve structured code generation and instruction alignment.

This repository contains **LoRA adapter weights**, not the full base model.

## Base Model

* Base: `Qwen/Qwen2-0.5B`
* Architecture: Decoder-only Transformer
* Parameters: 0.5B
* License: Refer to the original Qwen license

The base model must be loaded separately.

## Training Dataset

* Dataset: `iamtarun/python_code_instructions_18k_alpaca`
* Size: ~18,000 instruction-output pairs
* Format: Alpaca-style instruction → response
* Domain: Python programming tasks

Each training sample followed this template:

```
Below is an instruction that describes a task.
Write a response that appropriately completes the request.

### Instruction:
...

### Response:
...
```

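For illustration, the template above can be rendered with a small helper (the function name `format_alpaca` is ours, not part of the actual training code):

```python
def format_alpaca(instruction: str, response: str = "") -> str:
    """Render one sample in the Alpaca-style SFT template shown above.

    At inference time, leave `response` empty so the model generates
    the completion after the "### Response:" header.
    """
    return (
        "Below is an instruction that describes a task.\n"
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )
```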
## Training Details

* Method: QLoRA (4-bit)
* Quantization: NF4
* Compute dtype: FP16
* Optimizer: paged_adamw_8bit
* Sequence length: 384–512 tokens
* Epochs: 1
* Final training loss: ~0.2–0.3
* Hardware: Tesla P100 (16 GB)
* Frameworks:
  * transformers
  * peft
  * trl
  * bitsandbytes

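The quantization settings above translate into a configuration roughly like the following sketch. The card does not state the LoRA rank, alpha, dropout, or target modules, so those values are illustrative placeholders, not the actual run's configuration:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, matching the details listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter settings. Rank/alpha/dropout/target modules are NOT stated in
# this card; the values below are common defaults used purely for illustration.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

`bnb_config` would be passed as `quantization_config` when loading the base model, and the `paged_adamw_8bit` optimizer is selected via `optim="paged_adamw_8bit"` in `TrainingArguments`.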
## Intended Use

This model is designed for:

* Python code generation
* Simple algorithm implementation
* Educational coding tasks
* Instruction-following code responses

It performs best when prompted in Alpaca-style format:

```
Below is an instruction that describes a task.

### Instruction:
Write a Python function to reverse a linked list.

### Response:
```

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model (FP16 matches the training compute dtype),
# then attach the LoRA adapter weights on top.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-0.5B", torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained("your-username/qwen2-0.5b-python-lora")

model = PeftModel.from_pretrained(base_model, "your-username/qwen2-0.5b-python-lora")
model.eval()
```

Example generation:

```python
prompt = """Below is an instruction that describes a task.

### Instruction:
Write a Python function to check if a number is prime.

### Response:
"""

# Tokenize the prompt and let the model complete the response section.
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Observed Behavior

The model demonstrates:

* Improved Python code structuring
* Better adherence to instruction-response formatting
* Faster convergence on common programming tasks during training

Limitations:

* Small model size (0.5B) limits reasoning depth
* May hallucinate under high-temperature decoding
* Works best with an explicit language specification ("Write a Python function")

## Limitations

* Not suitable for production-critical systems
* Limited mathematical and multi-step reasoning capability
* Sensitive to prompt formatting
* Performance depends heavily on decoding strategy

## Future Improvements

Potential enhancements:

* Mask instruction tokens during SFT
* Increase model size (1.5B+)
* Train on more diverse programming datasets
* Evaluate with pass@k benchmarks

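The first item, masking instruction tokens so the loss is computed only on response tokens, can be sketched as follows. This is a minimal illustration (`-100` is the default `ignore_index` of PyTorch's cross-entropy loss), not this card's actual training code:

```python
IGNORE_INDEX = -100  # default ignore_index for PyTorch CrossEntropyLoss


def mask_instruction_labels(input_ids: list[int], response_start: int) -> list[int]:
    """Copy input_ids into labels, masking everything before the response.

    Tokens labeled IGNORE_INDEX contribute nothing to the SFT loss, so the
    model is trained only to predict the response portion of each sample.
    """
    labels = list(input_ids)
    labels[:response_start] = [IGNORE_INDEX] * response_start
    return labels
```

In practice this masking is applied per sample by the data collator, with `response_start` located by searching for the tokenized `### Response:` marker.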
## Acknowledgements

* Base model by the Qwen team
* Dataset by `iamtarun`