Update README.md
README.md
CHANGED
@@ -1,21 +1,31 @@
---
library_name: peft
---

Removed (the previous bitsandbytes quantisation config):

- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

Added:

### Training details

- Prompt tokenisation: [LlamaTokenizer](https://huggingface.co/docs/transformers/model_doc/llama2#transformers.LlamaTokenizer)
- Maximum context length: 1,204 tokens
- Per-device train batch size: 1
- Gradient accumulation: 128 steps (an effective batch size of 128)
- Quantisation: 8-bit (bitsandbytes)
- Optimiser: AdamW
- Learning rate: 3 × 10⁻⁴
- Warmup steps: 100
- Epochs: 5

- Low-Rank Adaptation (LoRA), configured as follows (see the sketch after this list):
  - rank: 16
  - alpha: 16
  - dropout: 0.05
  - target modules: q_proj, k_proj, v_proj, and o_proj
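
The setup above can be assembled with `transformers`, `bitsandbytes`, and `peft` roughly as in the following minimal sketch. The base checkpoint ID, the 8-bit loading call, and the k-bit preparation step are assumptions inferred from the details and the removed quantisation config above, not the exact training code.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model_id = "meta-llama/Llama-2-13b-chat-hf"  # assumed base checkpoint

# Prompt tokenisation with LlamaTokenizer, capped at the stated context length
tokenizer = LlamaTokenizer.from_pretrained(base_model_id)
tokenizer.model_max_length = 1204

# Load the base model in 8-bit via bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # assumed standard prep step for k-bit training

# LoRA settings listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the trainable parameter count and percentage
```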

This setup reduces the trainable parameters to 26,214,400, or 0.2% of the base [Llama 2 13B Chat](https://huggingface.co/docs/transformers/model_doc/llama2) model.
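
For reference, the hyperparameters listed earlier map onto `transformers.TrainingArguments` roughly as in the sketch below. The output directory, the mixed-precision flag, and `train_dataset` are assumptions; `model` is the PEFT-wrapped model from the previous sketch.

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-2-13b-chat-lora",  # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=128,     # effective batch size of 128
    learning_rate=3e-4,
    warmup_steps=100,
    num_train_epochs=5,
    optim="adamw_torch",                 # AdamW optimiser
    fp16=True,                           # assumption: mixed precision on the RTX 4090
)

trainer = Trainer(
    model=model,                  # PEFT-wrapped 8-bit model from the previous sketch
    args=training_args,
    train_dataset=train_dataset,  # assumed pre-tokenised training dataset
)
trainer.train()
```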

### Training hardware

This model was trained on commodity hardware equipped with:
- a 13th Gen Intel(R) Core(TM) i7-13700KF CPU at 3.40 GHz
- 64 GB of installed RAM
- an NVIDIA GeForce RTX 4090 GPU with 24 GB of onboard memory

Training consumed 100 GPU hours.