---
library_name: peft
license: apache-2.0
datasets:
- adriantheuma/raven-data
language:
- en
---
### Training details
* Prompt tokenisation: [LlamaTokenizer](https://huggingface.co/docs/transformers/model_doc/llama2#transformers.LlamaTokenizer).
* Maximum context length: 1,204 tokens
* Per device train batch: 1
* Gradient accumulation: 128 steps (an effective batch size of 128)
* Quantisation: 8-bit
* Optimiser: AdamW
* Learning rate: 3 × 10⁻⁴
* Warmup steps: 100
* Epochs: 5
* Low Rank Adaptation (LoRA)
  * rank: 16
  * alpha: 16
  * dropout: 0.05
  * target modules: q_proj, k_proj, v_proj, and o_proj
This setup reduces the number of trainable parameters to 26,214,400, roughly 0.2% of the base [Llama 2 13B Chat](https://huggingface.co/docs/transformers/model_doc/llama2) model.
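
The configuration above maps directly onto the `peft` and `transformers` APIs. The sketch below is illustrative only: the base checkpoint id and the output directory are assumptions, the dataset and `Trainer` wiring are omitted, and only the hyperparameters come from this card.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-13b-chat-hf"  # assumed base checkpoint id

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit quantisation
)
model = prepare_model_for_kbit_training(model)

# LoRA settings listed in this card
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # ~26.2M trainable, ~0.2% of the base model

# Optimisation settings listed in this card; dataset and Trainer wiring omitted
training_args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=128,  # effective batch size of 128
    learning_rate=3e-4,
    warmup_steps=100,
    num_train_epochs=5,
    optim="adamw_torch",
    output_dir="raven-lora",  # assumed output path
)
```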
### Training hardware
This model was trained on commodity hardware equipped with:
* 13th Gen Intel(R) Core(TM) i7-13700KF CPU at 3.40 GHz
* 64 GB installed RAM
* NVIDIA GeForce RTX 4090 GPU with 24 GB onboard RAM.
Training consumed 100 GPU hours.
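
For reference, a rough loading sketch is shown below: in 8-bit the adapter fits on a single 24 GB GPU such as the RTX 4090 used for training. The base checkpoint id is an assumption and the adapter id is a placeholder for this repository's Hub id; the prompt is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model = "meta-llama/Llama-2-13b-chat-hf"  # assumed base checkpoint id
adapter_id = "<this-repo-id>"                  # placeholder: this adapter's Hub id

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",  # a single 24 GB GPU suffices in 8-bit
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

prompt = "Summarise the revenue trend in the table above."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```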