lemonilia committed on
Commit 7f1630d
Parent(s): a23656e

Update README.md

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -81,9 +81,13 @@ your desired response length:
 
 ## Training procedure
 [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) was used for training
-on a single NVidia RTX3090 GPU. The model has been trained as a 4-bit LoRA adapter, which
-is so large because a LoRA rank of 256 was used. It's suggested to merge the adapter to
-the base Llama2-7B model (or other Llama2-based models).
+on a single NVIDIA RTX 3090 GPU. The model has been trained as a 4-bit LoRA adapter; it
+is so large because a LoRA rank of 256 was used. The reasoning was that a high rank
+might help the model internalize newly acquired information, making the training
+process closer to a full finetune.
+
+It's suggested to merge the adapter into the base Llama2-7B model (or other Llama2-based
+models).
 
 ### Training hyperparameters
 For the first pass these settings were used:
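
The merge step suggested in the added lines above can be done with the PEFT library. Here is a minimal sketch, assuming the stock Llama2-7B base model; the adapter path and output directory are hypothetical placeholders, not actual repository names:

```python
# Minimal sketch: fold this LoRA adapter into the base Llama2-7B weights.
# "path/to/this-lora-adapter" and "llama2-7b-merged" are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"        # assumed base model
adapter_path = "path/to/this-lora-adapter"  # hypothetical adapter location

# Load the base model in half precision; even though the adapter was trained
# in 4-bit (QLoRA-style), merging is applied to the full/half-precision weights.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)

# Merge the LoRA weights into the base layers and strip the adapter wrappers.
merged = model.merge_and_unload()

merged.save_pretrained("llama2-7b-merged")
AutoTokenizer.from_pretrained(base_id).save_pretrained("llama2-7b-merged")
```

The same sketch should work against other Llama2-based models by swapping `base_id`, which matches the note in the change about merging into Llama2 derivatives.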