lemonilia committed
Commit b88be3a
1 Parent(s): 28322a9

Update README.md

Files changed (1)
README.md +3 -3
README.md CHANGED
@@ -101,9 +101,9 @@ repetition penalty and low penalty range (about as long as the prior 2 messages)
 
 ## Training procedure
 [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) was used for training
-on a 4x NVidia A40 GPU cluster.
+on 2x NVidia A40 GPUs.
 
-The A40 GPU cluster has been graciously provided by [Arc Compute](https://www.arccompute.io/).
+The A40 GPUs have been graciously provided by [Arc Compute](https://www.arccompute.io/).
 
 The model has been trained as an 8-bit LoRA adapter, and
 it's so large because a LoRA rank of 256 was also used. The reasoning was that this
@@ -133,4 +133,4 @@ the base Mistral-7B-v0.1 model.
 For the second pass, the `lora_model_dir` option was used to continue finetuning on the LoRA
 adapter obtained from the first pass.
 
-Using 4 GPUs, the effective global batch size would have been 128.
+Using 2 GPUs, the effective global batch size would have been 128.
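
For context on the changed section: the README describes an 8-bit LoRA adapter trained with Axolotl at rank 256 on the base Mistral-7B-v0.1 model. A minimal Axolotl config sketch consistent with that description is below; only `load_in_8bit`, `lora_r: 256`, and the base model follow from the README, while the remaining values (alpha, dropout, target modules) are illustrative assumptions, not the actual training settings.

```yaml
# Minimal Axolotl sketch of an 8-bit, rank-256 LoRA.
# Values other than base_model, load_in_8bit, and lora_r are assumptions.
base_model: mistralai/Mistral-7B-v0.1
load_in_8bit: true        # 8-bit base weights, as stated in the README
adapter: lora
lora_r: 256               # large rank, which is why the adapter files are big
lora_alpha: 16            # assumed; not given in the README
lora_dropout: 0.05        # assumed
lora_target_linear: true  # assumed; apply LoRA to all linear layers
```

At rank 256 the adapter carries far more parameters than a typical rank-8 or rank-16 LoRA, which is the size reasoning the README gives.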
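The two-pass schedule and the batch-size figure map onto Axolotl options as sketched below. The README only states that `lora_model_dir` pointed the second pass at the first-pass adapter and that the effective global batch size on 2 GPUs was 128; the adapter path and the micro-batch/accumulation split are assumptions.

```yaml
# Second pass: resume finetuning from the LoRA adapter produced by the first pass.
lora_model_dir: ./first-pass-lora   # hypothetical path; the real one isn't given

# Effective global batch size = micro_batch_size x gradient_accumulation_steps x GPUs.
# One assumed split that yields 128 on 2 GPUs:
micro_batch_size: 8                 # assumed
gradient_accumulation_steps: 8      # assumed; 8 x 8 x 2 = 128
```

Any micro-batch/accumulation split whose product is 64 per GPU reproduces the stated global batch size of 128 on 2 GPUs.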