Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ on the training subset of the Fisher corpus.
This model is finetuned on the training subset of the Fisher corpus, using a LoRA adapter of rank 256. The total number of training parameters is 1,001,390,080. With a batch size of 16, this model has been trained for 12000 steps, which is ~4 epochs of the training data.

-We use the `mixed` flavor during our training, meaning we combine data from `hyp2ora` and `deg2ref flavors. After the prompt builder, we have a total of 48,142 prompt-completion pairs in our training set.
+We use the `mixed` flavor during our training, meaning we combine data from `hyp2ora` and `deg2ref` flavors. After the prompt builder, we have a total of 48,142 prompt-completion pairs in our training set.

The finetuning took more than 3 days on a Google Cloud VM instance that has one NVIDIA A100 GPU with 80GB memory.
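
For illustration only, here is a minimal sketch of how a rank-256 LoRA adapter could be configured with the Hugging Face `peft` library. Only the rank, batch size, step count, and dataset size come from the README text above; the base model name, `lora_alpha`, and target modules are assumptions, not taken from this repository.

```python
# Minimal sketch, not the repository's actual training script.
# Assumptions: base model name, lora_alpha, and target modules.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder name

lora_cfg = LoraConfig(
    r=256,                                # adapter rank stated in the README
    lora_alpha=512,                       # assumed; not stated in the README
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type=TaskType.CAUSAL_LM,         # assumed task type
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()

# Sanity check on the "~4 epochs" figure quoted above:
# 12,000 steps x batch size 16 = 192,000 examples seen,
# and 192,000 / 48,142 prompt-completion pairs ≈ 3.99 epochs.
```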