Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ on the training subset of the Fisher corpus.
This model is finetuned on the training subset of the Fisher corpus, using a LoRA adapter of rank 256. The total number of training parameters is 1,001,390,080. With a batch size of 16, this model has been trained for 12000 steps, which is ~4 epochs of the training data.

-We use the `mixed` flavor during our training, meaning we combine data from `hyp2ora` and `deg2ref flavors. After the prompt builder, we have a total of 48,142 prompt-completion pairs in our training set.
+We use the `mixed` flavor during our training, meaning we combine data from `hyp2ora` and `deg2ref` flavors. After the prompt builder, we have a total of 48,142 prompt-completion pairs in our training set.

The finetuning took more than 3 days on a Google Cloud VM instance that has one NVIDIA A100 GPU with 80GB memory.
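
For illustration only, here is a minimal sketch of how a rank-256 LoRA adapter could be configured with the Hugging Face `peft` library. Only the rank, batch size, step count, and dataset size come from the README text above; the base model name, `lora_alpha`, and target modules are assumptions, not taken from this repository.

```python
# Minimal sketch, not the repository's actual training script.
# Assumptions: base model name, lora_alpha, and target modules.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder name

lora_cfg = LoraConfig(
    r=256,                                # adapter rank stated in the README
    lora_alpha=512,                       # assumed; not stated in the README
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type=TaskType.CAUSAL_LM,         # assumed task type
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()

# Sanity check on the "~4 epochs" figure quoted above:
# 12,000 steps x batch size 16 = 192,000 examples seen,
# and 192,000 / 48,142 prompt-completion pairs ≈ 3.99 epochs.
```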