shahidul034 committed
Commit 9e77dc2
1 Parent(s): 459788c

End of training

Files changed (1)
README.md +6 -6
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
-license: apache-2.0
-base_model: TheBloke/Mistral-7B-v0.1-GPTQ
+license: mit
+base_model: TheBloke/zephyr-7B-beta-GPTQ
 tags:
 - generated_from_trainer
 model-index:
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # KUETLLM_zephyr
 
-This model is a fine-tuned version of [TheBloke/Mistral-7B-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GPTQ) on the None dataset.
+This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the None dataset.
 
 ## Model description
 
@@ -33,13 +33,13 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 8
+- train_batch_size: 24
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 32
+- total_train_batch_size: 96
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
+- lr_scheduler_type: linear
 - num_epochs: 2
 - mixed_precision_training: Native AMP
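
For reference, here is a minimal sketch of how the new hyperparameters might map onto Hugging Face `TrainingArguments`. The commit does not include the training script, so `output_dir` is a placeholder and single-GPU training is assumed, which is consistent with the listed total: 24 per-device × 4 accumulation steps = 96.

```python
# Sketch only: the new hyperparameters from this commit expressed as
# transformers.TrainingArguments. output_dir is a placeholder, not taken
# from the commit.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="KUETLLM_zephyr",     # assumed; training script not in this commit
    learning_rate=2e-4,              # learning_rate: 0.0002
    per_device_train_batch_size=24,  # train_batch_size: 24
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=4,   # 24 * 4 = 96 (total_train_batch_size)
    seed=42,
    lr_scheduler_type="linear",      # changed from cosine in this commit
    num_train_epochs=2,
    fp16=True,                       # mixed_precision_training: Native AMP
)
```

The Adam betas (0.9, 0.999) and epsilon (1e-08) in the list match the `transformers` defaults, so they need no explicit arguments here.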