shahidul034 committed
Commit 9e77dc2
1 Parent(s): 459788c

End of training

Files changed (1)
README.md +6 -6
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
-license: apache-2.0
-base_model: TheBloke/Mistral-7B-v0.1-GPTQ
+license: mit
+base_model: TheBloke/zephyr-7B-beta-GPTQ
 tags:
 - generated_from_trainer
 model-index:
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # KUETLLM_zephyr
 
-This model is a fine-tuned version of [TheBloke/Mistral-7B-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GPTQ) on the None dataset.
+This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the None dataset.
 
 ## Model description
 
@@ -33,13 +33,13 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 8
+- train_batch_size: 24
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 32
+- total_train_batch_size: 96
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
+- lr_scheduler_type: linear
 - num_epochs: 2
 - mixed_precision_training: Native AMP
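
For reference, here is a minimal sketch of how the new hyperparameters might map onto Hugging Face `TrainingArguments`. The commit does not include the training script, so `output_dir` is a placeholder and single-GPU training is assumed, which is consistent with the listed total: 24 per-device × 4 accumulation steps = 96.

```python
# Sketch only: the new hyperparameters from this commit expressed as
# transformers.TrainingArguments. output_dir is a placeholder, not taken
# from the commit.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="KUETLLM_zephyr",     # assumed; training script not in this commit
    learning_rate=2e-4,              # learning_rate: 0.0002
    per_device_train_batch_size=24,  # train_batch_size: 24
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=4,   # 24 * 4 = 96 (total_train_batch_size)
    seed=42,
    lr_scheduler_type="linear",      # changed from cosine in this commit
    num_train_epochs=2,
    fp16=True,                       # mixed_precision_training: Native AMP
)
```

The Adam betas (0.9, 0.999) and epsilon (1e-08) in the list match the `transformers` defaults, so they need no explicit arguments here.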