willtensora committed
Commit f277955 · verified · 1 Parent(s): c4a931f

End of training

Files changed (2):
  1. README.md +6 -9
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -99,7 +99,7 @@ xformers_attention: null
 
  This model is a fine-tuned version of [peft-internal-testing/tiny-dummy-qwen2](https://huggingface.co/peft-internal-testing/tiny-dummy-qwen2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 11.9312
+ - Loss: 11.9313
 
  ## Model description
 
@@ -122,11 +122,8 @@ The following hyperparameters were used during training:
  - train_batch_size: 2
  - eval_batch_size: 2
  - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 64
- - total_eval_batch_size: 16
+ - total_train_batch_size: 8
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
@@ -136,10 +133,10 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 11.9309 | 0.0045 | 1 | 11.9313 |
- | 11.9313 | 0.0135 | 3 | 11.9313 |
- | 11.9318 | 0.0269 | 6 | 11.9313 |
- | 11.9313 | 0.0404 | 9 | 11.9312 |
+ | 11.9315 | 0.0006 | 1 | 11.9313 |
+ | 11.9319 | 0.0017 | 3 | 11.9313 |
+ | 11.926 | 0.0034 | 6 | 11.9313 |
+ | 11.9287 | 0.0050 | 9 | 11.9313 |
 
 
  ### Framework versions
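The hyperparameter change above is internally consistent: the effective (total) train batch size is the per-device batch size times the gradient-accumulation steps times the number of devices, so dropping the 8-GPU distributed setup takes it from 2 × 4 × 8 = 64 down to 2 × 4 = 8. A minimal sketch of that arithmetic (the helper name is illustrative, not part of the training code):

```python
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Total train batch size as reported in the README's hyperparameter list."""
    return per_device_batch * grad_accum_steps * num_devices

# Old run: 8 devices -> total_train_batch_size: 64
assert effective_batch_size(2, 4, num_devices=8) == 64
# New run: single device -> total_train_batch_size: 8
assert effective_batch_size(2, 4) == 8
```

The epoch column shifts for the same reason: each optimizer step now consumes 8 samples instead of 64 (roughly an eighth), which is consistent with step 1 landing at epoch 0.0006 rather than 0.0045.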
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:16ff41a2be4704585ad2a5d2d9b3e7ff90a390f28dd3d4fca4bff18af96b0b3c
+ oid sha256:5ccdf91ca6ec9da23a97ce6c01122b6f9e06ff99085f40868544a297d1516731
  size 21378
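Only the adapter weights change in this commit: `adapter_model.bin` is stored via Git LFS, so the diff touches just the pointer file, whose `oid` is the SHA-256 of the new 21 kB payload. A hedged sketch of loading such a PEFT adapter on top of the base model named in the README — the adapter repo id below is a placeholder, not confirmed by this commit:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder repo id; substitute the Hub repo this commit belongs to.
ADAPTER_REPO = "willtensora/example-adapter"

# Load the frozen base model the README points at...
base = AutoModelForCausalLM.from_pretrained("peft-internal-testing/tiny-dummy-qwen2")
# ...then attach the fine-tuned adapter weights on top of it.
model = PeftModel.from_pretrained(base, ADAPTER_REPO)
model.eval()
```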