tharunkrishna1611 committed on
Commit
301d547
1 Parent(s): e14108b

End of training

Files changed (2)
  1. README.md +3 -5
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/nitktharun/huggingface/runs/k4i26uz4)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/nitktharun/huggingface/runs/im096k0p)
 # deepseek-coder-7b-instruct-v1.5_finetune
 
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-7b-instruct-v1.5](https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5) on the None dataset.
@@ -37,14 +37,12 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
+- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- training_steps: 10
+- training_steps: 250
 - mixed_precision_training: Native AMP
 
 ### Training results
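The updated hyperparameters match a standard `transformers` Trainer run. Below is a minimal sketch of the `TrainingArguments` they imply; the `output_dir` name is an assumption taken from the model name, and the Adam betas/epsilon in the card are the library defaults, so they are not set explicitly. Only the listed values come from the diff.

```python
from transformers import TrainingArguments

# Sketch of the arguments implied by the updated model card.
# output_dir is an assumption, not taken from the diff.
args = TrainingArguments(
    output_dir="deepseek-coder-7b-instruct-v1.5_finetune",  # assumed
    learning_rate=2e-4,              # learning_rate: 0.0002
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    lr_scheduler_type="cosine",      # lr_scheduler_type: cosine
    max_steps=250,                   # training_steps: 250
    fp16=True,                       # mixed_precision_training: Native AMP
    report_to="wandb",               # matches the W&B badge in the card
)
```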
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4af6e31bc41e710ceac9e70393eff93559c8bac1f402f9e9cadcd864a7f6b46c
+oid sha256:c781c0e8593487039c40a00e7a26479b1c04711ac24de0a865d2a56fda18f122
 size 23609048
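`adapter_model.safetensors` is a Git LFS pointer file, so only the oid (the file's SHA-256) changes in the diff; at ~23 MB the file is a PEFT adapter rather than full 7B weights, and it is loaded on top of the base model. A minimal sketch with `peft`; the repo id is an assumption inferred from the author and model name, not stated anywhere in this commit.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model the card says was fine-tuned.
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-7b-instruct-v1.5"
)

# Attach the adapter from this commit. Repo id is assumed
# (author/model-name); replace with the actual repo path.
model = PeftModel.from_pretrained(
    base, "tharunkrishna1611/deepseek-coder-7b-instruct-v1.5_finetune"
)
```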