mcanoglu committed
Commit: 2a3633a
1 Parent(s): 3ea37a5

End of training

README.md CHANGED
@@ -5,7 +5,6 @@ tags:
  - generated_from_trainer
  metrics:
  - accuracy
- - f1
  - precision
  - recall
  model-index:
@@ -20,11 +19,10 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5777
- - Accuracy: 0.7586
- - F1: 0.7499
- - Precision: 0.7513
- - Recall: 0.7586
+ - Loss: 0.6902
+ - Accuracy: 0.7715
+ - Precision: 0.8036
+ - Recall: 0.5867

  ## Model description

@@ -44,28 +42,30 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
+ - train_batch_size: 8
+ - eval_batch_size: 8
  - seed: 4711
- - gradient_accumulation_steps: 16
+ - gradient_accumulation_steps: 4
  - total_train_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 3
+ - num_epochs: 5
  - mixed_precision_training: Native AMP

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
- | No log        | 1.0   | 462  | 0.4832          | 0.7743   | 0.7594 | 0.7720    | 0.7743 |
- | 0.5829        | 2.0   | 924  | 0.4705          | 0.7788   | 0.7700 | 0.7737    | 0.7788 |
- | 0.3078        | 3.0   | 1386 | 0.5777          | 0.7586   | 0.7499 | 0.7513    | 0.7586 |
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|
+ | No log        | 1.0   | 462  | 0.4904          | 0.7800   | 0.6028    | 0.5178 |
+ | 0.5739        | 2.0   | 925  | 0.4917          | 0.7985   | 0.8159    | 0.5552 |
+ | 0.3111        | 3.0   | 1387 | 0.6582          | 0.7918   | 0.7907    | 0.5901 |
+ | 0.2395        | 4.0   | 1850 | 0.6238          | 0.7800   | 0.8018    | 0.6132 |
+ | 0.2047        | 4.99  | 2310 | 0.6902          | 0.7715   | 0.8036    | 0.5867 |


  ### Framework versions

- - Transformers 4.37.0
- - Pytorch 2.1.2+cu121
- - Datasets 2.16.1
- - Tokenizers 0.15.1
+ - Transformers 4.38.1
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.17.1
+ - Tokenizers 0.15.2
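
For context, a minimal sketch of how the updated hyperparameters above could be expressed with `transformers.TrainingArguments`. This is not the training script from this repository; `output_dir` is a placeholder, and `fp16=True` is an assumption for the "Native AMP" setting (it could equally be bf16).

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deepseek-coder-1.3b-base-finetuned",  # hypothetical path, not from this commit
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # 8 * 4 = 32 effective batch size (total_train_batch_size)
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=4711,
    fp16=True,                       # assumed form of "Native AMP" mixed-precision training
    adam_beta1=0.9,                  # Adam betas and epsilon as listed in the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

The reported `total_train_batch_size` of 32 follows from `per_device_train_batch_size=8` times `gradient_accumulation_steps=4`, assuming a single device.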
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a209cfe6427ea97e774bc64ad0d1a2e36d201e528ad7a3957d568d7aad544fc5
+ oid sha256:4517032bacda9d229aaec266da3b3315be027e458708b004757aa9f853307bde
  size 4986380064
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:37a2bfdc5d4bac777ff692b2a728046862284a7f87ae73e9455be85302563498
+ oid sha256:3346e3051767df587fa5782269d01bb45c6b1ca6dcf5479cd2ed922c3ddf08f6
  size 135332592
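
The safetensors shards are tracked with Git LFS, so only their pointer files (the `oid sha256:` and `size` fields) change in this commit. Below is a minimal sketch, not part of the repository, for checking a locally downloaded shard against the pointer values shown above; the file name and path are assumptions.

```python
import hashlib
from pathlib import Path

def verify_lfs_object(path: str, expected_sha256: str, expected_size: int) -> bool:
    """Return True if the file's size and SHA-256 digest match its Git LFS pointer."""
    p = Path(path)
    if p.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with p.open("rb") as f:
        # Hash in 1 MiB chunks to avoid loading a multi-GB shard into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Values taken from the updated pointer for model-00001-of-00002.safetensors above.
print(verify_lfs_object(
    "model-00001-of-00002.safetensors",
    "4517032bacda9d229aaec266da3b3315be027e458708b004757aa9f853307bde",
    4986380064,
))
```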