zodiache committed
Commit 6bf100d · verified · 1 Parent(s): df380d3

Model save

Files changed (4):
  1. README.md +12 -22
  2. adapter_model.safetensors +1 -1
  3. all_results.json +6 -6
  4. train_results.json +6 -6
README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2908
+- Loss: 0.0833
 
 ## Model description
 
@@ -46,32 +46,22 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
-- training_steps: 2048
+- training_steps: 1024
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.0815        | 0.1892 | 100  | 0.2948          |
-| 0.0434        | 0.3784 | 200  | 0.2018          |
-| 0.0456        | 0.5676 | 300  | 0.2325          |
-| 0.0303        | 0.7569 | 400  | 0.1661          |
-| 0.0743        | 0.9461 | 500  | 0.1364          |
-| 0.0324        | 1.1353 | 600  | 0.1452          |
-| 0.0255        | 1.3245 | 700  | 0.2203          |
-| 0.0372        | 1.5137 | 800  | 0.2048          |
-| 0.0236        | 1.7029 | 900  | 0.2011          |
-| 0.002         | 1.8921 | 1000 | 0.2422          |
-| 0.0107        | 2.0814 | 1100 | 0.2662          |
-| 0.0099        | 2.2706 | 1200 | 0.2508          |
-| 0.0199        | 2.4598 | 1300 | 0.3019          |
-| 0.005         | 2.6490 | 1400 | 0.2671          |
-| 0.0297        | 2.8382 | 1500 | 0.2541          |
-| 0.0011        | 3.0274 | 1600 | 0.2923          |
-| 0.0152        | 3.2167 | 1700 | 0.2680          |
-| 0.0007        | 3.4059 | 1800 | 0.2882          |
-| 0.0098        | 3.5951 | 1900 | 0.2746          |
-| 0.0251        | 3.7843 | 2000 | 0.2908          |
+| 0.166         | 0.1110 | 100  | 0.1586          |
+| 0.0995        | 0.2220 | 200  | 0.1237          |
+| 0.1139        | 0.3330 | 300  | 0.0977          |
+| 0.0505        | 0.4440 | 400  | 0.0790          |
+| 0.029         | 0.5550 | 500  | 0.1129          |
+| 0.047         | 0.6660 | 600  | 0.0783          |
+| 0.052         | 0.7770 | 700  | 0.0841          |
+| 0.0416        | 0.8880 | 800  | 0.0718          |
+| 0.0349        | 0.9990 | 900  | 0.0810          |
+| 0.0618        | 1.1100 | 1000 | 0.0833          |
 
 
 ### Framework versions
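One detail worth flagging in the updated results: the final checkpoint (step 1000, eval loss 0.0833) is not the best one. A short sketch, with values transcribed from the new README table, that picks the lowest-validation-loss step:

```python
# Validation losses from the updated training-results table (step -> eval loss).
eval_loss = {
    100: 0.1586, 200: 0.1237, 300: 0.0977, 400: 0.0790, 500: 0.1129,
    600: 0.0783, 700: 0.0841, 800: 0.0718, 900: 0.0810, 1000: 0.0833,
}

# Select the checkpoint step with the minimum validation loss.
best_step = min(eval_loss, key=eval_loss.get)
print(best_step, eval_loss[best_step])  # -> 800 0.0718
```

If early stopping or `load_best_model_at_end` was not enabled, the saved adapter corresponds to the last step rather than this minimum.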
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:28b91b282269524bb58185924d517bc839967735869215cb8a8ef5222b280ebc
+oid sha256:7efc50c3c5a1c66159bbcd087a587e7d69205edefe756fdaec361309f1b1a4b4
 size 2115012328
all_results.json CHANGED
@@ -1,8 +1,8 @@
 {
-    "epoch": 3.8751182592242195,
-    "total_flos": 1.809303241400451e+18,
-    "train_loss": 0.11459405875015705,
-    "train_runtime": 18906.0633,
-    "train_samples_per_second": 6.933,
-    "train_steps_per_second": 0.108
+    "epoch": 1.136672679339531,
+    "total_flos": 7.835717038709146e+17,
+    "train_loss": 0.22367067684899666,
+    "train_runtime": 8651.0001,
+    "train_samples_per_second": 7.576,
+    "train_steps_per_second": 0.118
 }
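The new throughput figures are mutually consistent with the 1024 training steps reported in the README; a quick sanity check (the implied per-step batch of ~64 is inferred, not stated anywhere in the diff):

```python
# Figures copied from the updated all_results.json / README.
train_runtime = 8651.0001          # seconds
train_steps_per_second = 0.118
train_samples_per_second = 7.576
training_steps = 1024              # from the updated README

# steps/second should match total steps divided by runtime (to reported precision).
assert abs(training_steps / train_runtime - train_steps_per_second) < 1e-3

# Effective batch size per optimizer step is implied by the two rates (assumption:
# "samples" here means sequences seen per step, as the Trainer reports it).
effective_batch = train_samples_per_second / train_steps_per_second
print(round(effective_batch))  # -> 64
```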
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
-    "epoch": 3.8751182592242195,
-    "total_flos": 1.809303241400451e+18,
-    "train_loss": 0.11459405875015705,
-    "train_runtime": 18906.0633,
-    "train_samples_per_second": 6.933,
-    "train_steps_per_second": 0.108
+    "epoch": 1.136672679339531,
+    "total_flos": 7.835717038709146e+17,
+    "train_loss": 0.22367067684899666,
+    "train_runtime": 8651.0001,
+    "train_samples_per_second": 7.576,
+    "train_steps_per_second": 0.118
 }