Casper0508 committed on
Commit
e88703d
1 Parent(s): 3e9e86d

End of training

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4554
+- Loss: 1.5909
 
 ## Model description
 
@@ -32,20 +32,6 @@ More information needed
 
 ## Training procedure
 
-
-The following `bitsandbytes` quantization config was used during training:
-- quant_method: bitsandbytes
-- _load_in_8bit: False
-- _load_in_4bit: True
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: nf4
-- bnb_4bit_use_double_quant: True
-- bnb_4bit_compute_dtype: bfloat16
-- load_in_4bit: True
-- load_in_8bit: False
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -64,31 +50,31 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.4063        | 1.36  | 10   | 2.0249          |
-| 1.4234        | 2.71  | 20   | 1.1088          |
-| 0.9874        | 4.07  | 30   | 0.8900          |
-| 0.7207        | 5.42  | 40   | 0.6961          |
-| 0.5784        | 6.78  | 50   | 0.6823          |
-| 0.5088        | 8.14  | 60   | 0.6767          |
-| 0.4453        | 9.49  | 70   | 0.7067          |
-| 0.3935        | 10.85 | 80   | 0.7432          |
-| 0.3417        | 12.2  | 90   | 0.8008          |
-| 0.3026        | 13.56 | 100  | 0.9167          |
-| 0.2754        | 14.92 | 110  | 0.9432          |
-| 0.2507        | 16.27 | 120  | 0.9834          |
-| 0.2359        | 17.63 | 130  | 1.0581          |
-| 0.2213        | 18.98 | 140  | 1.1612          |
-| 0.2075        | 20.34 | 150  | 1.1553          |
-| 0.2011        | 21.69 | 160  | 1.3062          |
-| 0.1959        | 23.05 | 170  | 1.3247          |
-| 0.1891        | 24.41 | 180  | 1.3318          |
-| 0.1865        | 25.76 | 190  | 1.3603          |
-| 0.1825        | 27.12 | 200  | 1.3980          |
-| 0.1797        | 28.47 | 210  | 1.4180          |
-| 0.178         | 29.83 | 220  | 1.4311          |
-| 0.176         | 31.19 | 230  | 1.4476          |
-| 0.1748        | 32.54 | 240  | 1.4538          |
-| 0.1753        | 33.9  | 250  | 1.4554          |
+| 3.3698        | 1.36  | 10   | 2.0432          |
+| 1.3777        | 2.71  | 20   | 1.0067          |
+| 0.8126        | 4.07  | 30   | 0.7822          |
+| 0.6642        | 5.42  | 40   | 0.7281          |
+| 0.5708        | 6.78  | 50   | 0.7218          |
+| 0.5062        | 8.14  | 60   | 0.7360          |
+| 0.4379        | 9.49  | 70   | 0.7781          |
+| 0.3924        | 10.85 | 80   | 0.8310          |
+| 0.3435        | 12.2  | 90   | 0.8856          |
+| 0.3041        | 13.56 | 100  | 1.0389          |
+| 0.2787        | 14.92 | 110  | 1.0664          |
+| 0.2553        | 16.27 | 120  | 1.1655          |
+| 0.2388        | 17.63 | 130  | 1.2397          |
+| 0.2288        | 18.98 | 140  | 1.2049          |
+| 0.2128        | 20.34 | 150  | 1.2746          |
+| 0.2081        | 21.69 | 160  | 1.3889          |
+| 0.1998        | 23.05 | 170  | 1.3942          |
+| 0.1909        | 24.41 | 180  | 1.4383          |
+| 0.188         | 25.76 | 190  | 1.5012          |
+| 0.1841        | 27.12 | 200  | 1.5246          |
+| 0.18          | 28.47 | 210  | 1.5528          |
+| 0.1794        | 29.83 | 220  | 1.5662          |
+| 0.1773        | 31.19 | 230  | 1.5788          |
+| 0.1751        | 32.54 | 240  | 1.5889          |
+| 0.1756        | 33.9  | 250  | 1.5909          |
 
 
 ### Framework versions
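Worth noting about the updated table: the 1.5909 in the README header is the loss at the *final* step, not the best one. Validation loss bottoms out at step 50 and climbs steadily afterwards while training loss keeps falling, the usual overfitting signature. A quick check over the new (step, validation loss) pairs, assuming the table values are copied correctly:

```python
# (step, validation_loss) pairs from the updated README table.
val_losses = [
    (10, 2.0432), (20, 1.0067), (30, 0.7822), (40, 0.7281), (50, 0.7218),
    (60, 0.7360), (70, 0.7781), (80, 0.8310), (90, 0.8856), (100, 1.0389),
    (110, 1.0664), (120, 1.1655), (130, 1.2397), (140, 1.2049), (150, 1.2746),
    (160, 1.3889), (170, 1.3942), (180, 1.4383), (190, 1.5012), (200, 1.5246),
    (210, 1.5528), (220, 1.5662), (230, 1.5788), (240, 1.5889), (250, 1.5909),
]

# Best checkpoint by validation loss, and how far the final step drifted from it.
best_step, best_loss = min(val_losses, key=lambda p: p[1])
final_step, final_loss = val_losses[-1]
regression = final_loss - best_loss  # 1.5909 - 0.7218 = 0.8691
```

If early stopping (or `load_best_model_at_end`) were in use, the step-50 checkpoint would be the one to keep rather than the final adapter.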
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0ea850e17a642a80a9f6054cba639a863c38fd5ee587fc288a24e2a510e28b46
+oid sha256:c9382c10b3a9a99504ecf742898b679ee9fc30cdb63e8d8ff5303808f175af02
 size 151020944
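For context on this adapter: the README previously recorded that the 8B base model was loaded in 4-bit (nf4, double quantization, bfloat16 compute) during training. A minimal sketch keeping those removed settings as a plain dict; apart from `quant_method` (which transformers sets internally), the names match `transformers.BitsAndBytesConfig` keyword arguments, so this is roughly what would be needed to reload the base model the same way:

```python
# bitsandbytes settings removed from the README by this commit, preserved as a
# plain dict (values copied verbatim from the old README text).
bnb_settings = {
    "quant_method": "bitsandbytes",
    "load_in_4bit": True,
    "load_in_8bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",       # 4-bit NormalFloat quantization
    "bnb_4bit_use_double_quant": True,  # quantize the quantization constants too
    "bnb_4bit_compute_dtype": "bfloat16",
}

# Sanity check: this run was 4-bit, not 8-bit.
four_bit = bnb_settings["load_in_4bit"] and not bnb_settings["load_in_8bit"]
```

When reloading, something like `BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.bfloat16)` would reproduce the training-time setup; that usage is an illustration, not something stated in this repo.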
emissions.csv CHANGED
@@ -1,2 +1,2 @@
 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
-2024-07-29T16:59:42,0b1b335d-594e-49e2-84c9-dbc256266dc6,codecarbon,1422.8740487098694,0.08038502085419474,0.11960115720427114,United Kingdom,GBR,scotland,N,,
+2024-07-29T17:37:10,0935c600-22ba-4896-8b40-c37f98e81ea9,codecarbon,653.414971113205,0.03352802686025582,0.04988480152957785,United Kingdom,GBR,scotland,N,,
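The emissions row is codecarbon output: `duration` in seconds, `emissions` in kg CO2-eq, `energy_consumed` in kWh (units assumed from codecarbon's standard schema). A small sketch deriving the headline rates from the new row:

```python
import csv
import io

# The new emissions.csv row from this commit, verbatim.
row_text = """timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
2024-07-29T17:37:10,0935c600-22ba-4896-8b40-c37f98e81ea9,codecarbon,653.414971113205,0.03352802686025582,0.04988480152957785,United Kingdom,GBR,scotland,N,,"""

row = next(csv.DictReader(io.StringIO(row_text)))
duration_s = float(row["duration"])        # ~653 s of tracked training
emissions_kg = float(row["emissions"])     # kg CO2-eq
energy_kwh = float(row["energy_consumed"]) # kWh drawn

# Derived figures: grams CO2-eq per hour of training, and the implied
# carbon intensity (kg CO2-eq per kWh) codecarbon applied for this region.
grams_per_hour = emissions_kg * 1000 * 3600 / duration_s
intensity = emissions_kg / energy_kwh
```

So this ~11-minute run emitted roughly 33.5 g CO2-eq, about half the previous row's 80 g over ~24 minutes, consistent with the same hardware running for a shorter tracked window.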
runs/Jul29_17-26-10_msc-modeltrain-pod/events.out.tfevents.1722273977.msc-modeltrain-pod.12449.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21307b2f5d605047cf945e79efa0883a46ce843dd6d3f0e797c3b0b901fc2b89
+size 17035
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:73cf781d850b973106fdd5e079cb1a7baf30c96f78fa7dac138c5e1e1cf3d9a6
+oid sha256:6b0e981c0bd2c889d8920b8e3a15191e82ae2084b87be9b4415afb23e05669c4
 size 4984
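Every binary touched by this commit (the adapter weights, the tfevents log, `training_args.bin`) is stored in git as a three-line git-lfs pointer of the form shown above: a spec-version URL, an `oid sha256:<hex>` line, and a `size <bytes>` line. A minimal parser for that format, handy when auditing which blob a given revision actually points at:

```python
# Parse a git-lfs pointer file (three "key value" lines) into structured fields.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # "oid" carries the hash algorithm as a prefix, e.g. "sha256:<64 hex chars>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }

# The new pointer for training_args.bin from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:6b0e981c0bd2c889d8920b8e3a15191e82ae2084b87be9b4415afb23e05669c4
size 4984"""

info = parse_lfs_pointer(pointer)
```

Comparing `oid` across the two revisions is what shows the commit actually swapped the blob: both pointers report `size 4984`, but the sha256 digests differ.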