Casper0508 committed on
Commit
e88703d
1 Parent(s): 3e9e86d

End of training

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4554
+- Loss: 1.5909
 
 ## Model description
 
@@ -32,20 +32,6 @@ More information needed
 
 ## Training procedure
 
-
-The following `bitsandbytes` quantization config was used during training:
-- quant_method: bitsandbytes
-- _load_in_8bit: False
-- _load_in_4bit: True
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: nf4
-- bnb_4bit_use_double_quant: True
-- bnb_4bit_compute_dtype: bfloat16
-- load_in_4bit: True
-- load_in_8bit: False
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -64,31 +50,31 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.4063        | 1.36  | 10   | 2.0249          |
-| 1.4234        | 2.71  | 20   | 1.1088          |
-| 0.9874        | 4.07  | 30   | 0.8900          |
-| 0.7207        | 5.42  | 40   | 0.6961          |
-| 0.5784        | 6.78  | 50   | 0.6823          |
-| 0.5088        | 8.14  | 60   | 0.6767          |
-| 0.4453        | 9.49  | 70   | 0.7067          |
-| 0.3935        | 10.85 | 80   | 0.7432          |
-| 0.3417        | 12.2  | 90   | 0.8008          |
-| 0.3026        | 13.56 | 100  | 0.9167          |
-| 0.2754        | 14.92 | 110  | 0.9432          |
-| 0.2507        | 16.27 | 120  | 0.9834          |
-| 0.2359        | 17.63 | 130  | 1.0581          |
-| 0.2213        | 18.98 | 140  | 1.1612          |
-| 0.2075        | 20.34 | 150  | 1.1553          |
-| 0.2011        | 21.69 | 160  | 1.3062          |
-| 0.1959        | 23.05 | 170  | 1.3247          |
-| 0.1891        | 24.41 | 180  | 1.3318          |
-| 0.1865        | 25.76 | 190  | 1.3603          |
-| 0.1825        | 27.12 | 200  | 1.3980          |
-| 0.1797        | 28.47 | 210  | 1.4180          |
-| 0.178         | 29.83 | 220  | 1.4311          |
-| 0.176         | 31.19 | 230  | 1.4476          |
-| 0.1748        | 32.54 | 240  | 1.4538          |
-| 0.1753        | 33.9  | 250  | 1.4554          |
+| 3.3698        | 1.36  | 10   | 2.0432          |
+| 1.3777        | 2.71  | 20   | 1.0067          |
+| 0.8126        | 4.07  | 30   | 0.7822          |
+| 0.6642        | 5.42  | 40   | 0.7281          |
+| 0.5708        | 6.78  | 50   | 0.7218          |
+| 0.5062        | 8.14  | 60   | 0.7360          |
+| 0.4379        | 9.49  | 70   | 0.7781          |
+| 0.3924        | 10.85 | 80   | 0.8310          |
+| 0.3435        | 12.2  | 90   | 0.8856          |
+| 0.3041        | 13.56 | 100  | 1.0389          |
+| 0.2787        | 14.92 | 110  | 1.0664          |
+| 0.2553        | 16.27 | 120  | 1.1655          |
+| 0.2388        | 17.63 | 130  | 1.2397          |
+| 0.2288        | 18.98 | 140  | 1.2049          |
+| 0.2128        | 20.34 | 150  | 1.2746          |
+| 0.2081        | 21.69 | 160  | 1.3889          |
+| 0.1998        | 23.05 | 170  | 1.3942          |
+| 0.1909        | 24.41 | 180  | 1.4383          |
+| 0.188         | 25.76 | 190  | 1.5012          |
+| 0.1841        | 27.12 | 200  | 1.5246          |
+| 0.18          | 28.47 | 210  | 1.5528          |
+| 0.1794        | 29.83 | 220  | 1.5662          |
+| 0.1773        | 31.19 | 230  | 1.5788          |
+| 0.1751        | 32.54 | 240  | 1.5889          |
+| 0.1756        | 33.9  | 250  | 1.5909          |
 
 
 ### Framework versions
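Worth noting about the updated table: the 1.5909 in the README header is the loss at the *final* step, not the best one. Validation loss bottoms out at step 50 and climbs steadily afterwards while training loss keeps falling, the usual overfitting signature. A quick check over the new (step, validation loss) pairs, assuming the table values are copied correctly:

```python
# (step, validation_loss) pairs from the updated README table.
val_losses = [
    (10, 2.0432), (20, 1.0067), (30, 0.7822), (40, 0.7281), (50, 0.7218),
    (60, 0.7360), (70, 0.7781), (80, 0.8310), (90, 0.8856), (100, 1.0389),
    (110, 1.0664), (120, 1.1655), (130, 1.2397), (140, 1.2049), (150, 1.2746),
    (160, 1.3889), (170, 1.3942), (180, 1.4383), (190, 1.5012), (200, 1.5246),
    (210, 1.5528), (220, 1.5662), (230, 1.5788), (240, 1.5889), (250, 1.5909),
]

# Best checkpoint by validation loss, and how far the final step drifted from it.
best_step, best_loss = min(val_losses, key=lambda p: p[1])
final_step, final_loss = val_losses[-1]
regression = final_loss - best_loss  # 1.5909 - 0.7218 = 0.8691
```

If early stopping (or `load_best_model_at_end`) were in use, the step-50 checkpoint would be the one to keep rather than the final adapter.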
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0ea850e17a642a80a9f6054cba639a863c38fd5ee587fc288a24e2a510e28b46
+oid sha256:c9382c10b3a9a99504ecf742898b679ee9fc30cdb63e8d8ff5303808f175af02
 size 151020944
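For context on this adapter: the README previously recorded that the 8B base model was loaded in 4-bit (nf4, double quantization, bfloat16 compute) during training. A minimal sketch keeping those removed settings as a plain dict; apart from `quant_method` (which transformers sets internally), the names match `transformers.BitsAndBytesConfig` keyword arguments, so this is roughly what would be needed to reload the base model the same way:

```python
# bitsandbytes settings removed from the README by this commit, preserved as a
# plain dict (values copied verbatim from the old README text).
bnb_settings = {
    "quant_method": "bitsandbytes",
    "load_in_4bit": True,
    "load_in_8bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",       # 4-bit NormalFloat quantization
    "bnb_4bit_use_double_quant": True,  # quantize the quantization constants too
    "bnb_4bit_compute_dtype": "bfloat16",
}

# Sanity check: this run was 4-bit, not 8-bit.
four_bit = bnb_settings["load_in_4bit"] and not bnb_settings["load_in_8bit"]
```

When reloading, something like `BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.bfloat16)` would reproduce the training-time setup; that usage is an illustration, not something stated in this repo.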
emissions.csv CHANGED
@@ -1,2 +1,2 @@
 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
-2024-07-29T16:59:42,0b1b335d-594e-49e2-84c9-dbc256266dc6,codecarbon,1422.8740487098694,0.08038502085419474,0.11960115720427114,United Kingdom,GBR,scotland,N,,
+2024-07-29T17:37:10,0935c600-22ba-4896-8b40-c37f98e81ea9,codecarbon,653.414971113205,0.03352802686025582,0.04988480152957785,United Kingdom,GBR,scotland,N,,
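The emissions row is codecarbon output: `duration` in seconds, `emissions` in kg CO2-eq, `energy_consumed` in kWh (units assumed from codecarbon's standard schema). A small sketch deriving the headline rates from the new row:

```python
import csv
import io

# The new emissions.csv row from this commit, verbatim.
row_text = """timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
2024-07-29T17:37:10,0935c600-22ba-4896-8b40-c37f98e81ea9,codecarbon,653.414971113205,0.03352802686025582,0.04988480152957785,United Kingdom,GBR,scotland,N,,"""

row = next(csv.DictReader(io.StringIO(row_text)))
duration_s = float(row["duration"])        # ~653 s of tracked training
emissions_kg = float(row["emissions"])     # kg CO2-eq
energy_kwh = float(row["energy_consumed"]) # kWh drawn

# Derived figures: grams CO2-eq per hour of training, and the implied
# carbon intensity (kg CO2-eq per kWh) codecarbon applied for this region.
grams_per_hour = emissions_kg * 1000 * 3600 / duration_s
intensity = emissions_kg / energy_kwh
```

So this ~11-minute run emitted roughly 33.5 g CO2-eq, about half the previous row's 80 g over ~24 minutes, consistent with the same hardware running for a shorter tracked window.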
runs/Jul29_17-26-10_msc-modeltrain-pod/events.out.tfevents.1722273977.msc-modeltrain-pod.12449.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21307b2f5d605047cf945e79efa0883a46ce843dd6d3f0e797c3b0b901fc2b89
+size 17035
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:73cf781d850b973106fdd5e079cb1a7baf30c96f78fa7dac138c5e1e1cf3d9a6
+oid sha256:6b0e981c0bd2c889d8920b8e3a15191e82ae2084b87be9b4415afb23e05669c4
 size 4984
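Every binary touched by this commit (the adapter weights, the tfevents log, `training_args.bin`) is stored in git as a three-line git-lfs pointer of the form shown above: a spec-version URL, an `oid sha256:<hex>` line, and a `size <bytes>` line. A minimal parser for that format, handy when auditing which blob a given revision actually points at:

```python
# Parse a git-lfs pointer file (three "key value" lines) into structured fields.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # "oid" carries the hash algorithm as a prefix, e.g. "sha256:<64 hex chars>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }

# The new pointer for training_args.bin from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:6b0e981c0bd2c889d8920b8e3a15191e82ae2084b87be9b4415afb23e05669c4
size 4984"""

info = parse_lfs_pointer(pointer)
```

Comparing `oid` across the two revisions is what shows the commit actually swapped the blob: both pointers report `size 4984`, but the sha256 digests differ.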