Commit fcd9655 by alexander-hm · verified · 1 Parent(s): 742fd08

End of training

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.1320
+ - Loss: 2.1333
 
  ## Model description
 
@@ -50,17 +50,17 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 1.7307 | 0.0018 | 1 | 1.9683 |
- | 1.9718 | 0.3392 | 187 | 1.6087 |
- | 1.3609 | 0.6783 | 374 | 1.6290 |
- | 1.2325 | 1.0175 | 561 | 1.6883 |
- | 1.2459 | 1.3566 | 748 | 1.8018 |
- | 1.1002 | 1.6958 | 935 | 1.7779 |
- | 0.6777 | 2.0349 | 1122 | 1.9851 |
- | 0.8645 | 2.3741 | 1309 | 1.8934 |
- | 0.9807 | 2.7132 | 1496 | 2.0880 |
- | 0.5884 | 3.0524 | 1683 | 2.1729 |
- | 0.5434 | 3.3915 | 1870 | 2.1708 |
+ | 1.7307 | 0.0018 | 1 | 1.9676 |
+ | 1.9449 | 0.3392 | 187 | 1.6057 |
+ | 1.3608 | 0.6783 | 374 | 1.6356 |
+ | 1.3231 | 1.0175 | 561 | 1.6897 |
+ | 1.2619 | 1.3566 | 748 | 1.8127 |
+ | 1.0971 | 1.6958 | 935 | 1.7752 |
+ | 0.5605 | 2.0349 | 1122 | 1.9491 |
+ | 0.9008 | 2.3741 | 1309 | 1.8904 |
+ | 0.9005 | 2.7132 | 1496 | 2.0851 |
+ | 0.6184 | 3.0524 | 1683 | 2.1799 |
+ | 0.5547 | 3.3915 | 1870 | 2.1523 |
 
 
  ### Framework versions
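In the updated README table, validation loss reaches its minimum early and then rises while training loss keeps falling. The following is a minimal Python sketch, with rows copied from the new side of the diff, that locates the evaluation step with the lowest validation loss and measures how far the final row has drifted from it. The helper name `best_checkpoint` is hypothetical and is not part of the training code in this repository.

```python
# Rows copied from the "+" side of the README table:
# (step, training loss, validation loss).
rows = [
    (1, 1.7307, 1.9676),
    (187, 1.9449, 1.6057),
    (374, 1.3608, 1.6356),
    (561, 1.3231, 1.6897),
    (748, 1.2619, 1.8127),
    (935, 1.0971, 1.7752),
    (1122, 0.5605, 1.9491),
    (1309, 0.9008, 1.8904),
    (1496, 0.9005, 2.0851),
    (1683, 0.6184, 2.1799),
    (1870, 0.5547, 2.1523),
]


def best_checkpoint(table):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(table, key=lambda row: row[2])


best_step, best_train, best_val = best_checkpoint(rows)
final_step, final_train, final_val = rows[-1]

# Gap between the last logged validation loss and the best one.
gap = final_val - best_val

print(f"lowest validation loss {best_val} at step {best_step}")
print(f"final validation loss {final_val} at step {final_step} (gap {gap:.4f})")
```

A widening gap of this kind is commonly read as overfitting, although the table alone cannot confirm that; it only shows that the logged evaluation loss was lowest at an early step.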
all_results.json CHANGED
@@ -1,12 +1,12 @@
  {
  "epoch": 3.4005894355021535,
- "eval_loss": 2.131979465484619,
- "eval_runtime": 735.264,
- "eval_samples_per_second": 1.36,
- "eval_steps_per_second": 1.36,
+ "eval_loss": 2.1332547664642334,
+ "eval_runtime": 752.9138,
+ "eval_samples_per_second": 1.328,
+ "eval_steps_per_second": 1.328,
  "total_flos": 4.294050624236544e+17,
- "train_loss": 1.190023433113098,
- "train_runtime": 127280.0549,
- "train_samples_per_second": 0.236,
- "train_steps_per_second": 0.015
+ "train_loss": 1.1927156732559203,
+ "train_runtime": 129494.4046,
+ "train_samples_per_second": 0.232,
+ "train_steps_per_second": 0.014
  }
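The two versions of all_results.json differ only slightly. As a rough illustration, the sketch below computes the relative change of each changed metric and converts the new `eval_loss` to a perplexity. The perplexity step assumes `eval_loss` is a mean per-token cross-entropy in nats, which this diff does not state; `relative_change` is a hypothetical helper.

```python
import math

# Values copied from the "-" (old) and "+" (new) sides of all_results.json.
old = {
    "eval_loss": 2.131979465484619,
    "eval_runtime": 735.264,
    "train_loss": 1.190023433113098,
    "train_runtime": 127280.0549,
}
new = {
    "eval_loss": 2.1332547664642334,
    "eval_runtime": 752.9138,
    "train_loss": 1.1927156732559203,
    "train_runtime": 129494.4046,
}


def relative_change(before, after):
    """Fractional change from `before` to `after` (hypothetical helper)."""
    return (after - before) / before


changes = {key: relative_change(old[key], new[key]) for key in old}
for key, value in changes.items():
    print(f"{key}: {value:+.3%}")

# Assumption: eval_loss is mean cross-entropy in nats, so exp() gives perplexity.
perplexity = math.exp(new["eval_loss"])
print(f"perplexity (under that assumption): {perplexity:.2f}")
```

Every metric moves by less than a few percent, which is consistent with a rerun of the same configuration rather than a change in setup, though the diff itself does not say why the run was repeated.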
eval_results.json CHANGED
@@ -1,7 +1,7 @@
  {
  "epoch": 3.4005894355021535,
- "eval_loss": 2.131979465484619,
- "eval_runtime": 735.264,
- "eval_samples_per_second": 1.36,
- "eval_steps_per_second": 1.36
+ "eval_loss": 2.1332547664642334,
+ "eval_runtime": 752.9138,
+ "eval_samples_per_second": 1.328,
+ "eval_steps_per_second": 1.328
  }
metrics.json CHANGED
@@ -1 +1 @@
- {"run_name": "google/gemma-7b_oasst1_l0.0002_32,8,8,8,8", "train_runtime": 127280.0549, "train_samples_per_second": 0.236, "train_steps_per_second": 0.015, "total_flos": 4.294050624236544e+17, "train_loss": 1.190023433113098, "epoch": 3.4005894355021535, "eval_loss": 2.131979465484619, "eval_runtime": 735.264, "eval_samples_per_second": 1.36, "eval_steps_per_second": 1.36}
+ {"run_name": "google/gemma-7b_oasst1_l0.0002_32,8,8,8,8", "train_runtime": 129494.4046, "train_samples_per_second": 0.232, "train_steps_per_second": 0.014, "total_flos": 4.294050624236544e+17, "train_loss": 1.1927156732559203, "epoch": 3.4005894355021535, "eval_loss": 2.1332547664642334, "eval_runtime": 752.9138, "eval_samples_per_second": 1.328, "eval_steps_per_second": 1.328}
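The new metrics.json line appears to carry the same values as all_results.json plus a `run_name` field. A small sketch of that consistency check, using only values copied from this diff, might look like the following; the variable names are illustrative and not taken from the repository.

```python
import json

# New metrics.json line, copied from the "+" side of the diff.
metrics_line = (
    '{"run_name": "google/gemma-7b_oasst1_l0.0002_32,8,8,8,8", '
    '"train_runtime": 129494.4046, "train_samples_per_second": 0.232, '
    '"train_steps_per_second": 0.014, "total_flos": 4.294050624236544e+17, '
    '"train_loss": 1.1927156732559203, "epoch": 3.4005894355021535, '
    '"eval_loss": 2.1332547664642334, "eval_runtime": 752.9138, '
    '"eval_samples_per_second": 1.328, "eval_steps_per_second": 1.328}'
)

# New all_results.json contents, copied from the "+" side of the diff.
all_results = {
    "epoch": 3.4005894355021535,
    "eval_loss": 2.1332547664642334,
    "eval_runtime": 752.9138,
    "eval_samples_per_second": 1.328,
    "eval_steps_per_second": 1.328,
    "total_flos": 4.294050624236544e+17,
    "train_loss": 1.1927156732559203,
    "train_runtime": 129494.4046,
    "train_samples_per_second": 0.232,
    "train_steps_per_second": 0.014,
}

metrics = json.loads(metrics_line)

# Keys present only in metrics.json, and keys whose values disagree.
extra_keys = sorted(set(metrics) - set(all_results))
mismatched = sorted(k for k in all_results if metrics.get(k) != all_results[k])

print("only in metrics.json:", extra_keys)
print("mismatched values:", mismatched)
```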
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 3.4005894355021535,
  "total_flos": 4.294050624236544e+17,
- "train_loss": 1.190023433113098,
- "train_runtime": 127280.0549,
- "train_samples_per_second": 0.236,
- "train_steps_per_second": 0.015
+ "train_loss": 1.1927156732559203,
+ "train_runtime": 129494.4046,
+ "train_samples_per_second": 0.232,
+ "train_steps_per_second": 0.014
  }
trainer_state.json CHANGED
The diff for this file is too large to render.