GlycerinLOL committed on
Commit
0b24e67
1 Parent(s): c103420

End of training
README.md CHANGED
@@ -1,5 +1,4 @@
 ---
-base_model: google/pegasus-xsum
 tags:
 - generated_from_trainer
 metrics:
@@ -14,14 +13,14 @@ should probably proofread and complete it, then remove this comment. -->
 
 # LLM_Teached_Pegasus
 
-This model is a fine-tuned version of [google/pegasus-xsum](https://huggingface.co/google/pegasus-xsum) on an unknown dataset.
+This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7905
-- Rouge1: 0.4388
-- Rouge2: 0.1916
-- Rougel: 0.3479
-- Rougelsum: 0.3476
-- Gen Len: 28.7182
+- Loss: 1.6452
+- Rouge1: 0.4595
+- Rouge2: 0.2033
+- Rougel: 0.3629
+- Rougelsum: 0.3628
+- Gen Len: 30.8536
 
 ## Model description
 
@@ -41,9 +40,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 16
+- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 2
@@ -53,8 +54,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| 2.0092        | 1.0   | 1250 | 1.8228          | 0.4351 | 0.188  | 0.3414 | 0.3411    | 28.7045 |
-| 1.8992        | 2.0   | 2500 | 1.7905          | 0.4388 | 0.1916 | 0.3479 | 0.3476    | 28.7182 |
+| 1.7637        | 1.0   | 625  | 1.6549          | 0.4591 | 0.205  | 0.3628 | 0.3628    | 30.8636 |
+| 1.7226        | 2.0   | 1250 | 1.6452          | 0.4595 | 0.2033 | 0.3629 | 0.3628    | 30.8536 |
 
 
 ### Framework versions
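A note on the hyperparameter change above: the commit halves train_batch_size (16 to 8) but adds gradient_accumulation_steps: 4, so the effective batch actually doubles. A minimal sketch of that arithmetic, using only the values from the diff:

```python
# Values taken from the updated hyperparameters in the diff above.
per_device_batch_size = 8        # train_batch_size after the change
gradient_accumulation_steps = 4  # newly added in this commit

# Gradients are accumulated over 4 micro-batches before each optimizer
# step, so the effective batch per optimizer step is their product.
total_train_batch_size = per_device_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```

This also explains the training table: optimizer steps per epoch drop from 1250 to 625, consistent with the effective batch size doubling from 16 to 32 over the same data.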
generation_config.json CHANGED
@@ -3,8 +3,8 @@
   "decoder_start_token_id": 0,
   "eos_token_id": 1,
   "forced_eos_token_id": 1,
-  "length_penalty": 0.6,
-  "max_length": 64,
+  "length_penalty": 0.8,
+  "max_length": 256,
   "num_beams": 8,
   "pad_token_id": 0,
   "transformers_version": "4.36.0"
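The generation_config change raises length_penalty from 0.6 to 0.8 alongside the larger max_length. A minimal sketch, assuming the standard length normalization used in Hugging Face beam search (finished hypotheses ranked by total log-probability divided by length ** length_penalty), of why a larger penalty favors longer outputs; the log-probabilities here are made up for illustration:

```python
def beam_score(sum_logprob: float, length: int, length_penalty: float) -> float:
    # Log-probabilities are negative, so dividing by a larger
    # length ** length_penalty pulls the score closer to zero,
    # boosting longer hypotheses more than shorter ones.
    return sum_logprob / (length ** length_penalty)

short_hyp = (-10.0, 20)  # (total log-prob, token length) -- illustrative
long_hyp = (-14.0, 40)

for penalty in (0.6, 0.8):
    gap = beam_score(*long_hyp, penalty) - beam_score(*short_hyp, penalty)
    print(f"length_penalty={penalty}: long-minus-short score gap = {gap:.3f}")
```

The gap grows with the penalty, which fits the longer average Gen Len reported in the updated README (28.7 to 30.9 tokens).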
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1609b09f05b64a32b53d0ad81dd41f022a9854f7f32b8cd32a0ea25e24042cea
+oid sha256:ab52b4b7f1b050bd880fa12529d3c148fdd68fce219a518cfcb8e630a158c091
 size 2283652852
runs/Dec28_11-12-24_n4bcoectr1703727001286-fmclw/events.out.tfevents.1703733148.n4bcoectr1703727001286-fmclw.64933.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a65f50becec57e756077bf6804400342a4c6426d8b155c0eba55c0325aef9fe5
-size 7836
+oid sha256:2149872a8dbcce347ede2e31bda798bf8f519bb64566837cdbe635dbd35b028b
+size 8715