nightner committed (verified) on nightner/roberta2roberta_financial_lora_test
Commit 689e8a1 · Parent(s): e0bf3cc

Files changed (3):
1. README.md +22 -4
2. adapter_model.safetensors +1 -1
3. training_args.bin +1 -1
README.md CHANGED
@@ -4,6 +4,8 @@ license: apache-2.0
 base_model: google/roberta2roberta_L-24_cnn_daily_mail
 tags:
 - generated_from_trainer
+metrics:
+- rouge
 model-index:
 - name: results
   results: []
@@ -15,6 +17,14 @@ should probably proofread and complete it, then remove this comment. -->
 # results
 
 This model is a fine-tuned version of [google/roberta2roberta_L-24_cnn_daily_mail](https://huggingface.co/google/roberta2roberta_L-24_cnn_daily_mail) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.7671
+- Rouge1: 46.04
+- Rouge2: 24.94
+- Rougel: 33.78
+- Bertscore P: 87.25
+- Bertscore R: 85.03
+- Bertscore F1: 86.02
 
 ## Model description
 
@@ -34,18 +44,26 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
-- train_batch_size: 1
-- eval_batch_size: 1
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 4
+- total_train_batch_size: 8
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 2
+- num_epochs: 5
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Bertscore P | Bertscore R | Bertscore F1 |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:-----------:|:-----------:|:------------:|
+| No log        | 0.8   | 10   | 4.1610          | 43.72  | 22.28  | 31.23  | 86.78       | 84.54       | 85.57        |
+| No log        | 1.56  | 20   | 4.0149          | 45.37  | 23.55  | 32.49  | 86.84       | 84.74       | 85.69        |
+| No log        | 2.32  | 30   | 3.8893          | 45.05  | 24.14  | 32.83  | 87.15       | 84.71       | 85.82        |
+| No log        | 3.08  | 40   | 3.8113          | 45.1   | 24.05  | 32.37  | 87.11       | 85.08       | 85.98        |
+| 18.8832       | 3.88  | 50   | 3.7732          | 45.98  | 24.85  | 33.82  | 87.13       | 85.03       | 85.96        |
+| 18.8832       | 4.64  | 60   | 3.7671          | 46.04  | 24.94  | 33.78  | 87.25       | 85.03       | 86.02        |
 
 ### Framework versions
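The updated hyperparameters determine the effective batch size and warmup schedule. A minimal sketch of that arithmetic; note that `NUM_TRAIN_SAMPLES` is a hypothetical value (the dataset size is not stated in the card) chosen only so the step counts roughly match the training-results table:

```python
import math

# Hyperparameters from the model card diff (new values).
train_batch_size = 2
gradient_accumulation_steps = 4
num_epochs = 5
warmup_ratio = 0.1

# Effective (total) train batch size: per-device batch size times
# the number of gradient-accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Hypothetical dataset size, NOT from the card; picked so that
# ~13 optimizer steps per epoch line up with the results table.
NUM_TRAIN_SAMPLES = 103
steps_per_epoch = math.ceil(NUM_TRAIN_SAMPLES / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(total_train_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

With these numbers the cosine schedule would warm up for roughly the first 7 of ~65 optimizer steps, which is consistent with the table ending near step 60 at epoch 4.64.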
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f20a5878c5a29172631a6d46392abdf34ad993f38f6d82c722b0de4039d4bff3
+oid sha256:a8d59f86d472fff550241f97e76b627a0bb06b176e8abbd4eedb26b1dbd72995
 size 12611432
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8dc5f579cc66a9036edaf3ee5dd83e61a062d15ef795941b759c9960498df95c
+oid sha256:9be7a562e9528b9f6ee315e6d75d62f3ea6982b49644339a058467afe79ae474
 size 5432