galkowskim committed
Commit ef7f77c
Parent(s): d03354e
Training complete
Files changed:
- README.md +53 -0
- test_metrics.json +4 -0
- train_losses.csv +111 -0
README.md
ADDED
@@ -0,0 +1,53 @@
+---
+license: apache-2.0
+base_model: allenai/longformer-base-4096
+tags:
+- generated_from_trainer
+model-index:
+- name: longformer_base_4096_QA_SQUAD
+  results: []
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# longformer_base_4096_QA_SQUAD
+
+This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on an unknown dataset.
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 5
+- mixed_precision_training: Native AMP
+
+### Training results
+
+
+
+### Framework versions
+
+- Transformers 4.40.0
+- Pytorch 2.2.1
+- Datasets 2.19.0
+- Tokenizers 0.19.1
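
For reference, the hyperparameters listed in the model card map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a minimal sketch, not the training script from this commit; `output_dir` is an assumed placeholder, and the Adam betas/epsilon shown are the library defaults that the auto-generated card reports.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments setup matching the card above (assumed, not
# the actual training script, which is not part of this commit).
training_args = TrainingArguments(
    output_dir="longformer_base_4096_QA_SQUAD",  # placeholder name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999), eps=1e-08:
    adam_beta2=0.999,             # these are the transformers defaults
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                    # "Native AMP" mixed precision
)
```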
test_metrics.json
ADDED
@@ -0,0 +1,4 @@
+{
+    "exact_match": 84.68306527909176,
+    "f1": 91.79238618680928
+}
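
The two numbers in `test_metrics.json` are the standard SQuAD metrics (exact match and token-level F1, both on a 0-100 scale). A minimal sketch of how such scores are computed with the `evaluate` library's `"squad"` metric, using toy inputs rather than the actual evaluation data:

```python
import evaluate

# The "squad" metric reports the same two fields stored in
# test_metrics.json: exact_match and f1 (0-100 scale).
squad_metric = evaluate.load("squad")

# Toy example illustrating the expected input format only.
predictions = [{"id": "q1", "prediction_text": "Warsaw"}]
references = [{"id": "q1", "answers": {"text": ["Warsaw"], "answer_start": [0]}}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```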
train_losses.csv
ADDED
@@ -0,0 +1,111 @@
+loss,epoch
+2.2082,0.045662100456621
+1.2748,0.091324200913242
+1.1634,0.136986301369863
+1.0932,0.182648401826484
+1.0833,0.228310502283105
+1.0057,0.273972602739726
+0.981,0.319634703196347
+1.0403,0.365296803652968
+0.9965,0.410958904109589
+0.983,0.45662100456621
+0.9687,0.502283105022831
+1.0,0.547945205479452
+0.9663,0.593607305936073
+0.9629,0.639269406392694
+0.8853,0.684931506849315
+0.9325,0.730593607305936
+0.9136,0.776255707762557
+0.9216,0.821917808219178
+0.9099,0.867579908675799
+0.8797,0.91324200913242
+0.8632,0.958904109589041
+0.8619,1.004566210045662
+0.7005,1.0502283105022832
+0.6824,1.095890410958904
+0.7055,1.1415525114155252
+0.6921,1.187214611872146
+0.7235,1.2328767123287672
+0.6818,1.278538812785388
+0.7069,1.3242009132420092
+0.679,1.36986301369863
+0.6634,1.4155251141552512
+0.7179,1.461187214611872
+0.6772,1.5068493150684932
+0.7062,1.5525114155251143
+0.7144,1.5981735159817352
+0.6827,1.643835616438356
+0.6945,1.6894977168949772
+0.6747,1.7351598173515983
+0.6775,1.7808219178082192
+0.681,1.82648401826484
+0.7223,1.8721461187214612
+0.6732,1.9178082191780823
+0.7043,1.9634703196347032
+0.668,2.009132420091324
+0.5202,2.0547945205479454
+0.5198,2.1004566210045663
+0.5111,2.146118721461187
+0.4963,2.191780821917808
+0.5084,2.237442922374429
+0.5117,2.2831050228310503
+0.5256,2.328767123287671
+0.5166,2.374429223744292
+0.5223,2.4200913242009134
+0.5246,2.4657534246575343
+0.5214,2.5114155251141552
+0.5277,2.557077625570776
+0.4875,2.602739726027397
+0.517,2.6484018264840183
+0.5523,2.6940639269406392
+0.5067,2.73972602739726
+0.522,2.7853881278538815
+0.5136,2.8310502283105023
+0.517,2.8767123287671232
+0.4843,2.922374429223744
+0.5241,2.968036529680365
+0.4638,3.0136986301369864
+0.3786,3.0593607305936072
+0.3915,3.105022831050228
+0.384,3.1506849315068495
+0.3935,3.1963470319634704
+0.3693,3.2420091324200913
+0.3741,3.287671232876712
+0.3624,3.3333333333333335
+0.3793,3.3789954337899544
+0.3854,3.4246575342465753
+0.3777,3.470319634703196
+0.3866,3.5159817351598175
+0.4034,3.5616438356164384
+0.409,3.6073059360730593
+0.3762,3.65296803652968
+0.3732,3.6986301369863015
+0.3622,3.7442922374429224
+0.3641,3.7899543378995433
+0.3814,3.8356164383561646
+0.3862,3.8812785388127855
+0.3941,3.9269406392694064
+0.3603,3.9726027397260273
+0.3511,4.018264840182648
+0.2943,4.063926940639269
+0.2776,4.109589041095891
+0.2889,4.155251141552512
+0.2925,4.200913242009133
+0.29,4.2465753424657535
+0.2784,4.292237442922374
+0.2877,4.337899543378995
+0.2766,4.383561643835616
+0.3058,4.429223744292237
+0.2767,4.474885844748858
+0.2683,4.52054794520548
+0.2954,4.566210045662101
+0.2652,4.6118721461187215
+0.2898,4.657534246575342
+0.2987,4.703196347031963
+0.3012,4.748858447488584
+0.2832,4.794520547945205
+0.2981,4.840182648401827
+0.3067,4.885844748858448
+0.2899,4.931506849315069
+0.3128,4.9771689497716896
+0.5844105368139537,5.0
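
The CSV logs training loss against fractional epoch. A minimal sketch for plotting it, assuming `train_losses.csv` from this commit is in the working directory:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Plot the logged training loss curve from the CSV above.
df = pd.read_csv("train_losses.csv")
plt.plot(df["epoch"], df["loss"])
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.title("longformer_base_4096_QA_SQUAD training loss")
plt.show()
```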