
longformer-spans

This model is a fine-tuned version of allenai/longformer-base-4096 on the essays_su_g dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5306
  • Accuracy: 0.9275

| Label | Precision | Recall | F1-score | Support |
|:---|:---|:---|:---|:---|
| B | 0.8434 | 0.8888 | 0.8655 | 1133 |
| I | 0.9325 | 0.9646 | 0.9483 | 18333 |
| O | 0.9283 | 0.8631 | 0.8945 | 9868 |
| Macro avg | 0.9014 | 0.9055 | 0.9027 | 29334 |
| Weighted avg | 0.9276 | 0.9275 | 0.9270 | 29334 |
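The B, I, and O classes above are per-token tags, which here are read with the standard BIO semantics: B opens a span, I continues it, and O marks tokens outside any span (this reading is an assumption from the label names; the model's id-to-label mapping in its config is authoritative). A minimal sketch of collapsing such a tag sequence back into token-index spans:

```python
def bio_to_spans(labels):
    """Collapse a per-token B/I/O label sequence into (start, end) token
    index pairs, end exclusive. Assumes standard BIO semantics; an I
    without a preceding B is tolerated and starts a new span."""
    spans = []
    start = None
    for i, lab in enumerate(labels):
        if lab == "B":
            if start is not None:      # a new B closes any open span
                spans.append((start, i))
            start = i
        elif lab == "I":
            if start is None:          # stray I: open a span anyway
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:              # span running to the end
        spans.append((start, len(labels)))
    return spans

print(bio_to_spans(["O", "B", "I", "I", "O", "B", "I"]))  # [(1, 4), (5, 7)]
```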

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
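The hyperparameters above map onto a 🤗 Transformers `TrainingArguments` configuration along these lines (a sketch, not the exact training script; the `output_dir` value is a placeholder, and data loading and the `Trainer` setup are omitted):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters reported above; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="longformer-spans",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```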

Training results

| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| No log | 1.0 | 81 | 0.2620 | {'precision': 0.7461594732991953, 'recall': 0.9002647837599294, 'f1-score': 0.816, 'support': 1133.0} | {'precision': 0.9024103768767235, 'recall': 0.9638902525500463, 'f1-score': 0.9321376763813793, 'support': 18333.0} | {'precision': 0.931782945736434, 'recall': 0.7917511147142278, 'f1-score': 0.8560784528570645, 'support': 9868.0} | 0.9035 | {'precision': 0.860117598637451, 'recall': 0.8853020503414012, 'f1-score': 0.8680720430794812, 'support': 29334.0} | {'precision': 0.9062562975065145, 'recall': 0.9035249198881844, 'f1-score': 0.9020655278480035, 'support': 29334.0} |
| No log | 2.0 | 162 | 0.2253 | {'precision': 0.8167072181670721, 'recall': 0.8887908208296558, 'f1-score': 0.8512256973795435, 'support': 1133.0} | {'precision': 0.9152551099212274, 'recall': 0.9696721758577429, 'f1-score': 0.9416781438711729, 'support': 18333.0} | {'precision': 0.9380041484212952, 'recall': 0.8248885285772193, 'f1-score': 0.8778173190984578, 'support': 9868.0} | 0.9178 | {'precision': 0.8899888255031981, 'recall': 0.8944505084215394, 'f1-score': 0.8902403867830581, 'support': 29334.0} | {'precision': 0.9191015935430046, 'recall': 0.9178427763005387, 'f1-score': 0.9167016237671239, 'support': 29334.0} |
| No log | 3.0 | 243 | 0.2279 | {'precision': 0.8050117462803446, 'recall': 0.9073256840247131, 'f1-score': 0.8531120331950207, 'support': 1133.0} | {'precision': 0.9280963603037444, 'recall': 0.9666721213112965, 'f1-score': 0.9469915571230095, 'support': 18333.0} | {'precision': 0.9353938852934612, 'recall': 0.8495135792460479, 'f1-score': 0.8903876792352629, 'support': 9868.0} | 0.9250 | {'precision': 0.8895006639591835, 'recall': 0.9078371281940192, 'f1-score': 0.8968304231844311, 'support': 29334.0} | {'precision': 0.9257972230878861, 'recall': 0.9249676143724006, 'f1-score': 0.9243239165827936, 'support': 29334.0} |
| No log | 4.0 | 324 | 0.2390 | {'precision': 0.8217179902755267, 'recall': 0.8949691085613416, 'f1-score': 0.8567807351077312, 'support': 1133.0} | {'precision': 0.9432635621180161, 'recall': 0.9512900234549719, 'f1-score': 0.9472597903427299, 'support': 18333.0} | {'precision': 0.9099989595255437, 'recall': 0.8862991487636805, 'f1-score': 0.8979927100980543, 'support': 9868.0} | 0.9273 | {'precision': 0.8916601706396955, 'recall': 0.910852760259998, 'f1-score': 0.9006777451828384, 'support': 29334.0} | {'precision': 0.9273787107073643, 'recall': 0.9272516533715143, 'f1-score': 0.9271915992526735, 'support': 29334.0} |
| No log | 5.0 | 405 | 0.2539 | {'precision': 0.8431703204047217, 'recall': 0.8826125330979699, 'f1-score': 0.8624407072013798, 'support': 1133.0} | {'precision': 0.9335059992600032, 'recall': 0.9633447880870561, 'f1-score': 0.948190701170407, 'support': 18333.0} | {'precision': 0.9265359193845487, 'recall': 0.8665383056343737, 'f1-score': 0.8955333298423835, 'support': 9868.0} | 0.9277 | {'precision': 0.9010707463497579, 'recall': 0.9041652089397999, 'f1-score': 0.9020549127380568, 'support': 29334.0} | {'precision': 0.9276721180179627, 'recall': 0.9276607349832958, 'f1-score': 0.9271646670996412, 'support': 29334.0} |
| No log | 6.0 | 486 | 0.2930 | {'precision': 0.841927303465765, 'recall': 0.8790820829655781, 'f1-score': 0.8601036269430052, 'support': 1133.0} | {'precision': 0.9452679589509693, 'recall': 0.9495990836197021, 'f1-score': 0.9474285714285714, 'support': 18333.0} | {'precision': 0.9045613314156564, 'recall': 0.8922780705310093, 'f1-score': 0.8983777165595348, 'support': 9868.0} | 0.9276 | {'precision': 0.8972521979441302, 'recall': 0.9069864123720964, 'f1-score': 0.9019699716437038, 'support': 29334.0} | {'precision': 0.9275827485063247, 'recall': 0.9275925547146656, 'f1-score': 0.9275549436263691, 'support': 29334.0} |
| 0.1621 | 7.0 | 567 | 0.3149 | {'precision': 0.8406639004149378, 'recall': 0.8940864960282436, 'f1-score': 0.8665526090675792, 'support': 1133.0} | {'precision': 0.9382959450098577, 'recall': 0.9605083728795069, 'f1-score': 0.9492722371967655, 'support': 18333.0} | {'precision': 0.9227729117709891, 'recall': 0.8754560194568302, 'f1-score': 0.8984919396775871, 'support': 9868.0} | 0.9293 | {'precision': 0.9005775857319281, 'recall': 0.9100169627881934, 'f1-score': 0.9047722619806439, 'support': 29334.0} | {'precision': 0.9293030221719495, 'recall': 0.9293311515647371, 'f1-score': 0.9289946986889036, 'support': 29334.0} |
| 0.1621 | 8.0 | 648 | 0.3477 | {'precision': 0.8284552845528456, 'recall': 0.8993821712268314, 'f1-score': 0.8624629707998307, 'support': 1133.0} | {'precision': 0.9356556940449557, 'recall': 0.9581628756886489, 'f1-score': 0.9467755410030451, 'support': 18333.0} | {'precision': 0.9191854233654877, 'recall': 0.8690717470612079, 'f1-score': 0.8934263985831857, 'support': 9868.0} | 0.9259 | {'precision': 0.8944321339877629, 'recall': 0.9088722646588961, 'f1-score': 0.9008883034620205, 'support': 29334.0} | {'precision': 0.9259745494680297, 'recall': 0.9259221381332242, 'f1-score': 0.9255723133682386, 'support': 29334.0} |
| 0.1621 | 9.0 | 729 | 0.3808 | {'precision': 0.8464135021097047, 'recall': 0.8852603706972639, 'f1-score': 0.8654012079378774, 'support': 1133.0} | {'precision': 0.9316216786166175, 'recall': 0.9638902525500463, 'f1-score': 0.9474813007694164, 'support': 18333.0} | {'precision': 0.9268053588933667, 'recall': 0.8622821240372922, 'f1-score': 0.8933802299333298, 'support': 9868.0} | 0.9267 | {'precision': 0.9016135132065629, 'recall': 0.9038109157615342, 'f1-score': 0.9020875795468745, 'support': 29334.0} | {'precision': 0.9267103706800465, 'recall': 0.9266721210881571, 'f1-score': 0.9261113508073029, 'support': 29334.0} |
| 0.1621 | 10.0 | 810 | 0.4663 | {'precision': 0.8380872483221476, 'recall': 0.881729920564872, 'f1-score': 0.8593548387096774, 'support': 1133.0} | {'precision': 0.9158687080751703, 'recall': 0.9756177385043364, 'f1-score': 0.9447995351539802, 'support': 18333.0} | {'precision': 0.9469406710786021, 'recall': 0.8265099310903932, 'f1-score': 0.8826362209837131, 'support': 9868.0} | 0.9218 | {'precision': 0.9002988758253068, 'recall': 0.8946191967198672, 'f1-score': 0.8955968649491236, 'support': 29334.0} | {'precision': 0.9233171207368492, 'recall': 0.9218313220154087, 'f1-score': 0.9205874800198834, 'support': 29334.0} |
| 0.1621 | 11.0 | 891 | 0.3998 | {'precision': 0.8421052631578947, 'recall': 0.8755516328331863, 'f1-score': 0.8585028126352229, 'support': 1133.0} | {'precision': 0.941814648890808, 'recall': 0.9517809414716631, 'f1-score': 0.9467715680954965, 'support': 18333.0} | {'precision': 0.906636203136359, 'recall': 0.8846777462505067, 'f1-score': 0.8955223880597014, 'support': 9868.0} | 0.9263 | {'precision': 0.8968520383950206, 'recall': 0.9040034401851186, 'f1-score': 0.9002655895968069, 'support': 29334.0} | {'precision': 0.9261293813943774, 'recall': 0.9262630394763756, 'f1-score': 0.9261219666592889, 'support': 29334.0} |
| 0.1621 | 12.0 | 972 | 0.4524 | {'precision': 0.8503401360544217, 'recall': 0.8826125330979699, 'f1-score': 0.8661758336942399, 'support': 1133.0} | {'precision': 0.9383342231713828, 'recall': 0.9586537937053401, 'f1-score': 0.9483851819874267, 'support': 18333.0} | {'precision': 0.9182223165040305, 'recall': 0.8772800972841508, 'f1-score': 0.8972844112769486, 'support': 9868.0} | 0.9283 | {'precision': 0.9022988919099451, 'recall': 0.906182141362487, 'f1-score': 0.9039484756528716, 'support': 29334.0} | {'precision': 0.9281698543264606, 'recall': 0.9283425376695984, 'f1-score': 0.9280195449455239, 'support': 29334.0} |
| 0.0212 | 13.0 | 1053 | 0.4537 | {'precision': 0.8431703204047217, 'recall': 0.8826125330979699, 'f1-score': 0.8624407072013798, 'support': 1133.0} | {'precision': 0.9365968111768783, 'recall': 0.9580537827960508, 'f1-score': 0.94720379658092, 'support': 18333.0} | {'precision': 0.9167642362959021, 'recall': 0.8728212403729225, 'f1-score': 0.8942532315838654, 'support': 9868.0} | 0.9265 | {'precision': 0.8988437892925006, 'recall': 0.9044958520889811, 'f1-score': 0.9012992451220551, 'support': 29334.0} | {'precision': 0.926316588126141, 'recall': 0.9264675802822663, 'f1-score': 0.9261172500595471, 'support': 29334.0} |
| 0.0212 | 14.0 | 1134 | 0.4902 | {'precision': 0.8573883161512027, 'recall': 0.880847308031774, 'f1-score': 0.8689595124074879, 'support': 1133.0} | {'precision': 0.9300970873786408, 'recall': 0.9667266677575956, 'f1-score': 0.9480582004921365, 'support': 18333.0} | {'precision': 0.9303346132748217, 'recall': 0.8593433319821646, 'f1-score': 0.8934309645472265, 'support': 9868.0} | 0.9273 | {'precision': 0.9059400056015551, 'recall': 0.9023057692571781, 'f1-score': 0.9034828924822836, 'support': 29334.0} | {'precision': 0.9273686789700647, 'recall': 0.9272857435058294, 'f1-score': 0.9266264019680934, 'support': 29334.0} |
| 0.0212 | 15.0 | 1215 | 0.4631 | {'precision': 0.8514090520922288, 'recall': 0.8799646954986761, 'f1-score': 0.865451388888889, 'support': 1133.0} | {'precision': 0.943136407819419, 'recall': 0.9526536846124475, 'f1-score': 0.9478711568207105, 'support': 18333.0} | {'precision': 0.9084499740798341, 'recall': 0.8879205512768544, 'f1-score': 0.8980679546968688, 'support': 9868.0} | 0.9281 | {'precision': 0.9009984779971606, 'recall': 0.9068463104626593, 'f1-score': 0.9037968334688228, 'support': 29334.0} | {'precision': 0.9279249527781314, 'recall': 0.9280698165950774, 'f1-score': 0.9279338964530544, 'support': 29334.0} |
| 0.0212 | 16.0 | 1296 | 0.4685 | {'precision': 0.8621291448516579, 'recall': 0.8720211827007943, 'f1-score': 0.8670469504168494, 'support': 1133.0} | {'precision': 0.9403208556149732, 'recall': 0.9591447117220313, 'f1-score': 0.949639510706667, 'support': 18333.0} | {'precision': 0.917685497470489, 'recall': 0.8823469801378192, 'f1-score': 0.8996693531721429, 'support': 9868.0} | 0.9299 | {'precision': 0.9067118326457067, 'recall': 0.9045042915202149, 'f1-score': 0.9054519380985532, 'support': 29334.0} | {'precision': 0.9296862022276206, 'recall': 0.9299447739824095, 'f1-score': 0.9296394123443894, 'support': 29334.0} |
| 0.0212 | 17.0 | 1377 | 0.5305 | {'precision': 0.8462823725981621, 'recall': 0.8940864960282436, 'f1-score': 0.8695278969957082, 'support': 1133.0} | {'precision': 0.9287246847035429, 'recall': 0.9680357824687722, 'f1-score': 0.9479728646973986, 'support': 18333.0} | {'precision': 0.9344262295081968, 'recall': 0.8548844750709363, 'f1-score': 0.892887383573243, 'support': 9868.0} | 0.9271 | {'precision': 0.9031444289366339, 'recall': 0.9056689178559841, 'f1-score': 0.9034627150887832, 'support': 29334.0} | {'precision': 0.927458430681484, 'recall': 0.9271152928342538, 'f1-score': 0.9264121612086422, 'support': 29334.0} |
| 0.0212 | 18.0 | 1458 | 0.5198 | {'precision': 0.847972972972973, 'recall': 0.8861429832303619, 'f1-score': 0.8666378938282262, 'support': 1133.0} | {'precision': 0.9337531086300862, 'recall': 0.9625811378388698, 'f1-score': 0.9479480017189514, 'support': 18333.0} | {'precision': 0.9244406010161064, 'recall': 0.866639643291447, 'f1-score': 0.8946074585490874, 'support': 9868.0} | 0.9274 | {'precision': 0.9020555608730553, 'recall': 0.9051212547868929, 'f1-score': 0.9030644513654217, 'support': 29334.0} | {'precision': 0.9273071851680879, 'recall': 0.9273539237744597, 'f1-score': 0.9268636343554685, 'support': 29334.0} |
| 0.0055 | 19.0 | 1539 | 0.5277 | {'precision': 0.8447986577181208, 'recall': 0.8887908208296558, 'f1-score': 0.8662365591397849, 'support': 1133.0} | {'precision': 0.9328933474128828, 'recall': 0.9637811596574484, 'f1-score': 0.9480857457140558, 'support': 18333.0} | {'precision': 0.9267550532492936, 'recall': 0.8642075395216863, 'f1-score': 0.8943890928159414, 'support': 9868.0} | 0.9274 | {'precision': 0.9014823527934324, 'recall': 0.9055931733362635, 'f1-score': 0.9029037992232607, 'support': 29334.0} | {'precision': 0.9274258363257326, 'recall': 0.9273880139087748, 'f1-score': 0.9268607610823233, 'support': 29334.0} |
| 0.0055 | 20.0 | 1620 | 0.5306 | {'precision': 0.8433835845896147, 'recall': 0.8887908208296558, 'f1-score': 0.8654920498495917, 'support': 1133.0} | {'precision': 0.9324545214869496, 'recall': 0.9645993563519337, 'f1-score': 0.9482545981017748, 'support': 18333.0} | {'precision': 0.9282833787465941, 'recall': 0.8630928252938792, 'f1-score': 0.8945019167148034, 'support': 9868.0} | 0.9275 | {'precision': 0.9013738282743861, 'recall': 0.9054943341584897, 'f1-score': 0.90274952155539, 'support': 29334.0} | {'precision': 0.9276110562907095, 'recall': 0.9275243744460353, 'f1-score': 0.9269754876123645, 'support': 29334.0} |
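As a sanity check on the summary rows: the macro average is the unweighted mean of the per-class F1 scores, while the weighted average weights each class by its support. Both can be reproduced from the final-epoch per-class numbers:

```python
# Final-epoch (epoch 20) per-class F1 scores and supports from the table above.
f1 = {"B": 0.8654920498495917, "I": 0.9482545981017748, "O": 0.8945019167148034}
support = {"B": 1133, "I": 18333, "O": 9868}

# Macro: simple mean over classes, ignoring class frequency.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted: mean over classes, each weighted by its support.
total = sum(support.values())
weighted_f1 = sum(f1[k] * support[k] for k in f1) / total

print(round(macro_f1, 4))     # 0.9027, matching the reported macro avg
print(round(weighted_f1, 4))  # 0.927, matching the reported weighted avg
```

The gap between the two (0.9027 vs. 0.9270) reflects the class imbalance: the frequent I class scores highest, pulling the weighted average up.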

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2