---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[80%:100%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9362395452405953
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
After the final epoch (epoch 8, step 328), it achieves the following results on the evaluation set (`train[80%:100%]` of the `spans` configuration):
- Loss: 0.1974
- B: {'precision': 0.8404351767905711, 'recall': 0.8887823585810163, 'f1-score': 0.863932898415657, 'support': 1043.0}
- I: {'precision': 0.9420745397395599, 'recall': 0.9673775216138328, 'f1-score': 0.954558380253654, 'support': 17350.0}
- O: {'precision': 0.9364367816091954, 'recall': 0.8830479080858443, 'f1-score': 0.9089590538882071, 'support': 9226.0}
- Accuracy: 0.9362
- Macro avg: {'precision': 0.9063154993797754, 'recall': 0.9130692627602311, 'f1-score': 0.9091501108525061, 'support': 27619.0}
- Weighted avg: {'precision': 0.9363529780585962, 'recall': 0.9362395452405953, 'f1-score': 0.9359037670307043, 'support': 27619.0}
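
The macro and weighted averages above follow from the per-class rows. The sketch below recomputes them in plain Python from the reported precision, recall, and support values. The helper names (`f1`, `macro_avg`, `weighted_avg`) are illustrative and not part of the training code.

```python
# Per-class scores copied from the evaluation results above.
per_class = {
    "B": {"precision": 0.8404351767905711, "recall": 0.8887823585810163, "support": 1043},
    "I": {"precision": 0.9420745397395599, "recall": 0.9673775216138328, "support": 17350},
    "O": {"precision": 0.9364367816091954, "recall": 0.8830479080858443, "support": 9226},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(scores, key):
    """Unweighted mean of one metric over all classes."""
    return sum(s[key] for s in scores.values()) / len(scores)


def weighted_avg(scores, key):
    """Mean of one metric over all classes, weighted by class support."""
    total = sum(s["support"] for s in scores.values())
    return sum(s[key] * s["support"] for s in scores.values()) / total


# Derive the per-class F1 from precision and recall.
for s in per_class.values():
    s["f1-score"] = f1(s["precision"], s["recall"])

macro_f1 = macro_avg(per_class, "f1-score")
macro_precision = macro_avg(per_class, "precision")
weighted_recall = weighted_avg(per_class, "recall")

print(f"macro F1:        {macro_f1:.4f}")
print(f"macro precision: {macro_precision:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

For single-label classification, support-weighted recall equals accuracy, which is why the weighted-average recall and the Accuracy line report the same value. The macro average weights B equally with I and O although B has far fewer examples, so it is more sensitive to the weaker B scores.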
## Model description
More information needed
## Intended uses & limitations
More information needed
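
The card does not define the B, I, and O labels. Assuming they follow the common BIO convention (B opens a span, I continues it, O is outside any span), a minimal sketch for turning per-token predictions into spans could look like this. The function name `bio_to_spans` and the example label sequence are hypothetical.

```python
def bio_to_spans(labels):
    """Group B/I/O labels into (start, end) token index ranges, end exclusive.

    A span opens at "B" and extends over the following "I" labels.
    An "I" with no open span starts a new one (a lenient choice for noisy predictions).
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif label == "I":
            if start is None:
                start = i
        else:  # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(labels)))  # close a span that runs to the end
    return spans


example = ["O", "B", "I", "I", "O", "B", "I", "B", "I", "I"]
print(bio_to_spans(example))  # [(1, 4), (5, 7), (7, 10)]
```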
## Training and evaluation data
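Evaluation uses the `train[80%:100%]` slice of the `spans` configuration of essays_su_g, which contains 27,619 labelled tokens (1,043 B, 17,350 I, 9,226 O). The training split is not recorded in this card.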
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
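
As a rough sketch, the values above can be written as keyword arguments using the `transformers.TrainingArguments` parameter names they most likely correspond to, together with the learning rate a linear schedule would produce at each step. Several points are assumptions: `output_dir` is a placeholder, the batch size is treated as per-device on a single device, and the schedule uses zero warmup steps because the card lists none.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
training_kwargs = {
    "output_dir": "longformer-spans",     # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,     # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}

# The training results table reports step 41 at epoch 1.0.
STEPS_PER_EPOCH = 41
TOTAL_STEPS = STEPS_PER_EPOCH * training_kwargs["num_train_epochs"]  # 328


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


for epoch in range(0, 9, 2):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step:3d}, lr {linear_lr(step):.2e}")
```

With the library installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`. Under these assumptions the learning rate falls from 2e-05 at step 0 to half that value at step 164 and reaches zero at step 328.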
### Training results
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.2955 | {'precision': 0.7986798679867987, 'recall': 0.46404602109300097, 'f1-score': 0.5870224378411159, 'support': 1043.0} | {'precision': 0.8854450261780105, 'recall': 0.9747550432276657, 'f1-score': 0.9279561042524005, 'support': 17350.0} | {'precision': 0.9346644761784405, 'recall': 0.8016475178842402, 'f1-score': 0.8630608553591224, 'support': 9226.0} | 0.8976 | {'precision': 0.8729297901144165, 'recall': 0.7468161940683024, 'f1-score': 0.7926797991508797, 'support': 27619.0} | {'precision': 0.8986099700829504, 'recall': 0.8976429269705637, 'f1-score': 0.893403174010308, 'support': 27619.0} |
| No log | 2.0 | 82 | 0.2031 | {'precision': 0.784197111299915, 'recall': 0.8849472674976031, 'f1-score': 0.8315315315315316, 'support': 1043.0} | {'precision': 0.9307149161518093, 'recall': 0.9724495677233429, 'f1-score': 0.9511246406223575, 'support': 17350.0} | {'precision': 0.9504450324753428, 'recall': 0.8564925211359202, 'f1-score': 0.9010262257696694, 'support': 9226.0} | 0.9304 | {'precision': 0.8884523533090224, 'recall': 0.9046297854522888, 'f1-score': 0.8945607993078529, 'support': 27619.0} | {'precision': 0.9317725932125427, 'recall': 0.9304102248452153, 'f1-score': 0.9298731982018269, 'support': 27619.0} |
| No log | 3.0 | 123 | 0.1754 | {'precision': 0.8527204502814258, 'recall': 0.8715244487056567, 'f1-score': 0.8620199146514935, 'support': 1043.0} | {'precision': 0.9616262064931267, 'recall': 0.947492795389049, 'f1-score': 0.9545071853679779, 'support': 17350.0} | {'precision': 0.9036794248255445, 'recall': 0.9264036418816388, 'f1-score': 0.9149004495825305, 'support': 9226.0} | 0.9376 | {'precision': 0.906008693866699, 'recall': 0.9151402953254482, 'f1-score': 0.9104758498673339, 'support': 27619.0} | {'precision': 0.9381566488916958, 'recall': 0.9375792027227633, 'f1-score': 0.9377840611522629, 'support': 27619.0} |
| No log | 4.0 | 164 | 0.2248 | {'precision': 0.8219800181653043, 'recall': 0.8676893576222435, 'f1-score': 0.8442164179104478, 'support': 1043.0} | {'precision': 0.9191395059726502, 'recall': 0.9801152737752161, 'f1-score': 0.9486485732615547, 'support': 17350.0} | {'precision': 0.9589622053137083, 'recall': 0.8332972035551701, 'f1-score': 0.8917241779272748, 'support': 9226.0} | 0.9268 | {'precision': 0.9000272431505542, 'recall': 0.8937006116508766, 'f1-score': 0.8948630563664257, 'support': 27619.0} | {'precision': 0.9287729785218931, 'recall': 0.9268257359064412, 'f1-score': 0.9256894795439954, 'support': 27619.0} |
| No log | 5.0 | 205 | 0.1931 | {'precision': 0.848987108655617, 'recall': 0.8839884947267498, 'f1-score': 0.8661343353687178, 'support': 1043.0} | {'precision': 0.9373124374791597, 'recall': 0.9721037463976945, 'f1-score': 0.9543911272068809, 'support': 17350.0} | {'precision': 0.9444899871179295, 'recall': 0.8741599826577064, 'f1-score': 0.9079650999155643, 'support': 9226.0} | 0.9361 | {'precision': 0.910263177750902, 'recall': 0.9100840745940503, 'f1-score': 0.909496854163721, 'support': 27619.0} | {'precision': 0.9363745597502171, 'recall': 0.9360585104457076, 'f1-score': 0.935549809212859, 'support': 27619.0} |
| No log | 6.0 | 246 | 0.1742 | {'precision': 0.8382222222222222, 'recall': 0.9041227229146692, 'f1-score': 0.8699261992619925, 'support': 1043.0} | {'precision': 0.9481431159420289, 'recall': 0.9653025936599423, 'f1-score': 0.956645913063346, 'support': 17350.0} | {'precision': 0.9353340883352208, 'recall': 0.8951875135486668, 'f1-score': 0.9148205582631811, 'support': 9226.0} | 0.9396 | {'precision': 0.9072331421664908, 'recall': 0.9215376100410927, 'f1-score': 0.9137975568628399, 'support': 27619.0} | {'precision': 0.9397132821011885, 'recall': 0.9395705854665267, 'f1-score': 0.9393994745651696, 'support': 27619.0} |
| No log | 7.0 | 287 | 0.1985 | {'precision': 0.8421052631578947, 'recall': 0.8897411313518696, 'f1-score': 0.8652680652680652, 'support': 1043.0} | {'precision': 0.9399821009061416, 'recall': 0.9685878962536023, 'f1-score': 0.9540706256386964, 'support': 17350.0} | {'precision': 0.9385345526102559, 'recall': 0.8788207240407544, 'f1-score': 0.9076966134900645, 'support': 9226.0} | 0.9356 | {'precision': 0.9068739722247642, 'recall': 0.9123832505487423, 'f1-score': 0.9090117681322755, 'support': 27619.0} | {'precision': 0.935802347028403, 'recall': 0.9356240269379775, 'f1-score': 0.9352260727385245, 'support': 27619.0} |
| No log | 8.0 | 328 | 0.1974 | {'precision': 0.8404351767905711, 'recall': 0.8887823585810163, 'f1-score': 0.863932898415657, 'support': 1043.0} | {'precision': 0.9420745397395599, 'recall': 0.9673775216138328, 'f1-score': 0.954558380253654, 'support': 17350.0} | {'precision': 0.9364367816091954, 'recall': 0.8830479080858443, 'f1-score': 0.9089590538882071, 'support': 9226.0} | 0.9362 | {'precision': 0.9063154993797754, 'recall': 0.9130692627602311, 'f1-score': 0.9091501108525061, 'support': 27619.0} | {'precision': 0.9363529780585962, 'recall': 0.9362395452405953, 'f1-score': 0.9359037670307043, 'support': 27619.0} |
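
The per-class cells in the table are Python dict literals, so they can be read back with `ast.literal_eval`. The sketch below does that for one cell and picks the best epoch from the validation loss and macro-average F1 columns. The `history` tuples are copied from the table with F1 rounded to four decimals, and the variable names are illustrative.

```python
import ast

# (epoch, validation loss, accuracy, macro-avg F1) copied from the table above.
history = [
    (1, 0.2955, 0.8976, 0.7927),
    (2, 0.2031, 0.9304, 0.8946),
    (3, 0.1754, 0.9376, 0.9105),
    (4, 0.2248, 0.9268, 0.8949),
    (5, 0.1931, 0.9361, 0.9095),
    (6, 0.1742, 0.9396, 0.9138),
    (7, 0.1985, 0.9356, 0.9090),
    (8, 0.1974, 0.9362, 0.9092),
]

best_by_loss = min(history, key=lambda row: row[1])  # lowest validation loss
best_by_f1 = max(history, key=lambda row: row[3])    # highest macro-avg F1

print("best epoch by validation loss:", best_by_loss[0])
print("best epoch by macro-avg F1:   ", best_by_f1[0])

# A per-class cell (column B, epoch 6) parsed from its dict-literal form.
cell = ("{'precision': 0.8382222222222222, 'recall': 0.9041227229146692, "
        "'f1-score': 0.8699261992619925, 'support': 1043.0}")
b_epoch6 = ast.literal_eval(cell)
print("B recall at epoch 6:", round(b_epoch6["recall"], 4))
```

Epoch 6 is best on validation loss, accuracy, and macro-average F1, while the headline results above are from epoch 8. The card does not state whether the best checkpoint or the final one was kept.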
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2