---
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: mamba_text_classification
  results: []
---

# mamba_text_classification

This model was trained from scratch on an unspecified dataset (the auto-generated card lists it as `None`). It achieves the following results on the evaluation set. Note that the evaluation set contains only 11 examples, so these figures carry high variance:

- Loss: 0.2292
- Accuracy: 0.9091

Per-class metrics:

| Class        | Precision | Recall | F1-score | Support |
|--------------|-----------|--------|----------|---------|
| 1            | 1.0000    | 1.0000 | 1.0000   | 1       |
| 4            | 0.6667    | 1.0000 | 0.8000   | 2       |
| 5            | 0.0000    | 0.0000 | 0.0000   | 1       |
| 6            | 1.0000    | 1.0000 | 1.0000   | 3       |
| 9            | 1.0000    | 1.0000 | 1.0000   | 2       |
| 10           | 1.0000    | 1.0000 | 1.0000   | 2       |
| Macro avg    | 0.7778    | 0.8333 | 0.8000   | 11      |
| Weighted avg | 0.8485    | 0.9091 | 0.8727   | 11      |
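For reference, the macro average is the unweighted mean of the per-class scores, while the weighted average weights each class by its support. A minimal pure-Python sketch of these computations, using hypothetical labels chosen only to match the reported supports (the actual evaluation labels are not published):

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Compute precision/recall/F1 per class, plus macro and weighted averages."""
    classes = sorted(set(y_true))
    support = Counter(y_true)           # true examples per class
    predicted = Counter(y_pred)         # predicted examples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    stats = {}
    for c in classes:
        p = correct[c] / predicted[c] if predicted[c] else 0.0
        r = correct[c] / support[c]
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        stats[c] = (p, r, f1)

    n = len(y_true)
    # Macro: plain mean over classes; weighted: mean weighted by support.
    macro = tuple(sum(s[i] for s in stats.values()) / len(classes) for i in range(3))
    weighted = tuple(sum(stats[c][i] * support[c] for c in classes) / n for i in range(3))
    accuracy = sum(correct.values()) / n
    return stats, macro, weighted, accuracy

# Hypothetical labels matching the reported supports (classes 1,4,5,6,9,10),
# with the single class-5 example misclassified as class 4.
y_true = [1, 4, 4, 5, 6, 6, 6, 9, 9, 10, 10]
y_pred = [1, 4, 4, 4, 6, 6, 6, 9, 9, 10, 10]
stats, macro, weighted, accuracy = per_class_metrics(y_true, y_pred)
```

This reproduces the card's numbers: accuracy 10/11 ≈ 0.9091, macro F1 0.8000, and weighted precision ≈ 0.8485, illustrating why a single misclassified minority-class example drags the macro average well below the weighted one.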

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 4
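With `lr_scheduler_type: cosine` and a warmup ratio of 0.01, the learning rate ramps linearly to 5e-05 over the first 1% of steps and then decays along a half-cosine to zero. A sketch of that schedule shape (mirroring `transformers.get_cosine_schedule_with_warmup`; the total step count of ~4590 is an assumption inferred from the eval log, where step 459 corresponds to epoch 0.4):

```python
import math

# Assumed, not logged: 459 steps per 0.4 epoch over 4 epochs -> ~4590 total steps.
TOTAL_STEPS = 4590
BASE_LR = 5e-5
WARMUP_RATIO = 0.01

def cosine_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given step: linear warmup, then half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup window.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr at end of warmup down to 0 at the final step.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With only 1% warmup, the schedule spends nearly the whole run in the decay phase, which is consistent with the near-zero training losses logged from epoch 2 onward.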

### Training results

Per-class cells show precision/recall/F1 (support); the class 0 column appeared only in the first evaluation, where it had zero support.

| Training Loss | Epoch | Step | Validation Loss | 0 | 1 | 4 | 5 | 6 | 9 | 10 | Accuracy | Macro avg | Weighted avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.0038 | 0.4 | 459 | 0.7923 | 0.0000/0.0000/0.0000 (0) | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/0.6667/0.8000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.8182 | 0.6667/0.6667/0.6571 | 0.8485/0.8182/0.8182 |
| 1.0341 | 0.8 | 918 | 0.0965 | – | 1.0000/1.0000/1.0000 (1) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 1.0 | 1.0000/1.0000/1.0000 | 1.0000/1.0000/1.0000 |
| 0.0006 | 1.2 | 1377 | 0.1084 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |
| 0.1193 | 1.6 | 1836 | 0.7853 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |
| 0.007 | 2.0 | 2295 | 0.0076 | – | 1.0000/1.0000/1.0000 (1) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 1.0 | 1.0000/1.0000/1.0000 | 1.0000/1.0000/1.0000 |
| 0.0001 | 2.4 | 2754 | 0.3204 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |
| 0.0001 | 2.8 | 3213 | 0.0948 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |
| 0.0001 | 3.2 | 3672 | 0.1412 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |
| 0.0 | 3.6 | 4131 | 0.2292 | – | 1.0000/1.0000/1.0000 (1) | 0.6667/1.0000/0.8000 (2) | 0.0000/0.0000/0.0000 (1) | 1.0000/1.0000/1.0000 (3) | 1.0000/1.0000/1.0000 (2) | 1.0000/1.0000/1.0000 (2) | 0.9091 | 0.7778/0.8333/0.8000 | 0.8485/0.9091/0.8727 |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.15.2