
dit-tiny_tobacco3482_simkd_CEKD_t1_aNone

This model is a fine-tuned version of microsoft/dit-base on the Tobacco3482 dataset. It achieves the following results on the evaluation set (a sketch of the less common calibration metrics follows the list):

  • Loss: 0.9983
  • Accuracy: 0.18
  • Brier Loss: 0.8965
  • NLL: 6.7849
  • F1 Micro: 0.18
  • F1 Macro: 0.0305
  • ECE: 0.2195
  • AURC: 0.8182
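Brier loss, NLL, ECE, and AURC are calibration and uncertainty metrics rather than standard Trainer outputs. As a rough illustration of what two of them measure, the sketch below computes a multi-class Brier score and an equal-width-bin ECE from predicted class probabilities; the function names and the 15-bin default are illustrative assumptions, not the exact evaluation code behind this card.

```python
import numpy as np

def brier_score(probs: np.ndarray, labels: np.ndarray) -> float:
    """Multi-class Brier score: mean squared error between the predicted
    probability vector and the one-hot encoding of the true label."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def expected_calibration_error(probs: np.ndarray, labels: np.ndarray,
                               n_bins: int = 15) -> float:
    """ECE with equal-width confidence bins: the bin-weighted average gap
    between mean confidence and accuracy within each bin."""
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return float(ece)
```

Here `probs` would be the softmax of the model's logits over the evaluation set and `labels` the integer class ids.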

Model description

More information needed

Intended uses & limitations

More information needed
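Note that the reported evaluation accuracy (0.18, with F1 macro 0.0305) is close to what predicting a single frequent class would yield on Tobacco3482, so this checkpoint does not appear to have converged to a useful classifier. Until usage is documented, the checkpoint should load like any transformers image-classification model (DiT is a BEiT-style architecture). A minimal sketch, assuming a placeholder repository id:

```python
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder repo id; substitute the actual hub path of this checkpoint.
repo_id = "your-org/dit-tiny_tobacco3482_simkd_CEKD_t1_aNone"
processor = AutoImageProcessor.from_pretrained(repo_id)
model = AutoModelForImageClassification.from_pretrained(repo_id)

image = Image.open("document.png").convert("RGB")  # a scanned document page
inputs = processor(images=image, return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```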

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25
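These settings correspond to a standard transformers Trainer run. Below is a minimal sketch of the equivalent TrainingArguments; the output directory is a placeholder, and the distillation-specific loss suggested by the simkd/CEKD model name is not reproduced here.

```python
from transformers import TrainingArguments

# Minimal reconstruction of the reported hyperparameters. The effective
# total_train_batch_size = 4 (per device) * 16 (accumulation steps) = 64.
training_args = TrainingArguments(
    output_dir="dit-tiny_tobacco3482_simkd_CEKD_t1_aNone",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=25,
    # Adam betas/epsilon as reported in the card (the Trainer defaults).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```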

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL    | F1 Micro | F1 Macro | ECE    | AURC   |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:------:|:--------:|:--------:|:------:|:------:|
| No log        | 0.96  | 12   | 1.0062          | 0.18     | 0.8980     | 6.1518 | 0.18     | 0.0309   | 0.2213 | 0.7838 |
| No log        | 1.96  | 24   | 1.0034          | 0.18     | 0.8987     | 5.7795 | 0.18     | 0.0305   | 0.2273 | 0.8165 |
| No log        | 2.96  | 36   | 1.0025          | 0.18     | 0.8984     | 6.4819 | 0.18     | 0.0305   | 0.2249 | 0.8306 |
| No log        | 3.96  | 48   | 1.0018          | 0.18     | 0.8982     | 6.8521 | 0.18     | 0.0306   | 0.2205 | 0.8505 |
| No log        | 4.96  | 60   | 1.0015          | 0.16     | 0.8980     | 6.6853 | 0.16     | 0.0324   | 0.2089 | 0.8798 |
| No log        | 5.96  | 72   | 1.0011          | 0.175    | 0.8979     | 6.8349 | 0.175    | 0.0314   | 0.2134 | 0.8345 |
| No log        | 6.96  | 84   | 1.0008          | 0.18     | 0.8976     | 6.8293 | 0.18     | 0.0313   | 0.2249 | 0.8208 |
| No log        | 7.96  | 96   | 1.0005          | 0.18     | 0.8975     | 6.9400 | 0.18     | 0.0305   | 0.2230 | 0.8140 |
| No log        | 8.96  | 108  | 1.0003          | 0.18     | 0.8974     | 6.5877 | 0.18     | 0.0306   | 0.2230 | 0.8246 |
| No log        | 9.96  | 120  | 1.0000          | 0.18     | 0.8973     | 6.5454 | 0.18     | 0.0306   | 0.2188 | 0.8188 |
| No log        | 10.96 | 132  | 0.9998          | 0.18     | 0.8972     | 6.5555 | 0.18     | 0.0306   | 0.2274 | 0.8151 |
| No log        | 11.96 | 144  | 0.9996          | 0.18     | 0.8971     | 6.5819 | 0.18     | 0.0306   | 0.2254 | 0.8131 |
| No log        | 12.96 | 156  | 0.9994          | 0.18     | 0.8970     | 6.7150 | 0.18     | 0.0305   | 0.2255 | 0.8162 |
| No log        | 13.96 | 168  | 0.9993          | 0.18     | 0.8969     | 6.6542 | 0.18     | 0.0305   | 0.2213 | 0.8220 |
| No log        | 14.96 | 180  | 0.9991          | 0.18     | 0.8968     | 6.6025 | 0.18     | 0.0305   | 0.2213 | 0.8125 |
| No log        | 15.96 | 192  | 0.9990          | 0.18     | 0.8968     | 7.0424 | 0.18     | 0.0305   | 0.2301 | 0.8201 |
| No log        | 16.96 | 204  | 0.9988          | 0.18     | 0.8967     | 6.6676 | 0.18     | 0.0305   | 0.2258 | 0.8153 |
| No log        | 17.96 | 216  | 0.9987          | 0.18     | 0.8967     | 6.6621 | 0.18     | 0.0305   | 0.2270 | 0.8145 |
| No log        | 18.96 | 228  | 0.9986          | 0.18     | 0.8967     | 7.0058 | 0.18     | 0.0305   | 0.2259 | 0.8214 |
| No log        | 19.96 | 240  | 0.9985          | 0.18     | 0.8966     | 6.8777 | 0.18     | 0.0305   | 0.2194 | 0.8183 |
| No log        | 20.96 | 252  | 0.9984          | 0.18     | 0.8966     | 6.7612 | 0.18     | 0.0305   | 0.2282 | 0.8131 |
| No log        | 21.96 | 264  | 0.9984          | 0.18     | 0.8966     | 6.7811 | 0.18     | 0.0305   | 0.2282 | 0.8145 |
| No log        | 22.96 | 276  | 0.9983          | 0.18     | 0.8965     | 6.7044 | 0.18     | 0.0305   | 0.2239 | 0.8167 |
| No log        | 23.96 | 288  | 0.9983          | 0.18     | 0.8965     | 6.7813 | 0.18     | 0.0305   | 0.2217 | 0.8183 |
| No log        | 24.96 | 300  | 0.9983          | 0.18     | 0.8965     | 6.7849 | 0.18     | 0.0305   | 0.2195 | 0.8182 |

Framework versions

  • Transformers 4.26.1
  • PyTorch 1.13.1.post200
  • Datasets 2.9.0
  • Tokenizers 0.13.2