Knowledge Continuity Regularized Network
Dataset: ANLI
Round: None
Trainer Hyperparameters:
lr = 5e-05
per_device_batch_size = 8
gradient_accumulation_steps = 2
weight_decay = 1e-09
seed = 42
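
The trainer settings above can be collected into a small config object, which also makes the effective batch size explicit. This is a minimal sketch, not the training script used for this model: the `TrainerConfig` class and its `effective_batch_size` method are hypothetical names introduced for illustration, and the device count is an assumption, since the card does not state how many devices were used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerConfig:
    """Hypothetical container for the hyperparameters listed in this card."""

    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-09
    seed: int = 42

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # Examples contributing to one optimizer step.
        # num_devices is an assumption; the card does not report it.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * num_devices
        )


config = TrainerConfig()
print(config.effective_batch_size())
```

With a batch size of 8 per device and 2 accumulation steps, each optimizer step sees 16 examples per device.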
Regularization Hyperparameters:
numerical stability denominator constant = 0.01
lambda = 0.01
alpha = 2.0
beta = 2.0
Extended Logs:
eval_loss | eval_accuracy | epoch |
---|---|---|
22.099 | 0.350 | 1.0 |
22.046 | 0.400 | 2.0 |
21.908 | 0.400 | 3.0 |
21.955 | 0.400 | 4.0 |
22.005 | 0.400 | 5.0 |
22.104 | 0.400 | 6.0 |
22.253 | 0.400 | 7.0 |
22.441 | 0.400 | 8.0 |
22.574 | 0.400 | 9.0 |
22.605 | 0.400 | 10.0 |
22.551 | 0.400 | 11.0 |
22.509 | 0.400 | 12.0 |
22.607 | 0.400 | 13.0 |
22.663 | 0.400 | 14.0 |
22.797 | 0.400 | 15.0 |
22.860 | 0.400 | 16.0 |
22.913 | 0.400 | 17.0 |
22.923 | 0.400 | 18.0 |
22.924 | 0.400 | 19.0 |
22.915 | 0.400 | 20.0 |
22.938 | 0.400 | 21.0 |
23.008 | 0.400 | 22.0 |
23.005 | 0.400 | 23.0 |
23.025 | 0.400 | 24.0 |
23.082 | 0.400 | 25.0 |
23.077 | 0.400 | 26.0 |
23.082 | 0.400 | 27.0 |
23.042 | 0.400 | 28.0 |
23.036 | 0.400 | 29.0 |
23.062 | 0.400 | 30.0 |
23.071 | 0.400 | 31.0 |
23.068 | 0.400 | 32.0 |
23.080 | 0.400 | 33.0 |
23.127 | 0.400 | 34.0 |
23.276 | 0.400 | 35.0 |
23.254 | 0.400 | 36.0 |
23.235 | 0.400 | 37.0 |
23.298 | 0.400 | 38.0 |
23.186 | 0.400 | 39.0 |
23.164 | 0.400 | 40.0 |
23.157 | 0.400 | 41.0 |
23.215 | 0.400 | 42.0 |
23.208 | 0.400 | 43.0 |
23.219 | 0.400 | 44.0 |
23.199 | 0.400 | 45.0 |
23.186 | 0.400 | 46.0 |
23.149 | 0.400 | 47.0 |
23.252 | 0.400 | 48.0 |
23.162 | 0.400 | 49.0 |
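
To pick a checkpoint from logs in this format, one option is to parse the rows and select the epoch with the lowest `eval_loss`. The sketch below is an illustration only: it embeds just the first six rows of the table, and the `parse_log` helper is a hypothetical name, not part of any library or of this model's code.

```python
# First six rows of the extended logs, copied from the table above.
LOG_TEXT = """
eval_loss | eval_accuracy | epoch |
---|---|---|
22.099 | 0.350 | 1.0 |
22.046 | 0.400 | 2.0 |
21.908 | 0.400 | 3.0 |
21.955 | 0.400 | 4.0 |
22.005 | 0.400 | 5.0 |
22.104 | 0.400 | 6.0 |
"""


def parse_log(text: str) -> list[dict[str, float]]:
    """Parse a pipe-delimited log table into a list of row dicts (hypothetical helper)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].split("|") if cell.strip()]
    rows = []
    for line in lines[2:]:  # skip the header and the ---|---|--- separator
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


rows = parse_log(LOG_TEXT)
best = min(rows, key=lambda row: row["eval_loss"])
print(best["epoch"], best["eval_loss"])
```

In these rows the lowest evaluation loss falls at epoch 3, and the full table shows loss staying above that value in every later epoch while accuracy holds at 0.400.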