---
language: en
license: mit
library_name: pytorch
---
# Knowledge Continuity Regularized Network
## Trainer Hyperparameters

- lr = 5e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 1e-09
- seed = 42
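The trainer hyperparameters above can be collected into a small config for reproducibility. This is a minimal sketch, not the training script used for this model; the `TrainerConfig` name and the derived effective-batch-size property are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TrainerConfig:
    # values taken directly from the hyperparameter list above
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-09
    seed: int = 42

    @property
    def effective_batch_size(self) -> int:
        # gradients accumulate over 2 steps of 8 examples each
        return self.per_device_batch_size * self.gradient_accumulation_steps


cfg = TrainerConfig()
print(cfg.effective_batch_size)  # → 16
```

With gradient accumulation, the optimizer sees an effective batch of 16 examples per update even though only 8 fit on a device at once.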
## Regularization Hyperparameters

- numerical stability denominator constant = 0.001
- lambda = 0.01
- alpha = 2.0
- beta = 2.0
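One way these hyperparameters could combine is sketched below. The functional form is a hypothetical assumption for illustration only (the card does not state the objective): a continuity-style penalty that compares a change in loss to a change in representation, with `alpha` and `beta` as exponents, the 0.001 constant stabilizing the denominator, and `lambda` weighting the penalty against the task loss.

```python
def continuity_penalty(delta_loss: float, delta_repr: float,
                       eps: float = 0.001, alpha: float = 2.0,
                       beta: float = 2.0) -> float:
    """Hypothetical penalty: large loss changes relative to small
    representation changes are penalized; eps avoids division by zero."""
    return (abs(delta_loss) ** alpha) / (abs(delta_repr) ** beta + eps)


def total_loss(task_loss: float, delta_loss: float, delta_repr: float,
               lam: float = 0.01) -> float:
    # task loss plus the lambda-weighted regularization term
    return task_loss + lam * continuity_penalty(delta_loss, delta_repr)
```

If the loss does not change between neighboring inputs (`delta_loss = 0`), the penalty vanishes and the objective reduces to the task loss alone.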
## Extended Logs

| eval_loss | eval_accuracy | epoch |
|---|---|---|
| 14.625 | 0.792 | 0.67 |
| 13.757 | 0.792 | 2.0 |
| 13.403 | 0.792 | 2.67 |
| 13.166 | 0.792 | 4.0 |
| 13.001 | 0.792 | 4.67 |
| 12.786 | 0.792 | 6.0 |
| 12.670 | 0.792 | 6.67 |
| 12.480 | 0.792 | 8.0 |
| 12.372 | 0.792 | 8.67 |
| 12.122 | 0.833 | 10.0 |
| 12.015 | 0.833 | 10.67 |
| 11.880 | 0.833 | 12.0 |
| 11.840 | 0.833 | 12.67 |