---
language: en
license: mit
library_name: pytorch
---
# Knowledge Continuity Regularized Network
Trainer Hyperparameters:

- lr = 1e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 1e-09
- seed = 42
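For reproduction, the trainer settings above can be gathered into a single config. The variable names below are illustrative (the card ships no code); the one derived quantity, the effective batch size per optimizer step, follows from the per-device batch size and gradient accumulation steps:

```python
# Trainer hyperparameters from the card, collected into a plain dict.
# Names are illustrative; the card does not specify a code interface.
trainer_config = {
    "lr": 1e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-09,
    "seed": 42,
}

# Effective batch size seen by each optimizer step:
# per-device batch size * gradient accumulation steps.
effective_batch_size = (trainer_config["per_device_batch_size"]
                        * trainer_config["gradient_accumulation_steps"])
print(effective_batch_size)  # 16
```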
Regularization Hyperparameters:

- numerical stability denominator constant = 0.01
- lambda = 0.001
- alpha = 2.0
- beta = 2.0
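The card lists the regularization hyperparameters but not the regularizer itself. As a hedged sketch only, assuming a Lipschitz-style continuity ratio (representation change over input change, with the stability constant guarding the denominator and lambda scaling the penalty), it might look like:

```python
# Hedged sketch, NOT the authors' implementation: the regularizer is not
# defined in this card, so the functional form below is an assumption.
EPS = 0.01     # numerical stability denominator constant (from the card)
LAM = 0.001    # lambda: regularization strength (from the card)
ALPHA = 2.0    # alpha: assumed norm order for the representation difference
BETA = 2.0     # beta: assumed norm order for the input difference

def p_norm(v, p):
    """p-norm of a flat vector."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def continuity_penalty(h1, h2, x1, x2):
    """Assumed continuity penalty: representation change relative to
    input change, with EPS keeping the denominator away from zero
    when the two inputs nearly coincide."""
    num = p_norm([a - b for a, b in zip(h2, h1)], ALPHA)
    den = p_norm([a - b for a, b in zip(x2, x1)], BETA) + EPS
    return LAM * num / den
```

Under this reading, alpha and beta select the norms for the numerator and denominator, and the penalty vanishes when nearby inputs yield identical representations.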
Extended Logs:
eval_loss | eval_accuracy | epoch |
---|---|---|
8.842 | 0.208 | 1.0 |
8.834 | 0.208 | 2.0 |
7.990 | 0.292 | 3.0 |
4.612 | 0.750 | 4.0 |
8.841 | 0.208 | 5.0 |
8.842 | 0.208 | 6.0 |
8.843 | 0.208 | 7.0 |
8.842 | 0.208 | 8.0 |
8.845 | 0.208 | 9.0 |
8.842 | 0.208 | 10.0 |
8.843 | 0.208 | 11.0 |
8.843 | 0.208 | 12.0 |
8.843 | 0.208 | 13.0 |
8.841 | 0.208 | 14.0 |
8.843 | 0.208 | 15.0 |
8.843 | 0.208 | 16.0 |
8.843 | 0.208 | 17.0 |
8.845 | 0.208 | 18.0 |
8.843 | 0.208 | 19.0 |