---
language: en
license: mit
library_name: pytorch
---
# Knowledge Continuity Regularized Network

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 8
- `gradient_accumulation_steps` = 2
- `weight_decay` = 1e-09
- `seed` = 42
Regularization Hyperparameters:
- `numerical stability denominator constant` = 0.001
- `lambda` = 0.01
- `alpha` = 2.0
- `beta` = 2.0
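The hyperparameters above can be wired into a minimal PyTorch training sketch. This is purely illustrative: the model architecture, the data, and the exact form of the knowledge-continuity penalty are assumptions (the card does not specify them). The penalty below is one plausible shape, a ratio of output change to input change with the stability constant in the denominator and `alpha`/`beta` as exponents; the actual regularizer may differ.

```python
import torch

# Values taken from the hyperparameter lists above; everything else is assumed.
LR, WEIGHT_DECAY = 5e-5, 1e-9
BATCH, ACCUM, SEED = 8, 2, 42
LAM, ALPHA, BETA, EPS = 0.01, 2.0, 2.0, 1e-3

torch.manual_seed(SEED)

# Placeholder model and synthetic data; the real network is not specified here.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
task_loss = torch.nn.CrossEntropyLoss()

def continuity_penalty(x):
    """Hypothetical knowledge-continuity term: penalize large output changes
    relative to small input changes, with EPS stabilizing the denominator."""
    x2 = x + 0.01 * torch.randn_like(x)            # perturbed copy of the batch
    d_out = (model(x) - model(x2)).norm(dim=1) ** ALPHA
    d_in = (x - x2).norm(dim=1) ** BETA
    return (d_out / (d_in + EPS)).mean()

batches = [(torch.randn(BATCH, 16), torch.randint(0, 2, (BATCH,)))
           for _ in range(4)]

optimizer.zero_grad()
for step, (x, y) in enumerate(batches):
    loss = task_loss(model(x), y) + LAM * continuity_penalty(x)
    (loss / ACCUM).backward()                      # scale for gradient accumulation
    if (step + 1) % ACCUM == 0:
        optimizer.step()                           # effective batch = BATCH * ACCUM
        optimizer.zero_grad()
```

With `per_device_batch_size = 8` and `gradient_accumulation_steps = 2`, each optimizer step sees an effective batch of 16 examples.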
Extended Logs:

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|14.625|0.792|0.67|
|13.757|0.792|2.0|
|13.403|0.792|2.67|
|13.166|0.792|4.0|
|13.001|0.792|4.67|
|12.786|0.792|6.0|
|12.670|0.792|6.67|
|12.480|0.792|8.0|
|12.372|0.792|8.67|
|12.122|0.833|10.0|
|12.015|0.833|10.67|
|11.880|0.833|12.0|
|11.840|0.833|12.67|