metadata
language: en
license: mit
library_name: pytorch
Knowledge Continuity Regularized Network
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 1e-09
- seed = 42
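
The trainer settings above can be gathered into one config object, which also makes the effective batch size explicit. This is a minimal sketch: the class name `TrainerConfig`, the method `effective_batch_size`, and the `num_devices` parameter are hypothetical and do not come from this card. Only the five values are taken from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerConfig:  # hypothetical container, not part of the released code
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-09
    seed: int = 42

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # Samples contributing to one optimizer step.
        # num_devices is an assumption; the card does not state the device count.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * num_devices
        )


config = TrainerConfig()
print(config.effective_batch_size())  # 16 on a single device
```

With a per-device batch of 8 and 2 accumulation steps, each optimizer step sees 16 samples per device.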
Regularization Hyperparameters:
- numerical stability denominator constant = 0.01
- lambda = 0.1
- alpha = 2.0
- beta = 1.0
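
This card does not state the regularizer's formula, so the following is only an illustrative sketch of how a denominator constant and a weight `lambda` are commonly combined. It assumes the penalty is a pairwise ratio of loss differences to hidden-representation distances. The function names `volatility_penalty` and `regularized_loss` are hypothetical, and the roles of `alpha` and `beta` are not described here, so they are left out.

```python
import math
from itertools import combinations

EPS = 0.01    # numerical stability denominator constant (value from the card)
LAMBDA = 0.1  # regularization weight (value from the card)


def volatility_penalty(losses, hiddens, eps=EPS):
    """Assumed form: mean over pairs of |l_i - l_j| / (||h_i - h_j|| + eps)."""
    ratios = []
    for i, j in combinations(range(len(losses)), 2):
        dist = math.dist(hiddens[i], hiddens[j])  # Euclidean distance
        # eps keeps the ratio finite when two representations coincide
        ratios.append(abs(losses[i] - losses[j]) / (dist + eps))
    return sum(ratios) / len(ratios) if ratios else 0.0


def regularized_loss(losses, hiddens, lam=LAMBDA, eps=EPS):
    """Mean task loss plus the weighted penalty (assumed combination)."""
    task_loss = sum(losses) / len(losses)
    return task_loss + lam * volatility_penalty(losses, hiddens, eps)


# Toy batch: two per-example losses and two 2-d hidden vectors
print(regularized_loss([1.0, 3.0], [[0.0, 0.0], [3.0, 4.0]]))
```

In this toy call the two hidden vectors are 5 apart, so the single pairwise ratio is 2 / 5.01 and the penalty adds roughly 0.04 to the mean loss of 2.0.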
Extended Logs:
eval_loss | eval_accuracy | epoch |
---|---|---|
6.444 | 0.911 | 1.0 |
6.355 | 0.918 | 2.0 |
6.640 | 0.899 | 3.0 |
6.167 | 0.929 | 4.0 |
6.211 | 0.924 | 5.0 |
6.171 | 0.929 | 6.0 |
6.116 | 0.934 | 7.0 |
6.285 | 0.925 | 8.0 |
6.154 | 0.929 | 9.0 |
6.155 | 0.929 | 10.0 |
6.086 | 0.933 | 11.0 |
6.109 | 0.933 | 12.0 |
6.128 | 0.934 | 13.0 |
6.141 | 0.931 | 14.0 |
6.147 | 0.931 | 15.0 |
6.379 | 0.919 | 16.0 |
6.105 | 0.933 | 17.0 |
6.063 | 0.935 | 18.0 |
6.174 | 0.929 | 19.0 |
6.115 | 0.932 | 20.0 |
7.263 | 0.866 | 21.0 |
6.026 | 0.938 | 22.0 |
6.138 | 0.931 | 23.0 |
6.139 | 0.932 | 24.0 |
6.059 | 0.935 | 25.0 |
6.099 | 0.934 | 26.0 |
6.068 | 0.935 | 27.0 |
6.088 | 0.934 | 28.0 |
6.081 | 0.934 | 29.0 |
6.083 | 0.935 | 30.0 |
6.073 | 0.936 | 31.0 |
6.107 | 0.935 | 32.0 |
6.052 | 0.936 | 33.0 |
6.065 | 0.936 | 34.0 |
6.116 | 0.931 | 35.0 |
6.128 | 0.934 | 36.0 |
6.030 | 0.937 | 37.0 |
6.163 | 0.932 | 38.0 |
6.000 | 0.940 | 39.0 |
6.064 | 0.938 | 40.0 |
6.056 | 0.936 | 41.0 |
6.071 | 0.935 | 42.0 |
6.012 | 0.939 | 43.0 |
6.027 | 0.940 | 44.0 |
6.017 | 0.939 | 45.0 |
5.976 | 0.941 | 46.0 |
5.982 | 0.940 | 47.0 |
5.987 | 0.941 | 48.0 |
5.991 | 0.941 | 49.0 |
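
To pick a checkpoint from a log like the one above, the table can be parsed and scanned for the lowest `eval_loss`. The sketch below runs on a five-row excerpt copied from the table. The helper name `parse_log_table` is hypothetical, and it assumes every row follows the same `value | value | value |` layout.

```python
# Excerpt of the evaluation log above (5 of the 49 rows)
LOG_EXCERPT = """\
eval_loss | eval_accuracy | epoch |
---|---|---|
6.444 | 0.911 | 1.0 |
7.263 | 0.866 | 21.0 |
6.000 | 0.940 | 39.0 |
5.976 | 0.941 | 46.0 |
5.991 | 0.941 | 49.0 |
"""


def parse_log_table(text):
    """Parse a pipe-delimited table into a list of {column: float} dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the ---|---| separator
        cells = [c.strip() for c in ln.strip("|").split("|")]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


rows = parse_log_table(LOG_EXCERPT)
best = min(rows, key=lambda r: r["eval_loss"])
worst = max(rows, key=lambda r: r["eval_loss"])
print(best["epoch"], worst["epoch"])  # 46.0 21.0
```

Across the full log, epoch 46 has the lowest `eval_loss` (5.976) and ties for the highest `eval_accuracy` (0.941). Epoch 21 is a one-off spike (7.263, accuracy 0.866).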