
Knowledge Continuity Regularized Network

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
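The trainer settings above imply an effective batch size larger than the per-device value. A minimal sketch of that arithmetic, assuming single-device training (the card does not state the device count):

```python
# Effective batch size implied by the trainer hyperparameters above:
# per-device batch size x gradient accumulation steps x number of devices.
per_device_batch_size = 8
gradient_accumulation_steps = 2
num_devices = 1  # assumption: single-device training

effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 16
```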

Regularization Hyperparameters:

  • numerical stability denominator constant = 0.01
  • lambda = 0.1
  • alpha = 2.0
  • beta = 1.0
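The card does not give the exact form of the knowledge continuity regularizer, so the following is only a hypothetical sketch of how these hyperparameters might combine: lambda weights the regularization term against the task loss, alpha and beta scale a penalty and its normalizer, and the constant 0.01 stabilizes the denominator. The function name and the `continuity_penalty`/`norm` inputs are illustrative placeholders, not the model's actual API:

```python
# Hypothetical sketch of combining the regularization hyperparameters above.
# The true form of the knowledge continuity regularizer is not specified in
# this card; only the constants below are taken from it.
EPS = 0.01       # numerical stability denominator constant
LAMBDA = 0.1     # weight of the regularization term
ALPHA = 2.0
BETA = 1.0

def total_loss(task_loss: float, continuity_penalty: float, norm: float) -> float:
    """Add a weighted, normalized penalty to the task loss.

    `continuity_penalty` and `norm` stand in for quantities the regularizer
    would compute from model activations; EPS keeps the division stable
    when `norm` is near zero.
    """
    reg = (ALPHA * continuity_penalty) / (BETA * norm + EPS)
    return task_loss + LAMBDA * reg
```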

Extended Logs:

eval_loss eval_accuracy epoch
6.444 0.911 1.0
6.355 0.918 2.0
6.640 0.899 3.0
6.167 0.929 4.0
6.211 0.924 5.0
6.171 0.929 6.0
6.116 0.934 7.0
6.285 0.925 8.0
6.154 0.929 9.0
6.155 0.929 10.0
6.086 0.933 11.0
6.109 0.933 12.0
6.128 0.934 13.0
6.141 0.931 14.0
6.147 0.931 15.0
6.379 0.919 16.0
6.105 0.933 17.0
6.063 0.935 18.0
6.174 0.929 19.0
6.115 0.932 20.0
7.263 0.866 21.0
6.026 0.938 22.0
6.138 0.931 23.0
6.139 0.932 24.0
6.059 0.935 25.0
6.099 0.934 26.0
6.068 0.935 27.0
6.088 0.934 28.0
6.081 0.934 29.0
6.083 0.935 30.0
6.073 0.936 31.0
6.107 0.935 32.0
6.052 0.936 33.0
6.065 0.936 34.0
6.116 0.931 35.0
6.128 0.934 36.0
6.030 0.937 37.0
6.163 0.932 38.0
6.000 0.940 39.0
6.064 0.938 40.0
6.056 0.936 41.0
6.071 0.935 42.0
6.012 0.939 43.0
6.027 0.940 44.0
6.017 0.939 45.0
5.976 0.941 46.0
5.982 0.940 47.0
5.987 0.941 48.0
5.991 0.941 49.0
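The logs above can be scanned programmatically to pick the best checkpoint. A small sketch over a few representative rows from the table (the full table shows the lowest eval_loss, 5.976, at epoch 46):

```python
# Select the best epoch (lowest eval_loss) from the extended logs.
# Only a few rows from the table above are reproduced here for brevity.
rows = [
    # (eval_loss, eval_accuracy, epoch)
    (6.444, 0.911, 1.0),
    (6.026, 0.938, 22.0),
    (5.976, 0.941, 46.0),
    (5.991, 0.941, 49.0),
]

best = min(rows, key=lambda r: r[0])
print(best)  # (5.976, 0.941, 46.0)
```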