Knowledge Continuity Regularized Network

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
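
The trainer settings above can be captured in a small config object; a minimal sketch (the `TrainerConfig` class and its effective-batch-size helper are illustrative, not part of the released code):

```python
from dataclasses import dataclass

@dataclass
class TrainerConfig:
    # values taken from the hyperparameter list above
    lr: float = 5e-5
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-9
    seed: int = 42

    @property
    def effective_batch_size(self) -> int:
        # examples consumed per optimizer step (single device assumed):
        # per-device batch size times accumulation steps
        return self.per_device_batch_size * self.gradient_accumulation_steps

cfg = TrainerConfig()
print(cfg.effective_batch_size)  # 8 * 2 = 16
```

With gradient accumulation of 2, each optimizer step sees an effective batch of 16 examples even though only 8 fit on the device at once.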

Regularization Hyperparameters:

  • numerical stability denominator constant = 0.01
  • lambda = 0.01
  • alpha = 1.0
  • beta = 2.0
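
The card does not spell out the regularizer's functional form, so the following is a hedged sketch of how such hyperparameters are commonly combined: the stability constant guards a denominator, `alpha` and `beta` act as exponents, and `lambda` weights the penalty against the task loss. The `kc_penalty` function, its arguments, and the ratio form are assumptions for illustration only.

```python
def kc_penalty(delta_loss: float, delta_repr: float,
               alpha: float = 1.0, beta: float = 2.0, eps: float = 0.01) -> float:
    """Hypothetical continuity penalty: change in loss relative to change in
    representation, with eps (the stability constant) guarding the denominator."""
    return (abs(delta_loss) ** alpha) / (abs(delta_repr) ** beta + eps)

def total_loss(task_loss: float, penalty: float, lam: float = 0.01) -> float:
    # lambda-weighted penalty added to the task loss
    return task_loss + lam * penalty
```

Under this reading, `eps = 0.01` keeps the penalty finite when the representation barely changes, and `lambda = 0.01` keeps the penalty small relative to the task loss.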

Extended Logs:

| eval_loss | eval_accuracy | epoch |
|-----------|---------------|-------|
| 12.986 | 0.913 | 1.0 |
| 12.382 | 0.932 | 2.0 |
| 12.417 | 0.931 | 3.0 |
| 12.339 | 0.935 | 4.0 |
| 13.123 | 0.910 | 5.0 |
| 12.301 | 0.934 | 6.0 |
| 12.166 | 0.938 | 7.0 |
| 12.499 | 0.928 | 8.0 |
| 12.215 | 0.938 | 9.0 |
| 12.582 | 0.925 | 10.0 |
| 12.306 | 0.934 | 11.0 |
| 12.292 | 0.934 | 12.0 |
| 12.502 | 0.928 | 13.0 |
| 12.200 | 0.937 | 14.0 |
| 12.196 | 0.937 | 15.0 |
| 12.094 | 0.939 | 16.0 |
| 12.245 | 0.935 | 17.0 |
| 12.210 | 0.935 | 18.0 |
| 12.049 | 0.940 | 19.0 |
| 12.357 | 0.933 | 20.0 |
| 12.329 | 0.933 | 21.0 |
| 12.158 | 0.938 | 22.0 |
| 12.118 | 0.939 | 23.0 |
| 12.134 | 0.939 | 24.0 |
| 12.273 | 0.935 | 25.0 |
| 12.113 | 0.940 | 26.0 |
| 12.290 | 0.935 | 27.0 |
| 12.088 | 0.939 | 28.0 |
| 12.250 | 0.934 | 29.0 |
| 12.163 | 0.936 | 30.0 |
| 12.086 | 0.940 | 31.0 |
| 12.115 | 0.939 | 32.0 |
| 12.054 | 0.939 | 33.0 |
| 12.079 | 0.939 | 34.0 |
| 12.008 | 0.941 | 35.0 |
| 12.081 | 0.940 | 36.0 |
| 12.031 | 0.941 | 37.0 |
| 12.019 | 0.941 | 38.0 |
| 12.114 | 0.939 | 39.0 |
| 12.067 | 0.941 | 40.0 |
| 12.056 | 0.940 | 41.0 |
| 12.050 | 0.941 | 42.0 |
| 12.032 | 0.941 | 43.0 |
| 11.981 | 0.943 | 44.0 |
| 12.050 | 0.941 | 45.0 |
| 11.985 | 0.942 | 46.0 |
| 12.006 | 0.942 | 47.0 |
| 11.996 | 0.942 | 48.0 |
| 12.014 | 0.942 | 49.0 |
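
The best checkpoint by eval_accuracy is epoch 44 (0.943, with the lowest eval_loss of 11.981). A quick way to read that off programmatically from a log like the one above (only a few representative rows are inlined here):

```python
# (eval_loss, eval_accuracy, epoch) rows; a few representative entries
# from the table above -- substitute the full log to reproduce exactly
logs = [
    (12.986, 0.913, 1),
    (12.094, 0.939, 16),
    (11.981, 0.943, 44),
    (12.014, 0.942, 49),
]

# pick the epoch with the highest eval_accuracy, breaking ties by lower loss
best = max(logs, key=lambda row: (row[1], -row[0]))
print(f"best epoch: {best[2]} (accuracy {best[1]}, loss {best[0]})")
```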