Knowledge Continuity Regularized Network

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
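
Under a standard gradient-accumulation setup, these settings imply an effective per-device batch size of per_device_batch_size × gradient_accumulation_steps. A minimal sketch (variable names are illustrative, not taken from the training code):

```python
# Trainer hyperparameters as listed on this card.
lr = 5e-05
per_device_batch_size = 8
gradient_accumulation_steps = 2
weight_decay = 1e-09
seed = 42

# Gradients are accumulated over 2 micro-batches of 8 before each
# optimizer step, so each update sees 16 examples per device.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 16
```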

Regularization Hyperparameters:

  • numerical stability denominator constant = 0.01
  • lambda = 0.001
  • alpha = 2.0
  • beta = 1.0
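
The card does not specify how these constants enter the objective. As a purely hypothetical sketch (not the model's actual training code), a continuity penalty is often folded into the task loss as a lambda-weighted ratio, with alpha and beta shaping the numerator and denominator and the numerical stability constant guarding the division:

```python
def continuity_regularized_loss(task_loss, drift, ref,
                                lam=0.001, alpha=2.0, beta=1.0, eps=0.01):
    """Hypothetical composite objective; only the constants come from the card.

    task_loss : base loss value (e.g. cross-entropy)
    drift     : change in outputs relative to a reference model (illustrative)
    ref       : reference magnitude normalizing the drift (illustrative)
    eps       : numerical stability denominator constant
    """
    penalty = (drift ** alpha) / (ref ** beta + eps)
    return task_loss + lam * penalty

# With zero drift the penalty vanishes and the task loss is unchanged.
print(continuity_regularized_loss(1.0, drift=0.0, ref=1.0))  # 1.0
```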

Extended Logs:

eval_loss  eval_accuracy  epoch
   12.460          0.921    1.0
   12.355          0.924    2.0
   12.275          0.928    3.0
   12.118          0.933    4.0
   12.028          0.936    5.0
   11.984          0.938    6.0
   12.000          0.937    7.0
   11.973          0.938    8.0
   11.883          0.941    9.0
   12.051          0.935   10.0
   11.958          0.939   11.0
   12.281          0.928   12.0
   12.284          0.929   13.0
   11.990          0.938   14.0
   12.207          0.931   15.0
   11.940          0.940   16.0
   12.162          0.932   17.0
   11.981          0.938   18.0
   11.941          0.940   19.0
   11.961          0.939   20.0
   11.979          0.938   21.0
   11.854          0.943   22.0
   11.867          0.942   23.0
   11.889          0.941   24.0
   11.922          0.940   25.0
   11.985          0.939   26.0
   11.880          0.941   27.0
   11.893          0.941   28.0
   11.974          0.939   29.0
   11.792          0.944   30.0
   12.016          0.937   31.0
   11.867          0.942   32.0
   11.879          0.942   33.0
   11.830          0.943   34.0
   11.905          0.940   35.0
   11.799          0.944   36.0
   11.894          0.941   37.0
   11.853          0.942   38.0
   11.800          0.944   39.0
   11.784          0.944   40.0
   11.774          0.945   41.0
   11.746          0.946   42.0
   11.748          0.946   43.0
   11.770          0.945   44.0
   11.788          0.944   45.0
   11.777          0.945   46.0
   11.724          0.947   47.0
   11.744          0.946   48.0
   11.743          0.946   49.0
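
For a quick read of these logs, the lowest eval_loss (11.724, with eval_accuracy 0.947) occurs at epoch 47. A small sketch that recovers this directly from the raw table:

```python
# Eval log rows from the table above: "eval_loss eval_accuracy epoch".
log = """\
12.460 0.921 1.0
12.355 0.924 2.0
12.275 0.928 3.0
12.118 0.933 4.0
12.028 0.936 5.0
11.984 0.938 6.0
12.000 0.937 7.0
11.973 0.938 8.0
11.883 0.941 9.0
12.051 0.935 10.0
11.958 0.939 11.0
12.281 0.928 12.0
12.284 0.929 13.0
11.990 0.938 14.0
12.207 0.931 15.0
11.940 0.940 16.0
12.162 0.932 17.0
11.981 0.938 18.0
11.941 0.940 19.0
11.961 0.939 20.0
11.979 0.938 21.0
11.854 0.943 22.0
11.867 0.942 23.0
11.889 0.941 24.0
11.922 0.940 25.0
11.985 0.939 26.0
11.880 0.941 27.0
11.893 0.941 28.0
11.974 0.939 29.0
11.792 0.944 30.0
12.016 0.937 31.0
11.867 0.942 32.0
11.879 0.942 33.0
11.830 0.943 34.0
11.905 0.940 35.0
11.799 0.944 36.0
11.894 0.941 37.0
11.853 0.942 38.0
11.800 0.944 39.0
11.784 0.944 40.0
11.774 0.945 41.0
11.746 0.946 42.0
11.748 0.946 43.0
11.770 0.945 44.0
11.788 0.944 45.0
11.777 0.945 46.0
11.724 0.947 47.0
11.744 0.946 48.0
11.743 0.946 49.0"""

# Parse each line into (eval_loss, eval_accuracy, epoch) tuples,
# then pick the row with the lowest eval_loss.
rows = [tuple(map(float, line.split())) for line in log.splitlines()]
best = min(rows, key=lambda r: r[0])
print(best)  # (11.724, 0.947, 47.0)
```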