asun17904/anliR3-t5-base-kd

Knowledge Continuity Regularized Network

Dataset: ANLI
Round: None

Trainer Hyperparameters:

lr = 5e-05
per_device_batch_size = 8
gradient_accumulation_steps = 2
weight_decay = 0.0
seed = 42
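
As a rough illustration of how the batch-related values above combine, the sketch below computes the number of examples that contribute to one optimizer step. The card does not state how many devices were used, so `num_devices` is a hypothetical parameter that defaults to 1, and `effective_batch_size` is an illustrative helper rather than part of any library.

```python
# Trainer hyperparameters copied from this card.
trainer_hparams = {
    "lr": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 0.0,
    "seed": 42,
}


def effective_batch_size(hparams, num_devices=1):
    """Examples per optimizer step. num_devices is an assumption, not from the card."""
    return (
        hparams["per_device_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(trainer_hparams))  # 16 on a single device
```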

Regularization Hyperparameters:

numerical stability denominator constant = 0.01
lambda = 0.0001
alpha = 2.0
beta = 2.0

Extended Logs:

eval_loss	eval_accuracy	epoch
35.403	0.419	1.0
35.321	0.427	2.0
35.356	0.426	3.0
35.000	0.443	4.0
34.783	0.447	5.0
34.693	0.453	6.0
34.950	0.443	7.0
35.001	0.443	8.0
34.699	0.453	9.0
35.112	0.442	10.0
34.913	0.448	11.0
34.830	0.452	12.0
35.178	0.437	13.0
35.007	0.443	14.0
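
The log above can be scanned for its best epochs with a few lines of Python. This is only a sketch for reading the table: the values are copied from the log, the variable names are illustrative, and ties are resolved in favor of the earliest epoch.

```python
# Evaluation log copied from this card (whitespace-separated columns).
LOGS = """eval_loss eval_accuracy epoch
35.403 0.419 1.0
35.321 0.427 2.0
35.356 0.426 3.0
35.000 0.443 4.0
34.783 0.447 5.0
34.693 0.453 6.0
34.950 0.443 7.0
35.001 0.443 8.0
34.699 0.453 9.0
35.112 0.442 10.0
34.913 0.448 11.0
34.830 0.452 12.0
35.178 0.437 13.0
35.007 0.443 14.0
"""

lines = LOGS.strip().splitlines()
header = lines[0].split()
# One dict per epoch, with every field parsed as a float.
rows = [dict(zip(header, map(float, line.split()))) for line in lines[1:]]

# max/min return the first match, so ties go to the earliest epoch.
best_by_accuracy = max(rows, key=lambda r: r["eval_accuracy"])
best_by_loss = min(rows, key=lambda r: r["eval_loss"])

print(best_by_accuracy)  # epoch 6.0, eval_accuracy 0.453
print(best_by_loss)      # epoch 6.0, eval_loss 34.693
```

Epoch 9 matches the top eval_accuracy of 0.453 but has a slightly higher eval_loss (34.699 versus 34.693), so both criteria select epoch 6.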

Test Accuracy: 0.443
