kennethge123/sst5-bert-base-uncased-kd

Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:

lr = 5e-05
per_device_batch_size = 1
gradient_accumulation_steps = 4
weight_decay = 1e-09
seed = 42
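
As a rough illustration of how these settings combine, the sketch below collects them in a plain dictionary and computes the number of examples that contribute to each optimizer step. The key names follow the list above; the `effective_batch_size` helper and its `num_devices` parameter are hypothetical, since the card does not state how many devices were used.

```python
# Hyperparameters copied from the list above; key names follow the card.
hyperparams = {
    "lr": 5e-05,
    "per_device_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(params, num_devices=1):
    """Examples contributing to one optimizer step.

    num_devices is an assumption: the card does not report the device count.
    """
    return (
        params["per_device_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(hyperparams))
```

With a per-device batch size of 1 and 4 accumulation steps, gradients from 4 examples are accumulated before each update on a single device.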

eval_loss	eval_accuracy	epoch
66.323	0.063	1.0
59.935	0.055	2.0
60.344	0.056	3.0
58.559	0.054	4.0
56.373	0.051	5.0
58.011	0.053	6.0
64.814	0.059	7.0
54.974	0.048	8.0
59.489	0.055	9.0
55.248	0.049	10.0
51.685	0.044	11.0
54.073	0.048	12.0
57.350	0.051	13.0
54.031	0.048	14.0
53.526	0.048	15.0
53.041	0.047	16.0
55.731	0.050	17.0
52.224	0.045	18.0
52.757	0.046	19.0
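
For readers who want to inspect the log programmatically, here is a minimal sketch that parses the table above and reports the epoch with the lowest eval_loss and the epoch with the highest eval_accuracy. The card does not say which criterion, if any, was used to select a checkpoint, so both are shown; the `parse_log` helper is hypothetical.

```python
# Evaluation log copied verbatim from the table above.
LOG = """
eval_loss eval_accuracy epoch
66.323 0.063 1.0
59.935 0.055 2.0
60.344 0.056 3.0
58.559 0.054 4.0
56.373 0.051 5.0
58.011 0.053 6.0
64.814 0.059 7.0
54.974 0.048 8.0
59.489 0.055 9.0
55.248 0.049 10.0
51.685 0.044 11.0
54.073 0.048 12.0
57.350 0.051 13.0
54.031 0.048 14.0
53.526 0.048 15.0
53.041 0.047 16.0
55.731 0.050 17.0
52.224 0.045 18.0
52.757 0.046 19.0
"""


def parse_log(text):
    """Turn the whitespace-separated log into a list of dicts of floats."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, map(float, line.split()))) for line in lines[1:]]


rows = parse_log(LOG)
lowest_loss = min(rows, key=lambda r: r["eval_loss"])
highest_acc = max(rows, key=lambda r: r["eval_accuracy"])

print("lowest eval_loss:", lowest_loss)
print("highest eval_accuracy:", highest_acc)
```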
