---
language: en
license: mit
library_name: pytorch
---
# Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 1
- `gradient_accumulation_steps` = 4
- `weight_decay` = 1e-09
- `seed` = 42

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|66.323|0.063|1.0|
|59.935|0.055|2.0|
|60.344|0.056|3.0|
|58.559|0.054|4.0|
|56.373|0.051|5.0|
|58.011|0.053|6.0|
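
The hyperparameters and evaluation rows reported in this card can be read together with a short sketch. It derives the effective batch size per device from `per_device_batch_size` and `gradient_accumulation_steps`, and picks out the epochs with the lowest `eval_loss` and the highest `eval_accuracy`. The variable names (`eval_rows`, `best_by_loss`, `best_by_accuracy`) are illustrative and not part of the training code, and the number of devices is not stated in the card, so the batch size is per device only.

```python
# Values copied from the card; names below are illustrative, not from the trainer.
per_device_batch_size = 1
gradient_accumulation_steps = 4

# Samples that contribute to one optimizer step on a single device.
# The device count is not stated in the card, so this is a per-device figure.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# (epoch, eval_loss, eval_accuracy) rows copied from the results table.
eval_rows = [
    (1.0, 66.323, 0.063),
    (2.0, 59.935, 0.055),
    (3.0, 60.344, 0.056),
    (4.0, 58.559, 0.054),
    (5.0, 56.373, 0.051),
    (6.0, 58.011, 0.053),
]

# Epoch with the lowest evaluation loss.
best_by_loss = min(eval_rows, key=lambda row: row[1])

# Epoch with the highest evaluation accuracy.
best_by_accuracy = max(eval_rows, key=lambda row: row[2])

print("effective batch size per device:", effective_batch_size)
print("lowest eval_loss:", best_by_loss)
print("highest eval_accuracy:", best_by_accuracy)
```

In these rows the two criteria disagree: the lowest `eval_loss` falls at epoch 5.0, while the highest `eval_accuracy` falls at epoch 1.0, so the choice of checkpoint depends on which metric is used for selection.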