Hyperparameters:
- learning rate: 2e-5
- weight decay: 0.01
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- gradient_accumulation_steps: 1
- eval steps: 50000
- max_length: 512
- num_epochs: 1
- hidden_dropout_prob: 0.3
- attention_probs_dropout_prob: 0.25
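As a rough illustration of what these settings imply, the sketch below collects the listed values in a plain dictionary and estimates how many training examples have been processed at a given step. The key names `learning_rate`, `eval_steps`, and `num_train_epochs`, the helper `examples_seen`, and the `num_devices` parameter are assumptions made for this example; the number of devices used for training is not stated in this card.

```python
# Hyperparameters copied from the list above. Some key names are normalized
# for illustration and are an assumption, not the author's actual config.
hyperparameters = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_steps": 50_000,
    "max_length": 512,
    "num_train_epochs": 1,
    "hidden_dropout_prob": 0.3,
    "attention_probs_dropout_prob": 0.25,
}


def examples_seen(steps, config, num_devices=1):
    """Estimate training examples processed after `steps` optimizer steps.

    `num_devices` is hypothetical: the card does not say how many devices
    were used, so the default of 1 gives a lower bound.
    """
    examples_per_step = (
        config["per_device_train_batch_size"]
        * config["gradient_accumulation_steps"]
        * num_devices
    )
    return steps * examples_per_step


# Examples processed at the 300000-step checkpoint, assuming a single device.
print(examples_seen(300_000, hyperparameters))
```

With more than one device, the estimate scales linearly with `num_devices`.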
Dataset version:
- tasky_or_not/10xp3nirstbbflanseuni_10xc4
Checkpoint:
- 300000 steps.
Results on Validation set:
Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|
50000 | 0.020800 | 0.192550 | 0.970363 | 0.990686 | 0.949654 | 0.969736 |
100000 | 0.015200 | 0.264168 | 0.969427 | 0.994374 | 0.944196 | 0.968636 |
150000 | 0.012900 | 0.146541 | 0.981440 | 0.994599 | 0.968138 | 0.981190 |
200000 | 0.011100 | 0.319310 | 0.970516 | 0.998871 | 0.942097 | 0.969654 |
250000 | 0.008000 | 0.204103 | 0.976309 | 0.996226 | 0.956241 | 0.975824 |
300000 | 0.006100 | 0.096262 | 0.988053 | 0.994676 | 0.981358 | 0.987972 |
350000 | 0.005800 | 0.162989 | 0.983663 | 0.994730 | 0.972478 | 0.983478 |
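The table can be checked with a few lines of Python. The sketch below copies the rows above, recomputes F1 as the harmonic mean of precision and recall, and picks the step with the lowest validation loss. The helper names `f1_from` and `best_step` are hypothetical, and treating F1 as the plain binary harmonic mean is an assumption about how the metric was computed.

```python
# Rows copied from the results table:
# (step, training_loss, validation_loss, accuracy, precision, recall, f1)
results = [
    (50000, 0.020800, 0.192550, 0.970363, 0.990686, 0.949654, 0.969736),
    (100000, 0.015200, 0.264168, 0.969427, 0.994374, 0.944196, 0.968636),
    (150000, 0.012900, 0.146541, 0.981440, 0.994599, 0.968138, 0.981190),
    (200000, 0.011100, 0.319310, 0.970516, 0.998871, 0.942097, 0.969654),
    (250000, 0.008000, 0.204103, 0.976309, 0.996226, 0.956241, 0.975824),
    (300000, 0.006100, 0.096262, 0.988053, 0.994676, 0.981358, 0.987972),
    (350000, 0.005800, 0.162989, 0.983663, 0.994730, 0.972478, 0.983478),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (assumed binary F1)."""
    return 2 * precision * recall / (precision + recall)


def best_step(rows):
    """Return the step whose validation loss is lowest."""
    return min(rows, key=lambda row: row[2])[0]


for step, _, val_loss, _, precision, recall, f1 in results:
    recomputed = f1_from(precision, recall)
    print(f"{step}: reported F1={f1:.6f}, recomputed F1={recomputed:.6f}")

print("lowest validation loss at step", best_step(results))
```

Under these assumptions, the step with the lowest validation loss is also the one with the highest accuracy and F1 in the table.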
Wandb logs: