# baseline_seed-42_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.0426
- Accuracy: 0.4215
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 32000
- num_epochs: 20.0
- mixed_precision_training: Native AMP
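The hyperparameters above can be sanity-checked in a few lines: the total train batch size is the per-device batch size times the gradient accumulation steps. A minimal sketch (the dict keys mirror common Hugging Face `TrainingArguments` field names for readability, but this is not the original training script):

```python
# Hypothetical config dict mirroring the listed hyperparameters;
# key names follow TrainingArguments conventions, not the actual script.
config = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 32000,
    "num_train_epochs": 20.0,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "fp16": True,  # "Native AMP" mixed precision
}

# Effective (total) train batch size = per-device batch * accumulation steps.
effective_batch = (config["per_device_train_batch_size"]
                   * config["gradient_accumulation_steps"])
print(effective_batch)  # 256, matching total_train_batch_size above
```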
### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 6.4165        | 0.9996  | 1977  | 4.6002          | 0.2628   |
| 4.4252        | 1.9996  | 3955  | 3.9054          | 0.3265   |
| 3.8303        | 2.9997  | 5933  | 3.5867          | 0.3593   |
| 3.552         | 3.9997  | 7911  | 3.4361          | 0.3746   |
| 3.4003        | 4.9998  | 9889  | 3.3533          | 0.3826   |
| 3.3048        | 5.9999  | 11867 | 3.2995          | 0.3884   |
| 3.2405        | 6.9999  | 13845 | 3.2618          | 0.3921   |
| 3.1938        | 8.0     | 15823 | 3.2395          | 0.3949   |
| 3.1584        | 8.9996  | 17800 | 3.2179          | 0.3974   |
| 3.1331        | 9.9996  | 19778 | 3.1996          | 0.3994   |
| 3.1128        | 10.9997 | 21756 | 3.1903          | 0.4005   |
| 3.0941        | 11.9997 | 23734 | 3.1839          | 0.4014   |
| 3.0833        | 12.9998 | 25712 | 3.1728          | 0.4024   |
| 3.0736        | 13.9999 | 27690 | 3.1701          | 0.4029   |
| 3.0665        | 14.9999 | 29668 | 3.1649          | 0.4034   |
| 3.0616        | 16.0    | 31646 | 3.1627          | 0.4037   |
| 3.0446        | 16.9996 | 33623 | 3.1264          | 0.4081   |
| 2.9699        | 17.9996 | 35601 | 3.0900          | 0.4133   |
| 2.8822        | 18.9997 | 37579 | 3.0569          | 0.4182   |
| 2.7774        | 19.9912 | 39540 | 3.0426          | 0.4215   |
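The step counts in the table are internally consistent with the hyperparameters: each epoch covers roughly 1977 optimizer steps, which at an effective batch size of 256 implies on the order of 500k training examples per epoch (an inference, not stated in this card). A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the table's step counts.
steps_per_epoch = 1977        # step count at epoch ~1.0 in the table
total_train_batch_size = 256  # from the hyperparameters above
num_epochs = 20

# Implied dataset size (an estimate, not stated in the card).
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
print(approx_examples_per_epoch)  # 506112

# Total optimizer steps over training, matching the table's final 39540.
approx_total_steps = steps_per_epoch * num_epochs
print(approx_total_steps)  # 39540
```

Note that `lr_scheduler_warmup_steps` (32000) covers roughly 80% of the ~39540 total steps, so the linear schedule spends most of training in warmup; the late drop in validation loss around epochs 17–20 coincides with the learning rate finally decaying.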
### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.1+cu121
- Datasets 2.19.1
- Tokenizers 0.20.0