# chessgpt-small-l
This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.8545
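The model can be loaded with the `transformers` library. Below is a minimal generation sketch; the PGN-style chess prompt is an assumption based on the model name, since the training data is not documented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dakwi/chessgpt-small-l")
model = AutoModelForCausalLM.from_pretrained("dakwi/chessgpt-small-l")

# Prompt with a common opening; the movetext input format is an assumption.
inputs = tokenizer("1. e4 e5 2. Nf3", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```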
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` reconstruction follows the list):
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 1
- mixed_precision_training: Native AMP
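As a sketch only, the hyperparameters above map onto a `TrainingArguments` configuration roughly as follows; `output_dir` and the dataset wiring are placeholders, and the optimizer settings are the `transformers` defaults, which match the values listed.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="chessgpt-small-l",       # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,       # effective train batch size: 16 * 4 = 64
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=1,
    fp16=True,                           # "Native AMP" mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the library defaults.
)
```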
### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.7778        | 0.032 | 500   | 1.6672          |
| 1.5449        | 0.064 | 1000  | 1.4564          |
| 1.4074        | 0.096 | 1500  | 1.3260          |
| 1.3303        | 0.128 | 2000  | 1.2546          |
| 1.2713        | 0.16  | 2500  | 1.1983          |
| 1.2207        | 0.192 | 3000  | 1.1559          |
| 1.1735        | 0.224 | 3500  | 1.1085          |
| 1.1286        | 0.256 | 4000  | 1.0697          |
| 1.0956        | 0.288 | 4500  | 1.0391          |
| 1.0691        | 0.32  | 5000  | 1.0118          |
| 1.0498        | 0.352 | 5500  | 0.9915          |
| 1.0277        | 0.384 | 6000  | 0.9749          |
| 1.011         | 0.416 | 6500  | 0.9611          |
| 0.9998        | 0.448 | 7000  | 0.9477          |
| 0.9867        | 0.48  | 7500  | 0.9374          |
| 0.976         | 0.512 | 8000  | 0.9271          |
| 0.9693        | 0.544 | 8500  | 0.9196          |
| 0.9597        | 0.576 | 9000  | 0.9101          |
| 0.9535        | 0.608 | 9500  | 0.9036          |
| 0.9447        | 0.64  | 10000 | 0.8974          |
| 0.94          | 0.672 | 10500 | 0.8913          |
| 0.9323        | 0.704 | 11000 | 0.8857          |
| 0.9272        | 0.736 | 11500 | 0.8809          |
| 0.9224        | 0.768 | 12000 | 0.8753          |
| 0.9183        | 0.8   | 12500 | 0.8717          |
| 0.9127        | 0.832 | 13000 | 0.8669          |
| 0.9082        | 0.864 | 13500 | 0.8639          |
| 0.9055        | 0.896 | 14000 | 0.8609          |
| 0.9035        | 0.928 | 14500 | 0.8580          |
| 0.9003        | 0.96  | 15000 | 0.8558          |
| 0.8975        | 0.992 | 15500 | 0.8547          |
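For reference, the final evaluation loss corresponds to a perplexity of roughly 2.35, assuming the reported value is the mean per-token cross-entropy that the Trainer logs by default:

```python
import math

# Perplexity implied by the final validation cross-entropy loss.
print(math.exp(0.8545))  # ~2.35
```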
### Framework versions
- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1