great-books-bot-4
This model is a fine-tuned version of erikanesse/great-books-bot. The training dataset is not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:
- Loss: 3.9485
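For intuition, the evaluation loss can be converted to a perplexity. This is only a sketch: it assumes the reported loss is a mean per-token cross-entropy measured in nats, which the card does not state.

```python
import math

eval_loss = 3.9485  # final evaluation loss reported above

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

If the loss were defined differently (for example, summed over tokens), this conversion would not apply.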
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: tpu
- gradient_accumulation_steps: 5
- total_train_batch_size: 5
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2000
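The relationship between these values can be sketched in a few lines. The device count and the absence of warmup are assumptions (the card lists neither), and `linear_lr` is a hypothetical helper that mirrors a linear decay to zero, not code taken from the training run.

```python
learning_rate = 5e-05
train_batch_size = 1
gradient_accumulation_steps = 5
training_steps = 2000

num_devices = 1   # assumption: consistent with total_train_batch_size = 5
warmup_steps = 0  # assumption: no warmup is listed in the card

# Examples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / max(1, training_steps - warmup_steps)

# Total examples processed over the whole run.
examples_seen = effective_batch_size * training_steps

print(effective_batch_size, examples_seen, linear_lr(1000))
```

Under these assumptions the run processes 10,000 examples in total, and the learning rate is halved by step 1000.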
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
3.8623 | 0.01 | 20 | 4.0416 |
3.8944 | 0.02 | 40 | 4.0495 |
4.0613 | 0.03 | 60 | 4.0321 |
3.9456 | 0.04 | 80 | 4.0543 |
3.9298 | 0.05 | 100 | 4.0518 |
3.951 | 0.06 | 120 | 4.0281 |
3.8512 | 0.07 | 140 | 4.0309 |
4.0738 | 0.08 | 160 | 4.0358 |
3.9713 | 0.09 | 180 | 4.0130 |
3.9149 | 0.1 | 200 | 4.0469 |
3.9674 | 0.11 | 220 | 4.0395 |
4.1288 | 0.12 | 240 | 4.0422 |
3.9388 | 0.13 | 260 | 4.0552 |
4.0308 | 0.14 | 280 | 4.0643 |
3.9006 | 0.15 | 300 | 4.0624 |
3.9531 | 0.16 | 320 | 4.0482 |
3.985 | 0.17 | 340 | 4.0538 |
3.8275 | 0.18 | 360 | 4.0420 |
3.9002 | 0.19 | 380 | 4.0525 |
3.8415 | 0.2 | 400 | 4.0395 |
3.888 | 0.21 | 420 | 4.0422 |
3.8562 | 0.22 | 440 | 4.0171 |
3.9306 | 0.23 | 460 | 4.0280 |
3.8681 | 0.24 | 480 | 4.0340 |
3.8337 | 0.25 | 500 | 4.0101 |
3.853 | 0.26 | 520 | 4.0334 |
3.9567 | 0.27 | 540 | 4.0421 |
3.9489 | 0.28 | 560 | 4.0390 |
3.9693 | 0.29 | 580 | 4.0221 |
4.0172 | 0.3 | 600 | 4.0040 |
3.8797 | 0.31 | 620 | 4.0225 |
3.8353 | 0.32 | 640 | 4.0177 |
3.7483 | 0.33 | 660 | 4.0173 |
3.8179 | 0.34 | 680 | 4.0037 |
3.8957 | 0.35 | 700 | 4.0079 |
3.9634 | 0.36 | 720 | 4.0300 |
3.8911 | 0.37 | 740 | 4.0261 |
3.9296 | 0.38 | 760 | 4.0224 |
3.9195 | 0.39 | 780 | 4.0133 |
3.8455 | 0.4 | 800 | 4.0061 |
3.7869 | 0.41 | 820 | 4.0191 |
3.9417 | 0.42 | 840 | 4.0198 |
3.8437 | 0.43 | 860 | 4.0213 |
3.9039 | 0.44 | 880 | 3.9977 |
3.9966 | 0.45 | 900 | 4.0057 |
3.8641 | 0.46 | 920 | 3.9954 |
3.9355 | 0.47 | 940 | 4.0063 |
3.9496 | 0.48 | 960 | 3.9946 |
3.9551 | 0.49 | 980 | 4.0031 |
3.9199 | 0.5 | 1000 | 4.0071 |
3.9086 | 0.51 | 1020 | 3.9915 |
3.8584 | 0.52 | 1040 | 4.0360 |
3.8317 | 0.53 | 1060 | 4.0207 |
3.7805 | 0.54 | 1080 | 3.9918 |
3.7688 | 0.55 | 1100 | 3.9961 |
3.7708 | 0.56 | 1120 | 3.9960 |
3.8142 | 0.57 | 1140 | 3.9882 |
3.8483 | 0.58 | 1160 | 3.9935 |
3.9744 | 0.59 | 1180 | 3.9856 |
3.7912 | 0.6 | 1200 | 3.9911 |
3.8575 | 0.61 | 1220 | 3.9993 |
3.9246 | 0.62 | 1240 | 3.9908 |
3.786 | 0.63 | 1260 | 3.9947 |
3.8222 | 0.64 | 1280 | 3.9734 |
3.8651 | 0.65 | 1300 | 3.9748 |
3.8582 | 0.66 | 1320 | 3.9598 |
3.8398 | 0.67 | 1340 | 3.9538 |
3.9176 | 0.68 | 1360 | 3.9563 |
3.8013 | 0.69 | 1380 | 3.9660 |
3.9405 | 0.7 | 1400 | 3.9716 |
3.9182 | 0.71 | 1420 | 3.9725 |
3.8787 | 0.72 | 1440 | 3.9654 |
3.8406 | 0.73 | 1460 | 3.9525 |
3.8775 | 0.74 | 1480 | 3.9601 |
3.845 | 0.75 | 1500 | 3.9602 |
3.7902 | 0.76 | 1520 | 3.9585 |
3.7236 | 0.77 | 1540 | 3.9643 |
3.7734 | 0.78 | 1560 | 3.9600 |
3.7393 | 0.79 | 1580 | 3.9605 |
3.8666 | 0.8 | 1600 | 3.9557 |
3.9247 | 0.81 | 1620 | 3.9596 |
3.7902 | 0.82 | 1640 | 3.9545 |
3.7469 | 0.83 | 1660 | 3.9552 |
3.7093 | 0.84 | 1680 | 3.9573 |
3.8674 | 0.85 | 1700 | 3.9568 |
3.8095 | 0.86 | 1720 | 3.9546 |
3.899 | 0.87 | 1740 | 3.9537 |
3.806 | 0.88 | 1760 | 3.9541 |
3.8183 | 0.89 | 1780 | 3.9610 |
3.7686 | 0.9 | 1800 | 3.9537 |
3.7624 | 0.91 | 1820 | 3.9535 |
3.8613 | 0.92 | 1840 | 3.9497 |
3.6526 | 0.93 | 1860 | 3.9491 |
3.7679 | 0.94 | 1880 | 3.9518 |
3.7848 | 0.95 | 1900 | 3.9523 |
3.7083 | 0.96 | 1920 | 3.9492 |
3.8315 | 0.97 | 1940 | 3.9484 |
3.7784 | 0.98 | 1960 | 3.9485 |
3.8287 | 0.99 | 1980 | 3.9484 |
3.8246 | 1.0 | 2000 | 3.9485 |
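A table in this layout can be summarized programmatically. The sketch below parses a short excerpt of the rows above (not the full table) and reports the lowest validation loss in that excerpt and the change from the first to the last evaluation. `parse_rows` is a hypothetical helper written for this illustration.

```python
# Excerpt of rows copied from the table above (not the full table).
excerpt = """
3.8623 | 0.01 | 20 | 4.0416 |
3.9199 | 0.5 | 1000 | 4.0071 |
3.8315 | 0.97 | 1940 | 3.9484 |
3.7784 | 0.98 | 1960 | 3.9485 |
3.8287 | 0.99 | 1980 | 3.9484 |
3.8246 | 1.0 | 2000 | 3.9485 |
"""

def parse_rows(text):
    """Parse pipe-separated rows into dicts with numeric fields."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split("|") if c.strip()]
        train_loss, epoch, step, val_loss = cells
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows

rows = parse_rows(excerpt)

# min() returns the first row on ties, so the earliest best step is reported.
best = min(rows, key=lambda r: r["val_loss"])
improvement = rows[0]["val_loss"] - rows[-1]["val_loss"]

print(best["step"], best["val_loss"], round(improvement, 4))
```

The validation loss falls by less than 0.1 between the first and last evaluations, so the gain from this fine-tuning run is modest.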
Framework versions
- Transformers 4.21.1
- Pytorch 1.9.0+cu102
- Datasets 2.4.0
- Tokenizers 0.12.1