# falcon7b-chunks-10k-v1_1

This model is a fine-tuned version of [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2.6306
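Since the Trainer reports mean token-level cross-entropy, this evaluation loss corresponds to a perplexity of roughly e^2.6306 ≈ 13.9. Below is a minimal loading-and-generation sketch, assuming the weights are published as a standard causal LM checkpoint; the repo id and prompt are placeholders (the training dataset is unknown), and with Transformers 4.30.x the Falcon architecture still required `trust_remote_code=True`.

```python
# Minimal sketch: load the fine-tuned checkpoint and generate.
# The repo id and prompt are placeholders, not names from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/falcon7b-chunks-10k-v1_1"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to fit a 7B model on one GPU
    device_map="auto",
    trust_remote_code=True,      # required for Falcon under Transformers 4.30.x
)

inputs = tokenizer("Write a short passage:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```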
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 1500
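
The settings above map onto the Transformers 4.30.x `TrainingArguments` API roughly as in the sketch below. Only the values listed in this card are taken as given; the output directory, evaluation/logging cadence, and `optim` name are assumptions, and the betas and epsilon listed are the optimizer defaults.

```python
# Minimal sketch, assuming the Transformers 4.30.x Trainer API.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="falcon7b-chunks-10k-v1_1",  # assumed output path
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # 4 x 4 = total train batch size of 16
    max_steps=1500,
    lr_scheduler_type="constant",    # note: a purely constant schedule applies no warmup
    warmup_ratio=0.03,               # kept as listed in the card regardless
    seed=42,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), eps=1e-8 (defaults)
    evaluation_strategy="steps",     # assumption: matches the every-10-steps results table
    eval_steps=10,
    logging_steps=10,
)
```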
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.9498 | 0.57 | 10 | 1.8499 |
| 1.8859 | 1.14 | 20 | 1.8045 |
| 1.7494 | 1.71 | 30 | 1.7604 |
| 1.6834 | 2.29 | 40 | 1.7328 |
| 1.4871 | 2.86 | 50 | 1.7014 |
| 1.3791 | 3.43 | 60 | 1.7721 |
| 1.315 | 4.0 | 70 | 1.7826 |
| 1.116 | 4.57 | 80 | 1.8231 |
| 1.0471 | 5.14 | 90 | 1.9451 |
| 0.871 | 5.71 | 100 | 1.8934 |
| 0.8318 | 6.29 | 110 | 1.9911 |
| 0.7367 | 6.86 | 120 | 2.0047 |
| 0.6208 | 7.43 | 130 | 2.1256 |
| 0.5716 | 8.0 | 140 | 2.0944 |
| 0.4414 | 8.57 | 150 | 2.1695 |
| 0.415 | 9.14 | 160 | 2.3015 |
| 0.3441 | 9.71 | 170 | 2.3119 |
| 0.2823 | 10.29 | 180 | 2.3515 |
| 0.2576 | 10.86 | 190 | 2.3800 |
| 0.2074 | 11.43 | 200 | 2.4828 |
| 0.1984 | 12.0 | 210 | 2.4218 |
| 0.151 | 12.57 | 220 | 2.5316 |
| 0.144 | 13.14 | 230 | 2.6191 |
| 0.1255 | 13.71 | 240 | 2.6013 |
| 0.1086 | 14.29 | 250 | 2.6306 |
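
Note that the 2.6306 reported above is simply the value at the last logged evaluation step (250): validation loss was lowest (1.7014) at step 50 and climbed steadily afterward while training loss kept falling, a divergence consistent with overfitting.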
### Framework versions

- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.0
- Tokenizers 0.13.3