---
library_name: peft
license: other
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- llama-factory
- lora
- unsloth
- generated_from_trainer
model-index:
- name: llm3br256
  results: []
---

# llm3br256

This model is a fine-tuned version of [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct) on the gommt dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

- Loss: 0.0206
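
A minimal inference sketch for loading the LoRA adapter on top of the base model. The `adapter_id` is a placeholder, not a confirmed repo id; substitute this repository's id or a local path. The `peft`/`transformers` versions listed at the bottom of this card are assumed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/Llama-3.2-3B-Instruct"  # base model from the metadata above
adapter_id = "path/to/llm3br256-adapter"   # placeholder: replace with this repo's id or a local path

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA weights

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For adapter-free inference you can also fold the LoRA weights into the base model with `model.merge_and_unload()`.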

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `transformers` equivalent is sketched after the list):

- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 25.0
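
For reference, a sketch of an equivalent `transformers.TrainingArguments` mirroring the list above. The run itself used LLaMA-Factory, so this is an approximation: `output_dir` and the mapping of `train_batch_size` to a per-device batch size are assumptions, not the original config.

```python
from transformers import TrainingArguments

# Hedged equivalent of the hyperparameters above; "train_batch_size" is assumed
# to map to the per-device batch size, which may differ from the original setup.
args = TrainingArguments(
    output_dir="llm3br256",           # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",              # AdamW (torch) with the betas/epsilon below
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=25.0,
)
```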

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.232         | 0.1613 | 25   | 0.2193          |
| 0.1524        | 0.3226 | 50   | 0.1507          |
| 0.115         | 0.4839 | 75   | 0.1165          |
| 0.0875        | 0.6452 | 100  | 0.1004          |
| 0.092         | 0.8065 | 125  | 0.0909          |
| 0.1077        | 0.9677 | 150  | 0.0900          |
| 0.0688        | 1.1290 | 175  | 0.0778          |
| 0.0682        | 1.2903 | 200  | 0.0723          |
| 0.0621        | 1.4516 | 225  | 0.0668          |
| 0.0668        | 1.6129 | 250  | 0.0646          |
| 0.0672        | 1.7742 | 275  | 0.0587          |
| 0.0484        | 1.9355 | 300  | 0.0544          |
| 0.0468        | 2.0968 | 325  | 0.0516          |
| 0.0438        | 2.2581 | 350  | 0.0503          |
| 0.0364        | 2.4194 | 375  | 0.0493          |
| 0.0365        | 2.5806 | 400  | 0.0460          |
| 0.0469        | 2.7419 | 425  | 0.0432          |
| 0.027         | 2.9032 | 450  | 0.0379          |
| 0.026         | 3.0645 | 475  | 0.0356          |
| 0.0223        | 3.2258 | 500  | 0.0357          |
| 0.0228        | 3.3871 | 525  | 0.0352          |
| 0.0199        | 3.5484 | 550  | 0.0336          |
| 0.0227        | 3.7097 | 575  | 0.0308          |
| 0.0207        | 3.8710 | 600  | 0.0292          |
| 0.0125        | 4.0323 | 625  | 0.0304          |
| 0.0146        | 4.1935 | 650  | 0.0279          |
| 0.0126        | 4.3548 | 675  | 0.0283          |
| 0.0141        | 4.5161 | 700  | 0.0270          |
| 0.0133        | 4.6774 | 725  | 0.0254          |
| 0.0098        | 4.8387 | 750  | 0.0250          |
| 0.0093        | 5.0    | 775  | 0.0234          |
| 0.0073        | 5.1613 | 800  | 0.0247          |
| 0.0087        | 5.3226 | 825  | 0.0254          |
| 0.0102        | 5.4839 | 850  | 0.0242          |
| 0.0077        | 5.6452 | 875  | 0.0230          |
| 0.0085        | 5.8065 | 900  | 0.0230          |
| 0.0069        | 5.9677 | 925  | 0.0213          |
| 0.0056        | 6.1290 | 950  | 0.0226          |
| 0.0063        | 6.2903 | 975  | 0.0224          |
| 0.0055        | 6.4516 | 1000 | 0.0227          |
| 0.0067        | 6.6129 | 1025 | 0.0229          |
| 0.0052        | 6.7742 | 1050 | 0.0224          |
| 0.008         | 6.9355 | 1075 | 0.0219          |
| 0.0053        | 7.0968 | 1100 | 0.0227          |
| 0.0049        | 7.2581 | 1125 | 0.0220          |
| 0.0059        | 7.4194 | 1150 | 0.0218          |
| 0.0045        | 7.5806 | 1175 | 0.0215          |
| 0.0058        | 7.7419 | 1200 | 0.0206          |
| 0.0047        | 7.9032 | 1225 | 0.0207          |
| 0.0043        | 8.0645 | 1250 | 0.0223          |
| 0.0046        | 8.2258 | 1275 | 0.0218          |
| 0.0036        | 8.3871 | 1300 | 0.0225          |
| 0.0034        | 8.5484 | 1325 | 0.0216          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- Pytorch 2.4.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3