DEBUG.SUBSET.Meta-Llama-3-8B_chain_of_thought.gpu26_2024-07-22_11-37-36

This model is a PEFT adapter fine-tuned from unsloth/llama-3-8b-bnb-4bit on an undocumented dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1091
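
No usage snippet ships with this card, so the following is a minimal, hypothetical sketch of loading the adapter on top of the 4-bit base model with PEFT. It assumes bitsandbytes and accelerate are installed; the adapter repo id is a placeholder, and the step-by-step prompt merely reflects the chain_of_thought tag in the model name:

```python
# Hypothetical usage sketch: attach this PEFT adapter to the 4-bit base model.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/llama-3-8b-bnb-4bit"
adapter_id = "your-username/this-adapter"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# The model name suggests chain-of-thought fine-tuning, so ask for steps.
prompt = "Explain step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```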

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
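
For context, these settings map roughly onto a transformers.TrainingArguments configuration like the sketch below; this is a reconstruction for illustration, not the original training script:

```python
# Approximate reconstruction of the hyperparameters above; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), epsilon=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
)
```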

Training results

Training Loss | Epoch  | Step | Validation Loss
0.1228        | 0.1505 |  157 | 0.1327
0.1136        | 0.3011 |  314 | 0.1213
0.1252        | 0.4516 |  471 | 0.1173
0.1226        | 0.6021 |  628 | 0.1141
0.1185        | 0.7526 |  785 | 0.1129
0.1039        | 0.9032 |  942 | 0.1110
0.1024        | 1.0537 | 1099 | 0.1119
0.0940        | 1.2042 | 1256 | 0.1115
0.0888        | 1.3547 | 1413 | 0.1116
0.0944        | 1.5053 | 1570 | 0.1107
0.0827        | 1.6558 | 1727 | 0.1101
0.0901        | 1.8063 | 1884 | 0.1100
0.0870        | 1.9569 | 2041 | 0.1091
0.0668        | 2.1074 | 2198 | 0.1164
0.0648        | 2.2579 | 2355 | 0.1167
0.0672        | 2.4084 | 2512 | 0.1164
0.0692        | 2.5590 | 2669 | 0.1165
0.0732        | 2.7095 | 2826 | 0.1165
0.0686        | 2.8600 | 2983 | 0.1162

The reported evaluation loss of 0.1091 matches the checkpoint at step 2041 (epoch ≈1.96); validation loss rises slightly over the third epoch, which suggests the best checkpoint precedes the end of training.

Framework versions

  • PEFT 0.11.1
  • Transformers 4.42.4
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
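
To approximate this environment, the versions above can be pinned, e.g. `pip install peft==0.11.1 transformers==4.42.4 datasets==2.20.0 tokenizers==0.19.1`, with `torch==2.3.0` installed from the CUDA 12.1 wheel index (`--index-url https://download.pytorch.org/whl/cu121`).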