train_record_42_1779354540

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3537
  • Num Input Tokens Seen: 49166912
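
The evaluation loss can be easier to interpret as a perplexity. The short sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is not stated in this card; if the loss was computed differently, the conversion does not apply.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.3537

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.4f}")
```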

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
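
To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, written in plain Python. It is an illustration rather than the training code used for this model. The total step count (`TOTAL_STEPS`) is an assumption estimated from the step/epoch ratio in the training results, and the function name `cosine_lr_with_warmup` is hypothetical.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=2e-6, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup, then a half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

# Assumption: roughly 15620 optimizer steps in one epoch (estimated, not reported).
TOTAL_STEPS = 15620

for step in (0, 782, 7820, 14858, TOTAL_STEPS):
    print(step, cosine_lr_with_warmup(step, TOTAL_STEPS))
```

Under these assumptions the rate peaks at 2e-06 once warmup ends (about 10% of the way through the epoch) and decays smoothly toward zero by the final step.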

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|---------------|--------|-------|-----------------|-------------------|
| 0.764         | 0.0501 | 782   | 0.6362          | 2474432           |
| 0.6569        | 0.1001 | 1564  | 0.5395          | 4931328           |
| 0.5165        | 0.1502 | 2346  | 0.5076          | 7397056           |
| 0.5774        | 0.2002 | 3128  | 0.4884          | 9832064           |
| 0.4226        | 0.2503 | 3910  | 0.4644          | 12304064          |
| 0.3784        | 0.3004 | 4692  | 0.4629          | 14775488          |
| 0.579         | 0.3504 | 5474  | 0.4391          | 17259840          |
| 0.3547        | 0.4005 | 6256  | 0.4233          | 19707456          |
| 0.3794        | 0.4505 | 7038  | 0.4316          | 22178432          |
| 0.3939        | 0.5006 | 7820  | 0.4134          | 24646208          |
| 0.3382        | 0.5507 | 8602  | 0.3985          | 27101056          |
| 0.4           | 0.6007 | 9384  | 0.3917          | 29544576          |
| 0.3188        | 0.6508 | 10166 | 0.3784          | 32010176          |
| 0.4233        | 0.7009 | 10948 | 0.3722          | 34475136          |
| 0.2995        | 0.7509 | 11730 | 0.3648          | 36931648          |
| 0.3367        | 0.8010 | 12512 | 0.3636          | 39382144          |
| 0.272         | 0.8510 | 13294 | 0.3571          | 41847872          |
| 0.3451        | 0.9011 | 14076 | 0.3544          | 44318848          |
| 0.2502        | 0.9512 | 14858 | 0.3537          | 46767552          |
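
As a rough sanity check on the results above, the step and token columns give an average token throughput per optimizer step. The sketch below uses only the first and last logged rows. The per-example figure additionally assumes a single device with no gradient accumulation, so that each step processes exactly `train_batch_size` = 8 examples; the card does not confirm this.

```python
# First and last evaluation checkpoints from the training results: (step, input tokens seen).
first_step, first_tokens = 782, 2474432
last_step, last_tokens = 14858, 46767552

# Average number of input tokens consumed per optimizer step between the two checkpoints.
tokens_per_step = (last_tokens - first_tokens) / (last_step - first_step)

# Assumption: one device, no gradient accumulation, so 8 examples per step.
TRAIN_BATCH_SIZE = 8
tokens_per_example = tokens_per_step / TRAIN_BATCH_SIZE

print(f"tokens per step: {tokens_per_step:.1f}")
print(f"tokens per example (assumed batch of 8): {tokens_per_example:.1f}")
```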

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4