c4-model

This model is a fine-tuned version of bowphs/pythia-70m-multi, trained on the English (en) subset of the allenai/c4 dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5532
  • Accuracy: 0.3716
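
The loss above can be turned into a perplexity if it is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated on this card. Under that assumption, a minimal sketch of the conversion (variable names are illustrative) looks like this:

```python
import math

# Evaluation loss reported above.
# Assumption: this is mean per-token cross-entropy measured in nats.
eval_loss = 3.5532

# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.1f}")
```

Under this assumption the reported loss corresponds to a perplexity of roughly 34.9.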

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 30000
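
To make the linear scheduler concrete, the sketch below computes the learning rate at a given step from the values listed above. It assumes zero warmup steps, since none are listed here, and a decay to zero at the final step; the function name `linear_lr` and the constants are illustrative, not taken from the training code.

```python
BASE_LR = 5e-05      # learning_rate from the list above
TOTAL_STEPS = 30000  # training_steps from the list above
WARMUP_STEPS = 0     # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer steps: linear warmup,
    then linear decay to zero at `total_steps`."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 15000, 30000):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at the full 5e-05, is halved at step 15000, and reaches zero at step 30000.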

Training results

Training Loss   Epoch    Step    Validation Loss   Accuracy
No log          0.0000   1       10.7029           0.0164
No log          0.0001   2       10.5331           0.0496
No log          0.0001   4       10.3022           0.0533
No log          0.0003   8       10.0235           0.0536
No log          0.0005   16      9.6536            0.0635
No log          0.0011   32      9.0284            0.0759
No log          0.0021   64      8.0249            0.0832
No log          0.0043   128     6.9172            0.1129
No log          0.0085   256     6.1629            0.1558
No log          0.0171   512     5.5805            0.1817
No log          0.0341   1024    5.1235            0.2028
5.4529          0.0667   2000    4.7613            0.2264
5.4529          0.0683   2048    4.7481            0.2281
4.5765          0.1333   4000    4.4123            0.2610
4.5765          0.1365   4096    4.4043            0.2625
4.3252          0.2      6000    4.2221            0.2827
4.146           0.2667   8000    4.0350            0.3098
4.146           0.2731   8192    4.0134            0.3129
3.9652          0.3333   10000   3.8860            0.3304
3.8441          0.4      12000   3.8005            0.3418
3.7739          0.4667   14000   3.7315            0.3503
3.72            0.5333   16000   3.6880            0.3553
3.72            0.5461   16384   3.6777            0.3564
3.6718          0.6      18000   3.6533            0.3593
3.6527          0.6667   20000   3.6212            0.3633
3.6201          0.7333   22000   3.5985            0.3660
3.593           0.8      24000   3.5819            0.3679
3.5857          0.8667   26000   3.5683            0.3697
3.5801          0.9333   28000   3.5582            0.3711
3.5649          1.0      30000   3.5532            0.3716
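
The Step column above appears to combine two schedules: every power of two, plus every multiple of 2000, up to the final step. The Epoch column matches step / 30000. This is an inference from the table rather than a documented setting, and the helper names below are hypothetical. A short sketch that reproduces both columns under that reading:

```python
TOTAL_STEPS = 30000   # training_steps reported for this run
EVAL_INTERVAL = 2000  # assumption: inferred from the table, not documented


def eval_steps(total_steps=TOTAL_STEPS, interval=EVAL_INTERVAL):
    """Steps at which evaluation appears to run: powers of two
    merged with regular multiples of `interval`."""
    powers = set()
    s = 1
    while s <= total_steps:
        powers.add(s)
        s *= 2
    regular = set(range(interval, total_steps + 1, interval))
    return sorted(powers | regular)


def epoch_fraction(step, total_steps=TOTAL_STEPS):
    """Epoch value as shown in the table, rounded to four decimals."""
    return round(step / total_steps, 4)


steps = eval_steps()
print(len(steps), steps[:12])
print(epoch_fraction(2000), epoch_fraction(16384))
```

The dense early checkpoints are what capture the steep initial drop in validation loss, while the regular 2000-step interval covers the rest of training.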

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 70.4M params (Safetensors, tensor type F32)