metadata
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: vietcuna-3b_2048
    results: []

vietcuna-3b_2048

This model was trained from scratch; the training dataset is not recorded in this card (the dataset field is None). It achieves the following results on the evaluation set at the final training step (step 1000):

  • Loss: 0.5250
  • Accuracy: 0.7375

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.18
  • training_steps: 1000
  • mixed_precision_training: Native AMP
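The hyperparameters above imply a few derived quantities: the effective batch size, the number of warmup steps, and the learning rate at any given step. The block below is a minimal sketch of that arithmetic, not the author's training script. The `training_kwargs` dict is a hypothetical mapping onto `transformers.TrainingArguments` keyword names; it assumes a single device (so the per-device batch size equals `train_batch_size`) and assumes that "Native AMP" meant `fp16` (it could equally have been `bf16`). The `linear_lr` helper is an illustrative name that mirrors a linear warmup followed by linear decay to zero.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
EVAL_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.18
TRAINING_STEPS = 1000

# Hypothetical mapping onto transformers.TrainingArguments keyword names.
# Assumptions: one device, and fp16 as the mixed-precision mode.
# A real run would also need an output_dir, which the card does not record.
training_kwargs = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": TRAIN_BATCH_SIZE,
    "per_device_eval_batch_size": EVAL_BATCH_SIZE,
    "seed": 42,
    "gradient_accumulation_steps": GRADIENT_ACCUMULATION_STEPS,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_ratio": WARMUP_RATIO,
    "max_steps": TRAINING_STEPS,
    "fp16": True,  # assumption: the card only says "Native AMP"
}

# Examples consumed per optimizer update (matches total_train_batch_size).
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# Number of optimizer steps spent ramping the learning rate up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 90, 180, 590, 1000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 180 and falls to zero at step 1000, so most of the evaluations logged below happen during the decay phase.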

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.5694        | 1.05  | 50   | 0.5834          | 0.7087   |
| 0.5614        | 2.1   | 100  | 0.5772          | 0.7165   |
| 0.5475        | 3.15  | 150  | 0.5684          | 0.7165   |
| 0.5503        | 4.2   | 200  | 0.5605          | 0.7087   |
| 0.5305        | 5.25  | 250  | 0.5784          | 0.7192   |
| 0.5353        | 6.3   | 300  | 0.5451          | 0.7323   |
| 0.5063        | 7.35  | 350  | 0.5441          | 0.7270   |
| 0.5141        | 8.4   | 400  | 0.5365          | 0.7244   |
| 0.5035        | 9.45  | 450  | 0.5354          | 0.7297   |
| 0.493         | 10.5  | 500  | 0.5322          | 0.7297   |
| 0.4763        | 11.55 | 550  | 0.5299          | 0.7375   |
| 0.5063        | 12.6  | 600  | 0.5295          | 0.7375   |
| 0.4787        | 13.65 | 650  | 0.5280          | 0.7297   |
| 0.4841        | 14.7  | 700  | 0.5266          | 0.7375   |
| 0.4732        | 15.75 | 750  | 0.5283          | 0.7297   |
| 0.4801        | 16.8  | 800  | 0.5259          | 0.7375   |
| 0.4651        | 17.85 | 850  | 0.5256          | 0.7375   |
| 0.4726        | 18.9  | 900  | 0.5260          | 0.7323   |
| 0.4758        | 19.95 | 950  | 0.5248          | 0.7375   |
| 0.4701        | 21.0  | 1000 | 0.5250          | 0.7375   |
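The evaluation log above can be summarized with a few lines of Python. This is a small sketch for reading the table, not part of the original training code; the `ROWS` name and tuple layout are illustrative. The steps-per-epoch figure is only an estimate derived from the logged epoch values, since the card does not state the dataset size.

```python
# (step, epoch, training_loss, validation_loss, accuracy), copied from the table above.
ROWS = [
    (50, 1.05, 0.5694, 0.5834, 0.7087),
    (100, 2.1, 0.5614, 0.5772, 0.7165),
    (150, 3.15, 0.5475, 0.5684, 0.7165),
    (200, 4.2, 0.5503, 0.5605, 0.7087),
    (250, 5.25, 0.5305, 0.5784, 0.7192),
    (300, 6.3, 0.5353, 0.5451, 0.7323),
    (350, 7.35, 0.5063, 0.5441, 0.7270),
    (400, 8.4, 0.5141, 0.5365, 0.7244),
    (450, 9.45, 0.5035, 0.5354, 0.7297),
    (500, 10.5, 0.493, 0.5322, 0.7297),
    (550, 11.55, 0.4763, 0.5299, 0.7375),
    (600, 12.6, 0.5063, 0.5295, 0.7375),
    (650, 13.65, 0.4787, 0.5280, 0.7297),
    (700, 14.7, 0.4841, 0.5266, 0.7375),
    (750, 15.75, 0.4732, 0.5283, 0.7297),
    (800, 16.8, 0.4801, 0.5259, 0.7375),
    (850, 17.85, 0.4651, 0.5256, 0.7375),
    (900, 18.9, 0.4726, 0.5260, 0.7323),
    (950, 19.95, 0.4758, 0.5248, 0.7375),
    (1000, 21.0, 0.4701, 0.5250, 0.7375),
]

# Checkpoint with the lowest validation loss.
best_loss_row = min(ROWS, key=lambda row: row[3])

# Highest accuracy reached, and the first step at which it was reached.
best_accuracy = max(row[4] for row in ROWS)
first_best_accuracy_step = next(row[0] for row in ROWS if row[4] == best_accuracy)

# Rough estimate of optimizer steps per epoch from the final logged row.
final_step, final_epoch = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = final_step / final_epoch

if __name__ == "__main__":
    print("lowest validation loss:", best_loss_row[3], "at step", best_loss_row[0])
    print("best accuracy:", best_accuracy, "first reached at step", first_best_accuracy_step)
    print("estimated steps per epoch:", round(steps_per_epoch, 2))
```

Read this way, the table shows validation loss flattening out after roughly step 550, with the final checkpoint (step 1000) within 0.0002 of the lowest logged validation loss.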

Framework versions

  • Transformers 4.35.0.dev0
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1