---
license: mit
tags:
  - generated_from_trainer
model-index:
  - name: farsi_lastname_classifier
    results: []
---

# farsi_lastname_classifier

This model is a fine-tuned version of [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0501
- Pearson: 0.9174
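
The snippet below is a minimal, unverified usage sketch. It assumes the checkpoint is published on the Hub as `RawMean/farsi_lastname_classifier` and that the model exposes a single scalar output (the Pearson metric above suggests a regression-style score rather than discrete class labels); neither assumption is confirmed by this card.

```python
# Hedged usage sketch; the repo id and single-logit output are assumptions,
# not confirmed by this card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "RawMean/farsi_lastname_classifier"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Hosseini", return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(score)  # higher score = more likely a Farsi last name (assumed semantics)
```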

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 15
- mixed_precision_training: Native AMP
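
As a rough illustration, these settings map onto `transformers.TrainingArguments` as sketched below. The `output_dir` is a placeholder, `fp16=True` is a reasonable stand-in for "Native AMP", and the Adam betas/epsilon listed above are the library defaults, so they are not set explicitly.

```python
# Approximate reconstruction of the hyperparameters above as TrainingArguments
# (Transformers 4.x); output_dir is a placeholder, not taken from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="farsi_lastname_classifier",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=256,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=15,
    fp16=True,  # "Native AMP" mixed precision
)
```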

### Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| No log        | 1.0   | 12   | 0.3069          | 0.3652  |
| No log        | 2.0   | 24   | 0.2113          | 0.6984  |
| No log        | 3.0   | 36   | 0.1256          | 0.7083  |
| No log        | 4.0   | 48   | 0.1069          | 0.7647  |
| No log        | 5.0   | 60   | 0.0523          | 0.8959  |
| No log        | 6.0   | 72   | 0.0620          | 0.8976  |
| No log        | 7.0   | 84   | 0.0581          | 0.8984  |
| No log        | 8.0   | 96   | 0.0458          | 0.9189  |
| No log        | 9.0   | 108  | 0.0520          | 0.9146  |
| No log        | 10.0  | 120  | 0.0490          | 0.9212  |
| No log        | 11.0  | 132  | 0.0466          | 0.9230  |
| No log        | 12.0  | 144  | 0.0491          | 0.9198  |
| No log        | 13.0  | 156  | 0.0495          | 0.9196  |
| No log        | 14.0  | 168  | 0.0499          | 0.9176  |
| No log        | 15.0  | 180  | 0.0501          | 0.9174  |

### Framework versions

- Transformers 4.24.0
- Pytorch 1.12.1+cu113
- Datasets 2.6.1
- Tokenizers 0.13.2
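
A quick way to check that a local environment matches these pins (a convenience snippet, not part of the original card):

```python
# Print installed versions to compare against the pins listed above.
import datasets
import tokenizers
import torch
import transformers

print("transformers", transformers.__version__)  # expected 4.24.0
print("torch", torch.__version__)                # expected 1.12.1+cu113
print("datasets", datasets.__version__)          # expected 2.6.1
print("tokenizers", tokenizers.__version__)      # expected 0.13.2
```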