
hubert-rinnna-jp-jdrtsp-fw07sp-14

This model is a fine-tuned version of rinna/japanese-hubert-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1685
  • Wer (word error rate): 0.2927
  • Cer (character error rate): 0.1710
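
For a quick sanity check, the checkpoint can be loaded with the `transformers` CTC classes. The snippet below is a minimal inference sketch, not part of this card: it assumes the checkpoint exposes a CTC head with a Wav2Vec2-style processor (common for HuBERT ASR fine-tunes), and the repo id and audio file name are placeholders.

```python
# Minimal inference sketch. Assumptions (not stated in this card): a CTC head
# with a Wav2Vec2-style processor, a placeholder repo id, 16 kHz mono audio.
import torch
import librosa
from transformers import HubertForCTC, Wav2Vec2Processor

repo_id = "your-namespace/hubert-rinnna-jp-jdrtsp-fw07sp-14"  # placeholder

processor = Wav2Vec2Processor.from_pretrained(repo_id)
model = HubertForCTC.from_pretrained(repo_id)
model.eval()

# HuBERT is pretrained on 16 kHz speech, so resample the input to 16 kHz.
speech, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: per-frame argmax, then collapse repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```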

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 25
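
These settings map onto `transformers.TrainingArguments` roughly as shown below. This is a hedged reconstruction, not the exact training script: the output directory and the per-epoch evaluation strategy are assumptions, and the model/dataset wiring is omitted.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hubert-rinnna-jp-jdrtsp-fw07sp-14",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,  # 32 x 2 = effective train batch size 64
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=25,
    evaluation_strategy="epoch",  # assumed from the per-epoch results below
)
```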

Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 4.4197        | 1.0   | 404   | 4.0767          | 0.9928 | 0.9960 |
| 2.8984        | 2.0   | 808   | 2.7950          | 0.9928 | 0.9960 |
| 2.1179        | 3.0   | 1212  | 1.9178          | 0.9928 | 0.9960 |
| 1.4282        | 4.0   | 1616  | 1.0854          | 0.6262 | 0.4791 |
| 1.0793        | 5.0   | 2020  | 0.7672          | 0.4996 | 0.2944 |
| 0.9064        | 6.0   | 2424  | 0.6212          | 0.4573 | 0.2737 |
| 0.8366        | 7.0   | 2828  | 0.5247          | 0.4132 | 0.2450 |
| 0.7425        | 8.0   | 3232  | 0.4502          | 0.3786 | 0.2257 |
| 0.7017        | 9.0   | 3636  | 0.3912          | 0.3509 | 0.2082 |
| 0.6275        | 10.0  | 4040  | 0.3407          | 0.3328 | 0.1979 |
| 0.5853        | 11.0  | 4444  | 0.3045          | 0.3226 | 0.1920 |
| 0.5551        | 12.0  | 4848  | 0.2657          | 0.3139 | 0.1865 |
| 0.5105        | 13.0  | 5252  | 0.2455          | 0.3086 | 0.1827 |
| 0.5073        | 14.0  | 5656  | 0.2389          | 0.3092 | 0.1832 |
| 0.4722        | 15.0  | 6060  | 0.2170          | 0.3030 | 0.1781 |
| 0.481         | 16.0  | 6464  | 0.2089          | 0.3023 | 0.1783 |
| 0.4738        | 17.0  | 6868  | 0.2002          | 0.3004 | 0.1763 |
| 0.4518        | 18.0  | 7272  | 0.1990          | 0.3006 | 0.1765 |
| 0.4402        | 19.0  | 7676  | 0.1900          | 0.2999 | 0.1764 |
| 0.4387        | 20.0  | 8080  | 0.1826          | 0.2970 | 0.1740 |
| 0.4212        | 21.0  | 8484  | 0.1767          | 0.2955 | 0.1733 |
| 0.3893        | 22.0  | 8888  | 0.1707          | 0.2937 | 0.1719 |
| 0.4055        | 23.0  | 9292  | 0.1704          | 0.2943 | 0.1723 |
| 0.394         | 24.0  | 9696  | 0.1684          | 0.2934 | 0.1716 |
| 0.3997        | 25.0  | 10100 | 0.1685          | 0.2927 | 0.1710 |
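
The Wer and Cer columns can be reproduced from decoded transcripts with the `evaluate` library. The sketch below is illustrative only: the actual scoring script for this card is not documented, the sample strings are placeholders, and WER over Japanese assumes the text is already segmented into space-separated tokens.

```python
# Hedged sketch of Wer/Cer computation with the `evaluate` library;
# not necessarily the exact metrics pipeline used for this card.
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")

predictions = ["こんにちは 世界"]  # placeholder decoded transcript
references = ["こんにちは 世界"]   # placeholder reference transcript

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```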

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3