nyansapo_model_v2

This model is a fine-tuned version of Nzyoka19/nyansapo_model_v1 on an unknown dataset. At the last logged evaluation (step 1200), it reaches a validation loss of 0.0302 and a WER of 17.4747 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-06
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 200
num_epochs: 8
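
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and evaluates a linear-warmup, cosine-decay learning-rate schedule. It assumes a single training device and a total of 1288 optimizer steps (8 epochs at roughly 161 steps per epoch, inferred from the Epoch and Step columns of the training log). The function name lr_at_step is hypothetical, and the formula follows the common warmup-then-cosine shape rather than a confirmed copy of the trainer's internals.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 3e-06
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 200

# Assumption: 8 epochs * ~161 optimizer steps per epoch (inferred from the log).
TOTAL_STEPS = 1288

# Assumption: one device, so the effective batch is batch size * accumulation.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def lr_at_step(step: int) -> float:
    """Learning rate after `step` optimizer steps (linear warmup, cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the peak learning rate down to 0.
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for step in (0, 100, 200, 744, 1288):
        print(step, f"{lr_at_step(step):.3e}")
```

Under these assumptions the learning rate peaks at 3e-06 at step 200 and has decayed to about half of that by step 744, the midpoint of the post-warmup phase.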

Training Loss	Epoch	Step	Validation Loss	WER
0.1565	0.6211	100	0.0309	19.6970
0.1518	1.2422	200	0.0328	21.5152
0.1585	1.8634	300	0.0293	19.7980
0.1341	2.4845	400	0.0289	18.0808
0.1516	3.1056	500	0.0302	18.0808
0.1196	3.7267	600	0.0298	18.6869
0.1494	4.3478	700	0.0284	18.1818
0.1337	4.9689	800	0.0302	17.2727
0.139	5.5901	900	0.0297	17.8788
0.1356	6.2112	1000	0.0299	17.7778
0.121	6.8323	1100	0.0302	17.5758
0.091	7.4534	1200	0.0302	17.4747
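
One way to read this log is to recover the number of optimizer steps per epoch and to find the logged evaluations with the lowest WER and the lowest validation loss. The sketch below does this with the rows copied from the table. The names ROWS, steps_per_epoch, best_wer_row, and best_loss_row are hypothetical, and the card does not state whether a checkpoint was saved at each logged step.

```python
# Rows copied from the training log: (train_loss, epoch, step, val_loss, wer).
ROWS = [
    (0.1565, 0.6211, 100, 0.0309, 19.6970),
    (0.1518, 1.2422, 200, 0.0328, 21.5152),
    (0.1585, 1.8634, 300, 0.0293, 19.7980),
    (0.1341, 2.4845, 400, 0.0289, 18.0808),
    (0.1516, 3.1056, 500, 0.0302, 18.0808),
    (0.1196, 3.7267, 600, 0.0298, 18.6869),
    (0.1494, 4.3478, 700, 0.0284, 18.1818),
    (0.1337, 4.9689, 800, 0.0302, 17.2727),
    (0.1390, 5.5901, 900, 0.0297, 17.8788),
    (0.1356, 6.2112, 1000, 0.0299, 17.7778),
    (0.1210, 6.8323, 1100, 0.0302, 17.5758),
    (0.0910, 7.4534, 1200, 0.0302, 17.4747),
]

# Each row's step/epoch ratio estimates the optimizer steps per epoch.
steps_per_epoch = round(
    sum(step / epoch for _, epoch, step, _, _ in ROWS) / len(ROWS)
)

# Logged evaluation with the lowest WER.
best_wer_row = min(ROWS, key=lambda row: row[4])

# Logged evaluation with the lowest validation loss.
best_loss_row = min(ROWS, key=lambda row: row[3])

# Relative WER change between the first and last logged evaluation.
relative_wer_change = (ROWS[-1][4] - ROWS[0][4]) / ROWS[0][4]

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("lowest WER:", best_wer_row[4], "at step", best_wer_row[2])
    print("lowest validation loss:", best_loss_row[3], "at step", best_loss_row[2])
    print(f"relative WER change, first to last: {relative_wer_change:.1%}")
```

Note that the lowest WER and the lowest validation loss fall on different steps in this log, so the choice of "best" evaluation depends on which metric is used for selection.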

Safetensors: model size 0.2B params, tensor type F32
