---
license: other
library_name: peft
tags:
  - generated_from_trainer
base_model: beomi/gemma-ko-2b
model-index:
  - name: gemma2_on_korean_conv-stm
    results: []
---

gemma2_on_korean_conv-stm

This model is a fine-tuned version of beomi/gemma-ko-2b on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1996

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 20
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 2000
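
The fine-tune uses the PEFT library (see Framework versions below), so the Trainer wraps a PEFT-adapted gemma-ko-2b model. As a rough reference, the sketch below shows how the values listed above map onto a transformers `TrainingArguments` object; the dataset, tokenization, and LoRA/PEFT adapter settings are not documented in this card and are therefore omitted, and the `output_dir` name is only assumed from the model id.

```python
# Minimal sketch of a TrainingArguments setup matching the hyperparameters above.
# The dataset and PEFT/LoRA adapter configuration are not documented in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma2_on_korean_conv-stm",  # assumed from the model id
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=10,          # effective train batch size: 2 * 10 = 20
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=2000,
    optim="adamw_torch",                     # betas=(0.9, 0.999), eps=1e-8 are the defaults
    evaluation_strategy="steps",             # the results table shows evaluation every 100 steps
    eval_steps=100,
    logging_steps=100,
)
```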

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.4813        | 0.2563 | 100  | 1.4715          |
| 1.3177        | 0.5126 | 200  | 1.3092          |
| 1.2445        | 0.7688 | 300  | 1.2380          |
| 1.0947        | 1.0251 | 400  | 1.1796          |
| 0.996         | 1.2814 | 500  | 1.1585          |
| 0.9617        | 1.5377 | 600  | 1.1360          |
| 0.9645        | 1.7940 | 700  | 1.1112          |
| 0.7718        | 2.0502 | 800  | 1.1270          |
| 0.7281        | 2.3065 | 900  | 1.1372          |
| 0.7437        | 2.5628 | 1000 | 1.1040          |
| 0.7588        | 2.8191 | 1100 | 1.0921          |
| 0.5759        | 3.0753 | 1200 | 1.1330          |
| 0.5811        | 3.3316 | 1300 | 1.1485          |
| 0.6025        | 3.5879 | 1400 | 1.1298          |
| 0.5766        | 3.8442 | 1500 | 1.1391          |
| 0.4555        | 4.1005 | 1600 | 1.1785          |
| 0.4426        | 4.3567 | 1700 | 1.1874          |
| 0.4461        | 4.6130 | 1800 | 1.1865          |
| 0.4506        | 4.8693 | 1900 | 1.1902          |
| 0.3731        | 5.1256 | 2000 | 1.1996          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
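
For inference, the adapter can be loaded on top of the base model with peft. The sketch below assumes this repository hosts a standard PEFT adapter and that its Hub id is `ghost613/gemma2_on_korean_conv-stm` (inferred from the card header); the prompt is only an illustration.

```python
# Minimal loading/inference sketch, assuming a standard PEFT adapter
# published at "ghost613/gemma2_on_korean_conv-stm" (repo id assumed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "beomi/gemma-ko-2b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("beomi/gemma-ko-2b")

model = PeftModel.from_pretrained(base, "ghost613/gemma2_on_korean_conv-stm")
model.eval()

# Example prompt: "Hello" in Korean.
inputs = tokenizer("안녕하세요", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```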