RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter2_sftsd0

Generated from Trainer


This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.4538
Num Input Tokens Seen: 4832464
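
For intuition, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling with the Hugging Face Trainer but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.4538

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```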

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 0
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
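
As a rough sketch, the hyperparameters above map onto `transformers.TrainingArguments` fields as shown below. The output path and the single-device assumption are not from this card: `num_devices=1` is inferred only because 8 × 16 = 128 matches the reported total_train_batch_size. The `lr_at` helper is a plain-Python illustration of a constant-with-warmup schedule (linear ramp, then flat); the number of warmup steps used in training is not reported here, so it is left as a parameter. The optimizer is not set explicitly, since the card only says "Adam" and does not name the Trainer `optim` value.

```python
# Hyperparameters copied from the list above, keyed by the matching
# transformers.TrainingArguments field names.
TRAINING_KWARGS = {
    "output_dir": "./sft_output",  # hypothetical path, not from the card
    "learning_rate": 8e-6,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 16,
    "seed": 0,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_ratio": 0.05,
    "num_train_epochs": 1,
}


def effective_batch_size(kwargs, num_devices=1):
    """Sequences consumed per optimizer step (num_devices is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def lr_at(step, warmup_steps, base_lr=TRAINING_KWARGS["learning_rate"]):
    """Constant-with-warmup: linear ramp from 0 to base_lr, then flat."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


def build_training_arguments():
    """Build TrainingArguments; imported lazily so the rest runs without transformers."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # matches total_train_batch_size
```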

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.6784	0.0591	5	1.2633	282096
1.3537	0.1183	10	1.1871	571576
1.0696	0.1774	15	1.2164	857160
0.9162	0.2365	20	1.2391	1142344
0.7598	0.2956	25	1.3479	1427536
0.5372	0.3548	30	1.4227	1715736
0.4796	0.4139	35	1.4737	2003760
0.3889	0.4730	40	1.5021	2286384
0.1994	0.5322	45	1.5032	2573248
0.3391	0.5913	50	1.4714	2862104
0.3297	0.6504	55	1.4358	3145472
0.2038	0.7095	60	1.4488	3432144
0.195	0.7687	65	1.4273	3724448
0.1749	0.8278	70	1.4248	4016736
0.1654	0.8869	75	1.4554	4305224
0.1846	0.9460	80	1.4274	4595952
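
The table can be summarized with a few lines of arithmetic. The sketch below finds the checkpoint with the lowest validation loss and estimates the average number of input tokens per optimizer step and per sequence. The per-sequence figure assumes every step consumes total_train_batch_size = 128 sequences, and whether the token counts include padding is not stated in this card. In the logged rows, validation loss is lowest at step 10 and rises afterwards while training loss keeps falling, a pattern consistent with overfitting.

```python
# (step, training_loss, validation_loss, input_tokens_seen) from the table above.
# Step 0 is omitted because it has no logged training loss.
ROWS = [
    (5, 1.6784, 1.2633, 282096),
    (10, 1.3537, 1.1871, 571576),
    (15, 1.0696, 1.2164, 857160),
    (20, 0.9162, 1.2391, 1142344),
    (25, 0.7598, 1.3479, 1427536),
    (30, 0.5372, 1.4227, 1715736),
    (35, 0.4796, 1.4737, 2003760),
    (40, 0.3889, 1.5021, 2286384),
    (45, 0.1994, 1.5032, 2573248),
    (50, 0.3391, 1.4714, 2862104),
    (55, 0.3297, 1.4358, 3145472),
    (60, 0.2038, 1.4488, 3432144),
    (65, 0.195, 1.4273, 3724448),
    (70, 0.1749, 1.4248, 4016736),
    (75, 0.1654, 1.4554, 4305224),
    (80, 0.1846, 1.4274, 4595952),
]
TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list

# Row with the lowest validation loss.
best_step, _, best_val_loss, _ = min(ROWS, key=lambda row: row[2])

# Average tokens per optimizer step, using the last logged row.
last_step, _, _, last_tokens = ROWS[-1]
tokens_per_step = last_tokens / last_step
tokens_per_sequence = tokens_per_step / TOTAL_TRAIN_BATCH_SIZE

print(f"lowest validation loss {best_val_loss} at step {best_step}")
print(f"~{tokens_per_step:.0f} tokens per step, ~{tokens_per_sequence:.0f} per sequence")
```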

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
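
A minimal loading sketch for these framework versions follows. It assumes the repository also ships tokenizer files and that the weights load through `AutoModelForCausalLM`, neither of which this card confirms. The `generate` helper and its defaults are illustrative names, not part of the card.

```python
MODEL_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter2_sftsd0"


def generate(prompt, max_new_tokens=32):
    """Load the checkpoint in bfloat16 and return a greedy continuation of prompt."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository contains tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` downloads the weights on first use, so it needs network access and enough memory for a 2.61B-parameter model.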

Safetensors

Model size: 2.61B params
Tensor type: BF16
