gemma-2b-spanishbillionwords

This model is a fine-tuned version of google/gemma-2b on the Spanish Billion Words corpus, intended to improve the base Gemma model's performance on Spanish-language text. It achieves the following results on the evaluation set:

Loss: 4.3306
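
For language models, a loss figure is often easier to interpret as a perplexity. The sketch below converts the reported value, assuming (this card does not state it) that the loss is the mean per-token cross-entropy in nats, as is typical for causal language model training with Transformers. Under that assumption the perplexity comes out at about 76.

```python
import math

# Final evaluation loss reported on this card.
# Assumption: mean per-token cross-entropy, measured in nats.
eval_loss = 4.3306

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")
```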

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 1
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 2
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1
training_steps: 60
mixed_precision_training: Native AMP
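
The values above are related to each other: the total train batch size is the per-device batch size multiplied by the gradient accumulation steps, and the linear scheduler ramps the learning rate up over the warmup steps and then decays it to zero by the last training step. The following is a minimal sketch of that arithmetic, assuming a single device and the usual linear-with-warmup shape (as in Transformers' `get_linear_schedule_with_warmup`); the helper `linear_lr` is hypothetical and not taken from the training code.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 2
base_lr = 3e-4
warmup_steps = 1
training_steps = 60

# Effective batch size per optimizer step (assumes a single device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, training_steps - step)
    return base_lr * remaining / max(1, training_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 1, 30, 60):
    print(step, f"{linear_lr(step):.2e}")
```

With only one warmup step, the learning rate reaches its peak almost immediately and spends the rest of the 60 steps decaying.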

Training results

Training Loss	Step	Validation Loss
5.1254	1	5.0205
4.3187	2	5.0029
3.8173	3	4.9801
5.3879	4	4.9582
5.718	5	4.9343
5.8628	6	4.9104
4.5401	7	4.8830
4.4219	8	4.8539
5.5169	9	4.8234
4.813	10	4.7878
4.2111	11	4.7576
4.6504	12	4.7314
3.7923	13	4.7116
3.7773	14	4.6890
4.6773	15	4.6616
3.0179	16	4.6329
3.8922	17	4.6099
4.3289	18	4.5940
5.0925	19	4.5822
4.6499	20	4.5711
3.9758	21	4.5585
4.593	22	4.5454
5.2496	23	4.5346
4.2548	24	4.5217
3.5209	25	4.5059
4.4781	26	4.4930
5.4472	27	4.4834
4.1987	28	4.4756
5.2324	29	4.4684
4.8068	30	4.4593
3.5455	31	4.4521
3.6516	32	4.4415
4.1368	33	4.4289
6.4659	34	4.4289
3.434	35	4.4173
3.9518	36	4.4085
3.0758	37	4.4008
3.6492	38	4.3930
4.0352	39	4.3857
5.6527	40	4.3799
4.233	41	4.3747
5.4082	42	4.3702
5.1255	43	4.3661
4.4567	44	4.3622
4.1874	45	4.3587
4.3441	46	4.3555
4.1636	47	4.3524
4.3146	48	4.3495
4.6414	49	4.3473
4.3666	50	4.3451
3.8627	51	4.3427
4.5875	52	4.3406
6.0364	53	4.3387
4.5669	54	4.3369
4.5585	55	4.3353
2.7858	56	4.3340
4.1845	57	4.3329
4.4489	58	4.3319
5.3263	59	4.3311
5.3856	60	4.3306
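
The training loss is noisy, as expected with a total batch size of 2, so the validation column is the more reliable signal. As a small worked example, the overall improvement can be computed from the first and last validation losses in the table:

```python
# First and last validation losses copied from the table above.
val_loss_start = 5.0205  # step 1
val_loss_end = 4.3306    # step 60

absolute_drop = val_loss_start - val_loss_end
relative_drop = absolute_drop / val_loss_start
print(f"absolute drop: {absolute_drop:.4f}, relative drop: {relative_drop:.1%}")
```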

Framework versions

PEFT 0.8.2
Transformers 4.38.0
Pytorch 2.2.1+cu121
Datasets 2.17.0
Tokenizers 0.15.2
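
Because PEFT appears among the framework versions, the repository probably contains a PEFT adapter rather than full model weights. This card does not confirm that, so the loading sketch below is an assumption. The helper names `load_model` and `complete` are hypothetical, and the repository ID `mcamara/gemma-2b-spanishbillionwords` is taken from the model page.

```python
def load_model(adapter_id: str = "mcamara/gemma-2b-spanishbillionwords",
               base_id: str = "google/gemma-2b"):
    """Load the base model and attach the fine-tuned weights (assumed to be a PEFT adapter)."""
    # Imported lazily so that defining these helpers does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model


def complete(prompt: str, max_new_tokens: int = 20) -> str:
    """Generate a continuation of a Spanish prompt with the adapted model."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("La capital de España es")` would download both the base model and the adapter; access to google/gemma-2b may require accepting its license on the Hub first.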

mcamara/gemma-2b-spanishbillionwords
