
gemma-2b-g

This model is a fine-tuned version of google/gemma-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9563
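Assuming this is the mean per-token cross-entropy reported by the Trainer (the usual case for causal language modeling), the loss corresponds to an evaluation perplexity of roughly exp(0.9563) ≈ 2.60. A quick check:

```python
import math

eval_loss = 0.9563                 # reported evaluation loss
perplexity = math.exp(eval_loss)   # ≈ 2.60, assuming mean token-level cross-entropy
print(f"perplexity ≈ {perplexity:.2f}")
```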

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 2.5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 500
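
As an illustration only, these values map onto a Hugging Face TrainingArguments object roughly as shown below; the output directory, the evaluation cadence (inferred from the every-2-steps entries in the results table), and the surrounding PEFT/Trainer script are assumptions, not part of this card.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; "gemma-2b-g" as output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gemma-2b-g",
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,   # train_batch_size
    per_device_eval_batch_size=8,    # eval_batch_size
    gradient_accumulation_steps=2,   # effective total_train_batch_size = 16
    max_steps=500,                   # training_steps
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",     # assumption: eval every 2 steps, per the results table
    eval_steps=2,
)
```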

Training results

Training Loss Epoch Step Validation Loss
No log 0.016 2 0.9410
No log 0.032 4 0.9443
No log 0.048 6 0.9413
No log 0.064 8 0.9398
No log 0.08 10 0.9401
No log 0.096 12 0.9406
No log 0.112 14 0.9404
No log 0.128 16 0.9409
No log 0.144 18 0.9412
No log 0.16 20 0.9412
No log 0.176 22 0.9411
No log 0.192 24 0.9408
No log 0.208 26 0.9412
No log 0.224 28 0.9411
No log 0.24 30 0.9408
No log 0.256 32 0.9406
No log 0.272 34 0.9404
No log 0.288 36 0.9406
No log 0.304 38 0.9409
No log 0.32 40 0.9414
No log 0.336 42 0.9419
No log 0.352 44 0.9425
No log 0.368 46 0.9425
No log 0.384 48 0.9416
No log 0.4 50 0.9408
No log 0.416 52 0.9403
No log 0.432 54 0.9398
No log 0.448 56 0.9393
No log 0.464 58 0.9385
No log 0.48 60 0.9390
No log 0.496 62 0.9394
No log 0.512 64 0.9392
No log 0.528 66 0.9386
No log 0.544 68 0.9385
No log 0.56 70 0.9380
No log 0.576 72 0.9373
No log 0.592 74 0.9369
No log 0.608 76 0.9367
No log 0.624 78 0.9369
No log 0.64 80 0.9370
No log 0.656 82 0.9371
No log 0.672 84 0.9366
No log 0.688 86 0.9361
No log 0.704 88 0.9361
No log 0.72 90 0.9354
No log 0.736 92 0.9352
No log 0.752 94 0.9354
No log 0.768 96 0.9352
No log 0.784 98 0.9350
No log 0.8 100 0.9349
No log 0.816 102 0.9353
No log 0.832 104 0.9349
No log 0.848 106 0.9346
No log 0.864 108 0.9341
No log 0.88 110 0.9335
No log 0.896 112 0.9327
No log 0.912 114 0.9321
No log 0.928 116 0.9323
No log 0.944 118 0.9327
No log 0.96 120 0.9325
No log 0.976 122 0.9318
No log 0.992 124 0.9316
No log 1.008 126 0.9321
No log 1.024 128 0.9332
No log 1.04 130 0.9351
No log 1.056 132 0.9370
No log 1.072 134 0.9383
No log 1.088 136 0.9390
No log 1.104 138 0.9386
No log 1.12 140 0.9378
No log 1.1360 142 0.9375
No log 1.152 144 0.9380
No log 1.168 146 0.9380
No log 1.184 148 0.9376
No log 1.2 150 0.9381
No log 1.216 152 0.9390
No log 1.232 154 0.9400
No log 1.248 156 0.9410
No log 1.264 158 0.9411
No log 1.28 160 0.9405
No log 1.296 162 0.9402
No log 1.312 164 0.9400
No log 1.328 166 0.9399
No log 1.3440 168 0.9397
No log 1.3600 170 0.9398
No log 1.376 172 0.9403
No log 1.392 174 0.9412
No log 1.408 176 0.9424
No log 1.424 178 0.9432
No log 1.44 180 0.9417
No log 1.456 182 0.9403
No log 1.472 184 0.9397
No log 1.488 186 0.9393
No log 1.504 188 0.9391
No log 1.52 190 0.9385
No log 1.536 192 0.9385
No log 1.552 194 0.9387
No log 1.568 196 0.9393
No log 1.584 198 0.9402
No log 1.6 200 0.9410
No log 1.616 202 0.9410
No log 1.6320 204 0.9417
No log 1.6480 206 0.9414
No log 1.6640 208 0.9410
No log 1.6800 210 0.9402
No log 1.696 212 0.9400
No log 1.712 214 0.9398
No log 1.728 216 0.9397
No log 1.744 218 0.9395
No log 1.76 220 0.9398
No log 1.776 222 0.9400
No log 1.792 224 0.9403
No log 1.808 226 0.9403
No log 1.8240 228 0.9399
No log 1.8400 230 0.9392
No log 1.8560 232 0.9385
No log 1.8720 234 0.9385
No log 1.888 236 0.9390
No log 1.904 238 0.9394
No log 1.92 240 0.9395
No log 1.936 242 0.9392
No log 1.952 244 0.9391
No log 1.968 246 0.9390
No log 1.984 248 0.9386
No log 2.0 250 0.9380
No log 2.016 252 0.9381
No log 2.032 254 0.9401
No log 2.048 256 0.9431
No log 2.064 258 0.9469
No log 2.08 260 0.9507
No log 2.096 262 0.9529
No log 2.112 264 0.9524
No log 2.128 266 0.9501
No log 2.144 268 0.9478
No log 2.16 270 0.9466
No log 2.176 272 0.9463
No log 2.192 274 0.9458
No log 2.208 276 0.9454
No log 2.224 278 0.9451
No log 2.24 280 0.9456
No log 2.2560 282 0.9468
No log 2.2720 284 0.9477
No log 2.288 286 0.9484
No log 2.304 288 0.9486
No log 2.32 290 0.9479
No log 2.336 292 0.9473
No log 2.352 294 0.9473
No log 2.368 296 0.9473
No log 2.384 298 0.9475
No log 2.4 300 0.9479
No log 2.416 302 0.9490
No log 2.432 304 0.9499
No log 2.448 306 0.9501
No log 2.464 308 0.9498
No log 2.48 310 0.9491
No log 2.496 312 0.9489
No log 2.512 314 0.9490
No log 2.528 316 0.9487
No log 2.544 318 0.9483
No log 2.56 320 0.9483
No log 2.576 322 0.9483
No log 2.592 324 0.9485
No log 2.608 326 0.9487
No log 2.624 328 0.9492
No log 2.64 330 0.9493
No log 2.656 332 0.9488
No log 2.672 334 0.9487
No log 2.6880 336 0.9486
No log 2.7040 338 0.9485
No log 2.7200 340 0.9481
No log 2.7360 342 0.9477
No log 2.752 344 0.9478
No log 2.768 346 0.9482
No log 2.784 348 0.9487
No log 2.8 350 0.9483
No log 2.816 352 0.9481
No log 2.832 354 0.9480
No log 2.848 356 0.9480
No log 2.864 358 0.9479
No log 2.88 360 0.9481
No log 2.896 362 0.9484
No log 2.912 364 0.9488
No log 2.928 366 0.9490
No log 2.944 368 0.9489
No log 2.96 370 0.9487
No log 2.976 372 0.9484
No log 2.992 374 0.9476
No log 3.008 376 0.9468
No log 3.024 378 0.9471
No log 3.04 380 0.9481
No log 3.056 382 0.9499
No log 3.072 384 0.9521
No log 3.088 386 0.9543
No log 3.104 388 0.9562
No log 3.12 390 0.9572
No log 3.136 392 0.9577
No log 3.152 394 0.9577
No log 3.168 396 0.9577
No log 3.184 398 0.9574
No log 3.2 400 0.9570
No log 3.216 402 0.9569
No log 3.232 404 0.9567
No log 3.248 406 0.9565
No log 3.2640 408 0.9564
No log 3.2800 410 0.9562
No log 3.296 412 0.9561
No log 3.312 414 0.9561
No log 3.328 416 0.9562
No log 3.344 418 0.9565
No log 3.36 420 0.9568
No log 3.376 422 0.9570
No log 3.392 424 0.9572
No log 3.408 426 0.9573
No log 3.424 428 0.9572
No log 3.44 430 0.9569
No log 3.456 432 0.9570
No log 3.472 434 0.9572
No log 3.488 436 0.9574
No log 3.504 438 0.9575
No log 3.52 440 0.9577
No log 3.536 442 0.9577
No log 3.552 444 0.9578
No log 3.568 446 0.9579
No log 3.584 448 0.9577
No log 3.6 450 0.9575
No log 3.616 452 0.9575
No log 3.632 454 0.9575
No log 3.648 456 0.9576
No log 3.664 458 0.9576
No log 3.68 460 0.9574
No log 3.6960 462 0.9573
No log 3.7120 464 0.9571
No log 3.7280 466 0.9569
No log 3.7440 468 0.9567
No log 3.76 470 0.9565
No log 3.776 472 0.9563
No log 3.792 474 0.9563
No log 3.808 476 0.9563
No log 3.824 478 0.9564
No log 3.84 480 0.9565
No log 3.856 482 0.9565
No log 3.872 484 0.9566
No log 3.888 486 0.9566
No log 3.904 488 0.9565
No log 3.92 490 0.9565
No log 3.936 492 0.9565
No log 3.952 494 0.9564
No log 3.968 496 0.9564
No log 3.984 498 0.9564
0.814 4.0 500 0.9563

Framework versions

  • PEFT 0.10.1.dev0
  • Transformers 4.40.0.dev0
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.19.1
Model tree for himanshue2e/gemma-2b-g

  • Base model: google/gemma-2b
  • This model: a PEFT adapter fine-tuned from the base model
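
Since this repository holds a PEFT adapter on top of google/gemma-2b, one way to load it is the sketch below. It assumes you have accepted the Gemma license on the Hub and are logged in with an access token; the dtype, device placement, prompt, and generation settings are placeholders, not part of this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "google/gemma-2b"
adapter_id = "himanshue2e/gemma-2b-g"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,   # assumption: bf16 inference; use float32 on CPU
    device_map="auto",            # requires the `accelerate` package
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Explain what a parameter-efficient adapter is."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```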