metadata
license: apache-2.0
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: unsloth/tinyllama-bnb-4bit
model-index:
  - name: tinyllama-rephraser-lora
    results: []

tinyllama-rephraser-lora

This model is a fine-tuned version of unsloth/tinyllama-bnb-4bit. The training dataset is not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:

  • Loss: 0.6524
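
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is a minimal sketch that assumes the loss is the mean token-level cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.6524

# Assumption: the loss is mean token-level cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss:.4f} -> perplexity {perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.92.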

Model description

More information needed
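
Until a fuller description is available, the card metadata (library_name: peft, base_model: unsloth/tinyllama-bnb-4bit) and the model name suggest that this repository holds a LoRA adapter rather than full model weights. The sketch below shows one plausible way to load such an adapter. `ADAPTER_ID` is a placeholder, not the real repository name, and `load_adapter` is a hypothetical helper. The 4-bit base checkpoint would typically also need `bitsandbytes`, `accelerate`, and a CUDA-capable GPU.

```python
BASE_MODEL_ID = "unsloth/tinyllama-bnb-4bit"  # from the card's base_model field
# Placeholder: replace with the actual Hub repository or local path of this adapter.
ADAPTER_ID = "your-username/tinyllama-rephraser-lora"


def load_adapter(base_model_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # device_map="auto" relies on the accelerate package being installed.
    base = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_adapter()` returns a model and tokenizer that can be passed to the usual `generate` workflow. The prompt format used during fine-tuning is not documented here, so any prompt template would be a guess.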

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
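
The derived quantities in this list can be checked with a short calculation. The sketch below recomputes the effective batch size and approximates the learning-rate schedule as linear warmup followed by cosine decay to zero. It assumes a single device and 605 optimizer steps in total (the last step shown in the training results table). `lr_at` is an illustrative function, not code taken from the training run.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 2e-05
train_batch_size = 2
gradient_accumulation_steps = 4
warmup_ratio = 0.1
num_epochs = 5

# Assumptions: one device, and 605 optimizer steps in total
# (the last step in the training results table).
num_devices = 1
total_steps = 605

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = total_steps // num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", total_train_batch_size)
print("steps per epoch:", steps_per_epoch)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```

With these assumptions the effective batch size is 8, matching total_train_batch_size, and the learning rate peaks at 2e-05 after 61 warmup steps.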

Training results

Training Loss Epoch Step Validation Loss
2.3576 0.01 1 2.4199
2.4431 0.02 2 2.4179
2.5987 0.02 3 2.4198
2.3902 0.03 4 2.4175
2.4699 0.04 5 2.4176
2.408 0.05 6 2.4171
2.4169 0.06 7 2.4192
2.4351 0.07 8 2.4169
2.3709 0.07 9 2.4181
2.4303 0.08 10 2.4174
2.3925 0.09 11 2.4162
2.4358 0.1 12 2.4150
2.5119 0.11 13 2.4158
2.4336 0.12 14 2.4139
2.3145 0.12 15 2.4137
2.3547 0.13 16 2.4139
2.4008 0.14 17 2.4124
2.3856 0.15 18 2.4107
2.4702 0.16 19 2.4101
2.4401 0.16 20 2.4101
2.4957 0.17 21 2.4074
2.4915 0.18 22 2.4044
2.3463 0.19 23 2.4051
2.2404 0.2 24 2.4044
2.469 0.21 25 2.4023
2.4707 0.21 26 2.4002
2.4167 0.22 27 2.4003
2.3213 0.23 28 2.3969
2.5036 0.24 29 2.3942
2.3594 0.25 30 2.3920
2.3971 0.26 31 2.3910
2.3863 0.26 32 2.3880
2.4845 0.27 33 2.3836
2.3076 0.28 34 2.3844
2.4448 0.29 35 2.3810
2.3576 0.3 36 2.3789
2.3279 0.3 37 2.3744
2.4572 0.31 38 2.3744
2.4527 0.32 39 2.3702
2.5135 0.33 40 2.3645
2.3704 0.34 41 2.3638
2.4071 0.35 42 2.3586
2.3059 0.35 43 2.3567
2.4828 0.36 44 2.3525
2.3812 0.37 45 2.3474
2.3066 0.38 46 2.3432
2.3644 0.39 47 2.3396
2.3855 0.4 48 2.3357
2.3533 0.4 49 2.3299
2.3486 0.41 50 2.3252
2.3527 0.42 51 2.3194
2.3593 0.43 52 2.3150
2.3743 0.44 53 2.3117
2.5021 0.44 54 2.3026
2.3785 0.45 55 2.2974
2.17 0.46 56 2.2931
2.2599 0.47 57 2.2851
2.2211 0.48 58 2.2794
2.2464 0.49 59 2.2716
2.2058 0.49 60 2.2622
2.3164 0.5 61 2.2560
2.3093 0.51 62 2.2445
2.2846 0.52 63 2.2353
2.1138 0.53 64 2.2271
2.3286 0.53 65 2.2170
2.1988 0.54 66 2.2077
2.2598 0.55 67 2.1968
2.2303 0.56 68 2.1880
2.2419 0.57 69 2.1790
2.2628 0.58 70 2.1689
2.1922 0.58 71 2.1573
2.2353 0.59 72 2.1498
2.2111 0.6 73 2.1376
2.1128 0.61 74 2.1271
2.2167 0.62 75 2.1184
2.1878 0.63 76 2.1085
2.1391 0.63 77 2.0950
2.1336 0.64 78 2.0818
2.1266 0.65 79 2.0730
1.9823 0.66 80 2.0634
2.1003 0.67 81 2.0490
2.0681 0.67 82 2.0353
2.1475 0.68 83 2.0218
1.996 0.69 84 2.0082
1.9981 0.7 85 1.9951
2.0693 0.71 86 1.9823
1.9524 0.72 87 1.9719
2.058 0.72 88 1.9587
1.9211 0.73 89 1.9455
1.9496 0.74 90 1.9311
1.9495 0.75 91 1.9200
2.0249 0.76 92 1.9062
1.9044 0.77 93 1.8940
1.9373 0.77 94 1.8817
1.8604 0.78 95 1.8675
1.8957 0.79 96 1.8584
1.8673 0.8 97 1.8471
1.9002 0.81 98 1.8337
1.9025 0.81 99 1.8226
1.8356 0.82 100 1.8099
1.7723 0.83 101 1.7994
1.7628 0.84 102 1.7902
1.7014 0.85 103 1.7788
1.7383 0.86 104 1.7678
1.7647 0.86 105 1.7581
1.7835 0.87 106 1.7466
1.7645 0.88 107 1.7367
1.7654 0.89 108 1.7267
1.8344 0.9 109 1.7173
1.6528 0.91 110 1.7068
1.676 0.91 111 1.6989
1.5894 0.92 112 1.6882
1.6154 0.93 113 1.6778
1.609 0.94 114 1.6697
1.6803 0.95 115 1.6592
1.6487 0.95 116 1.6484
1.6905 0.96 117 1.6403
1.6258 0.97 118 1.6300
1.5744 0.98 119 1.6189
1.4791 0.99 120 1.6088
1.6177 1.0 121 1.5975
1.582 1.0 122 1.5860
1.5378 1.01 123 1.5760
1.5691 1.02 124 1.5658
1.5387 1.03 125 1.5540
1.5527 1.04 126 1.5429
1.4642 1.05 127 1.5316
1.4867 1.05 128 1.5199
1.5204 1.06 129 1.5075
1.5997 1.07 130 1.4962
1.4419 1.08 131 1.4833
1.5799 1.09 132 1.4718
1.5103 1.09 133 1.4584
1.5444 1.1 134 1.4472
1.4835 1.11 135 1.4351
1.4326 1.12 136 1.4213
1.4079 1.13 137 1.4088
1.5206 1.14 138 1.3971
1.3868 1.14 139 1.3822
1.4778 1.15 140 1.3702
1.4627 1.16 141 1.3558
1.3555 1.17 142 1.3444
1.3143 1.18 143 1.3323
1.3754 1.19 144 1.3192
1.2488 1.19 145 1.3082
1.2821 1.2 146 1.2969
1.2804 1.21 147 1.2856
1.233 1.22 148 1.2747
1.3502 1.23 149 1.2633
1.2224 1.23 150 1.2536
1.199 1.24 151 1.2419
1.1749 1.25 152 1.2321
1.305 1.26 153 1.2220
1.1391 1.27 154 1.2100
1.3063 1.28 155 1.1990
1.2402 1.28 156 1.1878
1.1104 1.29 157 1.1772
1.24 1.3 158 1.1670
1.0549 1.31 159 1.1555
1.1417 1.32 160 1.1452
1.0898 1.33 161 1.1351
1.1035 1.33 162 1.1259
1.1088 1.34 163 1.1158
1.086 1.35 164 1.1069
1.15 1.36 165 1.0975
1.0394 1.37 166 1.0888
1.1268 1.37 167 1.0806
1.0803 1.38 168 1.0710
1.0198 1.39 169 1.0624
1.0765 1.4 170 1.0534
1.0318 1.41 171 1.0447
1.0098 1.42 172 1.0369
1.0013 1.42 173 1.0284
0.9773 1.43 174 1.0210
1.0233 1.44 175 1.0130
0.985 1.45 176 1.0053
0.9806 1.46 177 0.9983
1.0393 1.47 178 0.9906
0.9191 1.47 179 0.9844
0.9454 1.48 180 0.9781
0.9354 1.49 181 0.9710
0.9598 1.5 182 0.9658
1.0652 1.51 183 0.9584
0.9002 1.51 184 0.9538
0.9477 1.52 185 0.9472
0.9203 1.53 186 0.9414
0.8837 1.54 187 0.9361
0.91 1.55 188 0.9313
0.8616 1.56 189 0.9258
0.9201 1.56 190 0.9205
0.9408 1.57 191 0.9147
0.9274 1.58 192 0.9093
1.0009 1.59 193 0.9064
0.9202 1.6 194 0.9009
0.9886 1.6 195 0.8959
0.9289 1.61 196 0.8913
0.9603 1.62 197 0.8875
0.9138 1.63 198 0.8837
0.8794 1.64 199 0.8787
0.8315 1.65 200 0.8750
0.8745 1.65 201 0.8705
1.013 1.66 202 0.8673
0.8565 1.67 203 0.8634
0.9121 1.68 204 0.8596
0.7825 1.69 205 0.8558
0.9171 1.7 206 0.8524
0.7595 1.7 207 0.8488
0.8611 1.71 208 0.8453
0.7212 1.72 209 0.8421
0.8745 1.73 210 0.8389
0.93 1.74 211 0.8354
0.9183 1.74 212 0.8321
0.8482 1.75 213 0.8293
0.8155 1.76 214 0.8256
0.9113 1.77 215 0.8224
0.8009 1.78 216 0.8190
0.6555 1.79 217 0.8165
0.7727 1.79 218 0.8133
0.7987 1.8 219 0.8105
0.7794 1.81 220 0.8074
0.8248 1.82 221 0.8043
0.7818 1.83 222 0.8020
0.741 1.84 223 0.7995
0.6907 1.84 224 0.7969
0.789 1.85 225 0.7938
0.7101 1.86 226 0.7910
0.7178 1.87 227 0.7887
0.7109 1.88 228 0.7865
0.6699 1.88 229 0.7838
0.8443 1.89 230 0.7814
0.7397 1.9 231 0.7789
0.7888 1.91 232 0.7760
0.7725 1.92 233 0.7735
0.7797 1.93 234 0.7707
0.7988 1.93 235 0.7678
0.7548 1.94 236 0.7660
0.904 1.95 237 0.7631
0.8183 1.96 238 0.7616
0.8292 1.97 239 0.7582
0.7144 1.98 240 0.7561
0.753 1.98 241 0.7538
0.7629 1.99 242 0.7525
0.8713 2.0 243 0.7497
0.7355 2.01 244 0.7477
0.6998 2.02 245 0.7459
0.7567 2.02 246 0.7438
0.6594 2.03 247 0.7420
0.7124 2.04 248 0.7405
0.9188 2.05 249 0.7380
0.7406 2.06 250 0.7364
0.7091 2.07 251 0.7341
0.8144 2.07 252 0.7319
0.7122 2.08 253 0.7307
0.7504 2.09 254 0.7291
0.7409 2.1 255 0.7276
0.7844 2.11 256 0.7258
0.8328 2.12 257 0.7234
0.7149 2.12 258 0.7221
0.7063 2.13 259 0.7205
0.6629 2.14 260 0.7195
0.5896 2.15 261 0.7177
0.734 2.16 262 0.7165
0.7293 2.16 263 0.7157
0.6819 2.17 264 0.7142
0.6928 2.18 265 0.7133
0.6026 2.19 266 0.7119
0.6704 2.2 267 0.7114
0.7118 2.21 268 0.7099
0.8447 2.21 269 0.7084
0.6857 2.22 270 0.7075
0.7257 2.23 271 0.7066
0.6884 2.24 272 0.7058
0.5883 2.25 273 0.7047
0.6798 2.26 274 0.7036
0.6575 2.26 275 0.7024
0.627 2.27 276 0.7017
0.7029 2.28 277 0.7016
0.7248 2.29 278 0.7009
0.6947 2.3 279 0.6996
0.708 2.3 280 0.6991
0.5384 2.31 281 0.6981
0.5539 2.32 282 0.6975
0.6751 2.33 283 0.6962
0.5809 2.34 284 0.6957
0.7105 2.35 285 0.6952
0.735 2.35 286 0.6945
0.7564 2.36 287 0.6936
0.732 2.37 288 0.6925
0.6892 2.38 289 0.6919
0.6454 2.39 290 0.6910
0.6919 2.4 291 0.6901
0.6842 2.4 292 0.6893
0.6044 2.41 293 0.6889
0.5893 2.42 294 0.6885
0.7235 2.43 295 0.6875
0.7216 2.44 296 0.6873
0.7677 2.44 297 0.6865
0.5953 2.45 298 0.6862
0.8029 2.46 299 0.6853
0.6425 2.47 300 0.6846
0.5764 2.48 301 0.6846
0.7721 2.49 302 0.6831
0.7315 2.49 303 0.6831
0.6483 2.5 304 0.6829
0.8087 2.51 305 0.6825
0.6676 2.52 306 0.6816
0.6153 2.53 307 0.6813
0.6388 2.53 308 0.6812
0.6322 2.54 309 0.6803
0.5539 2.55 310 0.6803
0.6124 2.56 311 0.6796
0.6905 2.57 312 0.6791
0.6522 2.58 313 0.6782
0.5722 2.58 314 0.6784
0.6271 2.59 315 0.6776
0.6927 2.6 316 0.6783
0.733 2.61 317 0.6768
0.6622 2.62 318 0.6765
0.7042 2.63 319 0.6765
0.8197 2.63 320 0.6763
0.8398 2.64 321 0.6758
0.6703 2.65 322 0.6756
0.6722 2.66 323 0.6750
0.7457 2.67 324 0.6748
0.6385 2.67 325 0.6746
0.557 2.68 326 0.6743
0.6835 2.69 327 0.6739
0.6078 2.7 328 0.6735
0.8021 2.71 329 0.6733
0.5652 2.72 330 0.6732
0.7898 2.72 331 0.6723
0.5717 2.73 332 0.6720
0.6912 2.74 333 0.6718
0.641 2.75 334 0.6717
0.6551 2.76 335 0.6714
0.7743 2.77 336 0.6706
0.631 2.77 337 0.6710
0.6843 2.78 338 0.6703
0.6913 2.79 339 0.6701
0.6482 2.8 340 0.6697
0.6251 2.81 341 0.6696
0.6712 2.81 342 0.6694
0.6543 2.82 343 0.6693
0.7393 2.83 344 0.6687
0.7283 2.84 345 0.6686
0.673 2.85 346 0.6686
0.6263 2.86 347 0.6680
0.6574 2.86 348 0.6678
0.7178 2.87 349 0.6677
0.6941 2.88 350 0.6673
0.5781 2.89 351 0.6675
0.6024 2.9 352 0.6671
0.6324 2.91 353 0.6667
0.7445 2.91 354 0.6663
0.5899 2.92 355 0.6664
0.7318 2.93 356 0.6659
0.7341 2.94 357 0.6656
0.7439 2.95 358 0.6656
0.7061 2.95 359 0.6652
0.7121 2.96 360 0.6649
0.6754 2.97 361 0.6649
0.7367 2.98 362 0.6646
0.7033 2.99 363 0.6646
0.6652 3.0 364 0.6640
0.707 3.0 365 0.6639
0.5992 3.01 366 0.6636
0.6483 3.02 367 0.6633
0.8483 3.03 368 0.6623
0.7052 3.04 369 0.6628
0.7748 3.05 370 0.6624
0.7242 3.05 371 0.6621
0.7835 3.06 372 0.6621
0.6273 3.07 373 0.6621
0.6937 3.08 374 0.6617
0.7308 3.09 375 0.6615
0.6431 3.09 376 0.6613
0.6486 3.1 377 0.6612
0.6671 3.11 378 0.6613
0.6046 3.12 379 0.6605
0.5741 3.13 380 0.6605
0.6746 3.14 381 0.6606
0.6525 3.14 382 0.6604
0.6483 3.15 383 0.6602
0.6631 3.16 384 0.6602
0.5769 3.17 385 0.6603
0.6648 3.18 386 0.6596
0.6933 3.19 387 0.6592
0.6597 3.19 388 0.6596
0.5871 3.2 389 0.6596
0.5976 3.21 390 0.6593
0.6025 3.22 391 0.6591
0.7157 3.23 392 0.6588
0.6419 3.23 393 0.6587
0.5579 3.24 394 0.6589
0.7142 3.25 395 0.6588
0.5773 3.26 396 0.6581
0.5624 3.27 397 0.6583
0.6029 3.28 398 0.6579
0.6642 3.28 399 0.6582
0.7 3.29 400 0.6579
0.7918 3.3 401 0.6579
0.563 3.31 402 0.6577
0.7208 3.32 403 0.6575
0.6769 3.33 404 0.6570
0.7093 3.33 405 0.6571
0.5287 3.34 406 0.6570
0.5828 3.35 407 0.6572
0.5703 3.36 408 0.6566
0.6647 3.37 409 0.6566
0.6879 3.37 410 0.6568
0.7325 3.38 411 0.6566
0.6021 3.39 412 0.6565
0.6777 3.4 413 0.6565
0.6057 3.41 414 0.6560
0.5996 3.42 415 0.6558
0.6841 3.42 416 0.6556
0.6096 3.43 417 0.6557
0.6245 3.44 418 0.6559
0.664 3.45 419 0.6556
0.7183 3.46 420 0.6561
0.6449 3.47 421 0.6558
0.6497 3.47 422 0.6557
0.8151 3.48 423 0.6554
0.813 3.49 424 0.6552
0.6278 3.5 425 0.6553
0.6376 3.51 426 0.6556
0.697 3.51 427 0.6554
0.628 3.52 428 0.6550
0.7049 3.53 429 0.6553
0.6641 3.54 430 0.6549
0.6465 3.55 431 0.6552
0.7366 3.56 432 0.6550
0.6325 3.56 433 0.6545
0.5621 3.57 434 0.6550
0.5846 3.58 435 0.6553
0.6516 3.59 436 0.6551
0.7258 3.6 437 0.6546
0.6027 3.6 438 0.6547
0.5344 3.61 439 0.6549
0.6988 3.62 440 0.6546
0.6863 3.63 441 0.6548
0.627 3.64 442 0.6544
0.6353 3.65 443 0.6548
0.5361 3.65 444 0.6541
0.6774 3.66 445 0.6548
0.668 3.67 446 0.6546
0.544 3.68 447 0.6545
0.5683 3.69 448 0.6546
0.6955 3.7 449 0.6543
0.6316 3.7 450 0.6543
0.647 3.71 451 0.6544
0.6797 3.72 452 0.6541
0.6566 3.73 453 0.6541
0.6585 3.74 454 0.6544
0.6632 3.74 455 0.6541
0.5798 3.75 456 0.6540
0.6417 3.76 457 0.6540
0.706 3.77 458 0.6538
0.6709 3.78 459 0.6542
0.7047 3.79 460 0.6536
0.5466 3.79 461 0.6538
0.5479 3.8 462 0.6540
0.6476 3.81 463 0.6535
0.6584 3.82 464 0.6534
0.6515 3.83 465 0.6540
0.5812 3.84 466 0.6535
0.6339 3.84 467 0.6537
0.6521 3.85 468 0.6537
0.6451 3.86 469 0.6538
0.6655 3.87 470 0.6532
0.7017 3.88 471 0.6533
0.5794 3.88 472 0.6530
0.6485 3.89 473 0.6536
0.6723 3.9 474 0.6533
0.72 3.91 475 0.6534
0.6114 3.92 476 0.6535
0.596 3.93 477 0.6536
0.5961 3.93 478 0.6538
0.6629 3.94 479 0.6531
0.6682 3.95 480 0.6534
0.7007 3.96 481 0.6534
0.6594 3.97 482 0.6535
0.7607 3.98 483 0.6531
0.5735 3.98 484 0.6532
0.7111 3.99 485 0.6531
0.6498 4.0 486 0.6533
0.624 4.01 487 0.6529
0.7284 4.02 488 0.6535
0.5665 4.02 489 0.6531
0.6473 4.03 490 0.6534
0.614 4.04 491 0.6534
0.6663 4.05 492 0.6528
0.6309 4.06 493 0.6527
0.6926 4.07 494 0.6530
0.6112 4.07 495 0.6531
0.6879 4.08 496 0.6526
0.6939 4.09 497 0.6529
0.7551 4.1 498 0.6530
0.6085 4.11 499 0.6530
0.6741 4.12 500 0.6533
0.5913 4.12 501 0.6529
0.6337 4.13 502 0.6529
0.6061 4.14 503 0.6527
0.6511 4.15 504 0.6529
0.6358 4.16 505 0.6531
0.6537 4.16 506 0.6527
0.5757 4.17 507 0.6532
0.6143 4.18 508 0.6529
0.5723 4.19 509 0.6530
0.5647 4.2 510 0.6528
0.5878 4.21 511 0.6531
0.6119 4.21 512 0.6527
0.743 4.22 513 0.6530
0.6942 4.23 514 0.6528
0.5967 4.24 515 0.6527
0.6869 4.25 516 0.6530
0.62 4.26 517 0.6529
0.7596 4.26 518 0.6530
0.6483 4.27 519 0.6530
0.6449 4.28 520 0.6527
0.56 4.29 521 0.6529
0.673 4.3 522 0.6527
0.5469 4.3 523 0.6527
0.6084 4.31 524 0.6528
0.5118 4.32 525 0.6527
0.7318 4.33 526 0.6529
0.7787 4.34 527 0.6525
0.7177 4.35 528 0.6532
0.6294 4.35 529 0.6531
0.6758 4.36 530 0.6527
0.6679 4.37 531 0.6526
0.5373 4.38 532 0.6525
0.6655 4.39 533 0.6529
0.6738 4.4 534 0.6527
0.6849 4.4 535 0.6528
0.5894 4.41 536 0.6530
0.7516 4.42 537 0.6533
0.7417 4.43 538 0.6530
0.6239 4.44 539 0.6529
0.6543 4.44 540 0.6528
0.6201 4.45 541 0.6529
0.6552 4.46 542 0.6528
0.5647 4.47 543 0.6529
0.6798 4.48 544 0.6530
0.6152 4.49 545 0.6528
0.7099 4.49 546 0.6531
0.7073 4.5 547 0.6528
0.76 4.51 548 0.6531
0.7266 4.52 549 0.6526
0.7659 4.53 550 0.6527
0.7033 4.53 551 0.6532
0.6679 4.54 552 0.6534
0.5671 4.55 553 0.6533
0.6845 4.56 554 0.6527
0.655 4.57 555 0.6524
0.7154 4.58 556 0.6526
0.7778 4.58 557 0.6524
0.6404 4.59 558 0.6524
0.6133 4.6 559 0.6526
0.6241 4.61 560 0.6525
0.6255 4.62 561 0.6527
0.5877 4.63 562 0.6526
0.7624 4.63 563 0.6526
0.613 4.64 564 0.6522
0.6014 4.65 565 0.6524
0.6217 4.66 566 0.6525
0.5651 4.67 567 0.6525
0.7227 4.67 568 0.6526
0.6247 4.68 569 0.6525
0.6886 4.69 570 0.6524
0.6894 4.7 571 0.6524
0.6543 4.71 572 0.6525
0.5932 4.72 573 0.6522
0.6069 4.72 574 0.6523
0.614 4.73 575 0.6525
0.5748 4.74 576 0.6526
0.5907 4.75 577 0.6523
0.6707 4.76 578 0.6526
0.642 4.77 579 0.6525
0.6228 4.77 580 0.6522
0.6178 4.78 581 0.6525
0.5958 4.79 582 0.6528
0.6532 4.8 583 0.6527
0.5752 4.81 584 0.6526
0.7058 4.81 585 0.6525
0.642 4.82 586 0.6526
0.6599 4.83 587 0.6525
0.7673 4.84 588 0.6526
0.6626 4.85 589 0.6525
0.5326 4.86 590 0.6525
0.6512 4.86 591 0.6524
0.5914 4.87 592 0.6524
0.6415 4.88 593 0.6523
0.7693 4.89 594 0.6523
0.6389 4.9 595 0.6524
0.6151 4.91 596 0.6524
0.6561 4.91 597 0.6524
0.6443 4.92 598 0.6524
0.6596 4.93 599 0.6524
0.6413 4.94 600 0.6524
0.6235 4.95 601 0.6524
0.598 4.95 602 0.6524
0.7232 4.96 603 0.6524
0.6172 4.97 604 0.6524
0.7063 4.98 605 0.6524
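
Because the table is plain whitespace-separated text, it is easy to analyze programmatically. The following sketch parses rows in the column order shown in the header (training loss, epoch, step, validation loss) and reports the best and final validation loss. `LogRow`, `parse_rows`, and the `SAMPLE` excerpt are illustrative names. Only seven rows copied from the table are included to keep the example short.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(text: str) -> list[LogRow]:
    """Parse 'train_loss epoch step val_loss' lines, skipping anything else."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # header or blank line
        try:
            rows.append(
                LogRow(float(parts[0]), float(parts[1]), int(parts[2]), float(parts[3]))
            )
        except ValueError:
            continue  # not a numeric data row
    return rows


# A small excerpt copied from the table above.
SAMPLE = """
Training Loss Epoch Step Validation Loss
2.3576 0.01 1 2.4199
1.6177 1.0 121 1.5975
0.8713 2.0 243 0.7497
0.6652 3.0 364 0.6640
0.6498 4.0 486 0.6533
0.613 4.64 564 0.6522
0.7063 4.98 605 0.6524
"""

rows = parse_rows(SAMPLE)
best = min(rows, key=lambda r: r.val_loss)
final = rows[-1]
print(f"best validation loss {best.val_loss:.4f} at step {best.step}")
print(f"final validation loss {final.val_loss:.4f} at step {final.step}")
```

In the full table the validation loss flattens at about 0.652 to 0.653 from roughly epoch 4 onward, so the last epoch contributes very little further improvement.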

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
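
To compare a local environment against these versions, a small check along the following lines may help. It is a sketch only: `EXPECTED`, `installed_version`, and `version_mismatches` are hypothetical names, and it assumes that "Pytorch" corresponds to the `torch` distribution and that the `+cu121` suffix is a local build tag that can be ignored when comparing.

```python
from importlib import metadata

# Versions listed in this card. Assumption: "Pytorch" maps to the PyPI
# distribution "torch"; the "+cu121" suffix is treated as a local build tag.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package that differs from the card to (wanted, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Compare the public version only, ignoring local tags such as "+cu121".
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches().items():
    print(f"{package}: card lists {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the adapter will fail to load. The check only highlights where the environment differs from the one used for training.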