metadata
tags:
  - trl
  - dpo
  - generated_from_trainer
model-index:
  - name: LlamaCorn-1.1B-Chat
    results: []

LlamaCorn-1.1B-Chat

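The auto-generated card describes this model as trained from scratch and does not record a dataset name (it reports None); the tags indicate DPO training with TRL. It achieves the following results on the evaluation set: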

  • Loss: 0.9305
  • Rewards/chosen: -0.2148
  • Rewards/rejected: -0.2954
  • Rewards/accuracies: 0.5824
  • Rewards/margins: 0.0806
  • Logps/rejected: -183.8757
  • Logps/chosen: -197.7534
  • Logits/rejected: -2.6439
  • Logits/chosen: -2.6493
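
The reward figures above are related to one another: the margin is the mean difference between chosen and rejected rewards, and the accuracy is the fraction of pairs where the chosen reward is higher. The sketch below illustrates that relationship on a toy batch and checks it against the reported means. The function name `dpo_reward_metrics` and the toy values are hypothetical, used only for illustration; this is not the trainer's own code.

```python
from statistics import mean


def dpo_reward_metrics(chosen_rewards, rejected_rewards):
    """Summarize per-pair rewards the way the metric names above suggest.

    Hypothetical helper for illustration: each list holds one implicit
    reward per preference pair, in matching order.
    """
    margins = [c - r for c, r in zip(chosen_rewards, rejected_rewards)]
    wins = [1.0 if m > 0 else 0.0 for m in margins]
    return {
        "rewards/chosen": mean(chosen_rewards),
        "rewards/rejected": mean(rejected_rewards),
        "rewards/margins": mean(margins),
        "rewards/accuracies": mean(wins),
    }


# Toy batch (made-up values, not taken from the evaluation set).
toy = dpo_reward_metrics([0.1, -0.3, 0.2], [-0.1, -0.2, 0.0])

# Consistency check on the means reported in this card:
# Rewards/chosen - Rewards/rejected should equal Rewards/margins.
reported_chosen = -0.2148
reported_rejected = -0.2954
reported_margin = 0.0806
margin_is_consistent = abs((reported_chosen - reported_rejected) - reported_margin) < 1e-4
```

Both rewards are negative here, so the positive margin comes from the rejected reward falling faster than the chosen one, not from the chosen reward rising.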

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
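
The totals in this list follow from the per-device settings, and the scheduler settings imply a learning rate that ramps up linearly and then decays along a cosine curve. The following is a minimal sketch of both, written in plain Python rather than calling the training library. The function names are hypothetical, and the total step count is an assumption to be supplied by the reader, since the card does not state it.

```python
import math


def effective_batch_size(per_device_batch_size, num_devices, grad_accum_steps=1):
    """Number of examples contributing to one optimizer (or evaluation) step."""
    return per_device_batch_size * num_devices * grad_accum_steps


def cosine_lr(step, total_steps, base_lr=5e-07, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay to zero.

    Illustrative re-implementation of a common cosine-with-warmup schedule;
    total_steps is an assumption supplied by the caller.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0 (end of warmup) to 1 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values from the hyperparameter list above.
train_total = effective_batch_size(2, 2, grad_accum_steps=16)  # 2 * 2 * 16
eval_total = effective_batch_size(4, 2)                        # 4 * 2
```

With these settings, `train_total` matches the listed total_train_batch_size of 64 and `eval_total` matches the listed total_eval_batch_size of 8.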

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.9958 | 0.03 | 100 | 1.0003 | -0.0002 | -0.0002 | 0.4930 | -0.0001 | -180.9232 | -195.6078 | -2.6876 | -2.6924 |
| 0.9984 | 0.06 | 200 | 0.9995 | -0.0007 | -0.0013 | 0.4988 | 0.0006 | -180.9347 | -195.6127 | -2.6787 | -2.6838 |
| 0.9982 | 0.09 | 300 | 0.9997 | -0.0008 | -0.0015 | 0.4983 | 0.0007 | -180.9361 | -195.6136 | -2.6848 | -2.6897 |
| 0.9966 | 0.12 | 400 | 0.9999 | -0.0024 | -0.0027 | 0.4995 | 0.0003 | -180.9485 | -195.6291 | -2.6865 | -2.6914 |
| 0.9992 | 0.15 | 500 | 0.9984 | -0.0039 | -0.0054 | 0.5122 | 0.0015 | -180.9753 | -195.6440 | -2.6641 | -2.6694 |
| 0.9983 | 0.18 | 600 | 0.9981 | -0.0054 | -0.0073 | 0.5127 | 0.0020 | -180.9945 | -195.6589 | -2.6862 | -2.6911 |
| 0.9968 | 0.2 | 700 | 0.9972 | -0.0093 | -0.0127 | 0.5241 | 0.0034 | -181.0485 | -195.6985 | -2.6753 | -2.6803 |
| 0.9893 | 0.23 | 800 | 0.9951 | -0.0114 | -0.0164 | 0.5248 | 0.0051 | -181.0858 | -195.7188 | -2.6676 | -2.6728 |
| 0.988 | 0.26 | 900 | 0.9924 | -0.0169 | -0.0245 | 0.5421 | 0.0076 | -181.1663 | -195.7744 | -2.6763 | -2.6814 |
| 0.9879 | 0.29 | 1000 | 0.9907 | -0.0220 | -0.0318 | 0.5476 | 0.0098 | -181.2388 | -195.8248 | -2.6746 | -2.6796 |
| 0.9882 | 0.32 | 1100 | 0.9869 | -0.0261 | -0.0399 | 0.5598 | 0.0138 | -181.3200 | -195.8661 | -2.6647 | -2.6699 |
| 0.979 | 0.35 | 1200 | 0.9851 | -0.0364 | -0.0521 | 0.5563 | 0.0157 | -181.4419 | -195.9693 | -2.6684 | -2.6735 |
| 0.985 | 0.38 | 1300 | 0.9818 | -0.0385 | -0.0576 | 0.5608 | 0.0192 | -181.4978 | -195.9900 | -2.6874 | -2.6921 |
| 0.9821 | 0.41 | 1400 | 0.9805 | -0.0462 | -0.0668 | 0.5590 | 0.0206 | -181.5891 | -196.0672 | -2.6761 | -2.6810 |
| 0.9822 | 0.44 | 1500 | 0.9779 | -0.0550 | -0.0777 | 0.5632 | 0.0227 | -181.6983 | -196.1554 | -2.6764 | -2.6813 |
| 0.9755 | 0.47 | 1600 | 0.9756 | -0.0600 | -0.0855 | 0.5656 | 0.0255 | -181.7764 | -196.2058 | -2.6502 | -2.6557 |
| 0.9697 | 0.5 | 1700 | 0.9731 | -0.0652 | -0.0931 | 0.5651 | 0.0280 | -181.8526 | -196.2569 | -2.6752 | -2.6801 |
| 0.969 | 0.53 | 1800 | 0.9698 | -0.0701 | -0.1017 | 0.5687 | 0.0315 | -181.9380 | -196.3067 | -2.6635 | -2.6687 |
| 0.9643 | 0.55 | 1900 | 0.9685 | -0.0762 | -0.1092 | 0.5676 | 0.0331 | -182.0137 | -196.3669 | -2.6590 | -2.6642 |
| 0.9655 | 0.58 | 2000 | 0.9663 | -0.0821 | -0.1180 | 0.5756 | 0.0359 | -182.1012 | -196.4265 | -2.6802 | -2.6850 |
| 0.9719 | 0.61 | 2100 | 0.9645 | -0.0908 | -0.1281 | 0.5676 | 0.0373 | -182.2023 | -196.5133 | -2.6677 | -2.6727 |
| 0.9576 | 0.64 | 2200 | 0.9625 | -0.0953 | -0.1350 | 0.5729 | 0.0396 | -182.2709 | -196.5585 | -2.6679 | -2.6730 |
| 0.9619 | 0.67 | 2300 | 0.9603 | -0.1012 | -0.1436 | 0.5783 | 0.0424 | -182.3572 | -196.6170 | -2.6527 | -2.6580 |
| 0.9511 | 0.7 | 2400 | 0.9601 | -0.1105 | -0.1540 | 0.5722 | 0.0434 | -182.4612 | -196.7107 | -2.6565 | -2.6617 |
| 0.9516 | 0.73 | 2500 | 0.9570 | -0.1158 | -0.1618 | 0.5715 | 0.0460 | -182.5389 | -196.7630 | -2.6613 | -2.6664 |
| 0.9577 | 0.76 | 2600 | 0.9554 | -0.1236 | -0.1717 | 0.5719 | 0.0481 | -182.6387 | -196.8413 | -2.6595 | -2.6646 |
| 0.9471 | 0.79 | 2700 | 0.9541 | -0.1268 | -0.1763 | 0.5736 | 0.0495 | -182.6840 | -196.8731 | -2.6621 | -2.6672 |
| 0.9519 | 0.82 | 2800 | 0.9524 | -0.1336 | -0.1849 | 0.5738 | 0.0513 | -182.7705 | -196.9414 | -2.6762 | -2.6810 |
| 0.9522 | 0.85 | 2900 | 0.9515 | -0.1364 | -0.1896 | 0.5724 | 0.0531 | -182.8170 | -196.9696 | -2.6604 | -2.6655 |
| 0.9414 | 0.88 | 3000 | 0.9491 | -0.1395 | -0.1949 | 0.5744 | 0.0555 | -182.8706 | -197.0000 | -2.6706 | -2.6755 |
| 0.9509 | 0.9 | 3100 | 0.9483 | -0.1450 | -0.2020 | 0.5799 | 0.0570 | -182.9411 | -197.0551 | -2.6574 | -2.6625 |
| 0.9453 | 0.93 | 3200 | 0.9472 | -0.1472 | -0.2061 | 0.5834 | 0.0589 | -182.9822 | -197.0772 | -2.6424 | -2.6478 |
| 0.9577 | 0.96 | 3300 | 0.9461 | -0.1490 | -0.2081 | 0.5794 | 0.0590 | -183.0018 | -197.0956 | -2.6570 | -2.6622 |
| 0.9374 | 0.99 | 3400 | 0.9452 | -0.1532 | -0.2145 | 0.5770 | 0.0613 | -183.0663 | -197.1376 | -2.6499 | -2.6552 |
| 0.9299 | 1.02 | 3500 | 0.9439 | -0.1570 | -0.2195 | 0.5770 | 0.0625 | -183.1160 | -197.1755 | -2.6612 | -2.6663 |
| 0.936 | 1.05 | 3600 | 0.9438 | -0.1628 | -0.2265 | 0.5789 | 0.0637 | -183.1864 | -197.2330 | -2.6532 | -2.6584 |
| 0.9435 | 1.08 | 3700 | 0.9420 | -0.1655 | -0.2305 | 0.5807 | 0.0650 | -183.2263 | -197.2607 | -2.6673 | -2.6723 |
| 0.9341 | 1.11 | 3800 | 0.9422 | -0.1698 | -0.2351 | 0.5812 | 0.0653 | -183.2721 | -197.3029 | -2.6585 | -2.6636 |
| 0.9296 | 1.14 | 3900 | 0.9405 | -0.1736 | -0.2401 | 0.5714 | 0.0665 | -183.3225 | -197.3411 | -2.6382 | -2.6437 |
| 0.9338 | 1.17 | 4000 | 0.9402 | -0.1747 | -0.2426 | 0.5772 | 0.0680 | -183.3476 | -197.3519 | -2.6428 | -2.6483 |
| 0.9257 | 1.2 | 4100 | 0.9395 | -0.1780 | -0.2462 | 0.5766 | 0.0682 | -183.3829 | -197.3849 | -2.6411 | -2.6465 |
| 0.9368 | 1.23 | 4200 | 0.9386 | -0.1786 | -0.2485 | 0.5833 | 0.0699 | -183.4063 | -197.3914 | -2.6495 | -2.6548 |
| 0.916 | 1.25 | 4300 | 0.9385 | -0.1812 | -0.2513 | 0.5763 | 0.0702 | -183.4345 | -197.4169 | -2.6390 | -2.6445 |
| 0.9093 | 1.28 | 4400 | 0.9375 | -0.1864 | -0.2576 | 0.5831 | 0.0712 | -183.4972 | -197.4688 | -2.6448 | -2.6502 |
| 0.9408 | 1.31 | 4500 | 0.9368 | -0.1896 | -0.2615 | 0.5797 | 0.0719 | -183.5364 | -197.5016 | -2.6422 | -2.6476 |
| 0.9245 | 1.34 | 4600 | 0.9363 | -0.1926 | -0.2660 | 0.5787 | 0.0734 | -183.5815 | -197.5314 | -2.6563 | -2.6614 |
| 0.9469 | 1.37 | 4700 | 0.9364 | -0.1944 | -0.2666 | 0.5775 | 0.0722 | -183.5875 | -197.5493 | -2.6581 | -2.6632 |
| 0.9421 | 1.4 | 4800 | 0.9358 | -0.1946 | -0.2683 | 0.5819 | 0.0736 | -183.6040 | -197.5517 | -2.6640 | -2.6691 |
| 0.9076 | 1.43 | 4900 | 0.9356 | -0.1963 | -0.2704 | 0.5799 | 0.0741 | -183.6253 | -197.5680 | -2.6626 | -2.6676 |
| 0.94 | 1.46 | 5000 | 0.9353 | -0.1996 | -0.2738 | 0.5800 | 0.0742 | -183.6591 | -197.6010 | -2.6438 | -2.6492 |
| 0.9288 | 1.49 | 5100 | 0.9351 | -0.1999 | -0.2741 | 0.5809 | 0.0742 | -183.6625 | -197.6045 | -2.6433 | -2.6487 |
| 0.927 | 1.52 | 5200 | 0.9343 | -0.2009 | -0.2767 | 0.5821 | 0.0758 | -183.6883 | -197.6144 | -2.6499 | -2.6552 |
| 0.9171 | 1.55 | 5300 | 0.9339 | -0.2024 | -0.2784 | 0.5823 | 0.0760 | -183.7055 | -197.6292 | -2.6421 | -2.6476 |
| 0.9337 | 1.58 | 5400 | 0.9344 | -0.2040 | -0.2799 | 0.5787 | 0.0760 | -183.7208 | -197.6453 | -2.6459 | -2.6513 |
| 0.919 | 1.6 | 5500 | 0.9334 | -0.2058 | -0.2825 | 0.5811 | 0.0767 | -183.7465 | -197.6637 | -2.6390 | -2.6445 |
| 0.9297 | 1.63 | 5600 | 0.9341 | -0.2053 | -0.2822 | 0.5794 | 0.0770 | -183.7437 | -197.6582 | -2.6418 | -2.6472 |
| 0.9174 | 1.66 | 5700 | 0.9333 | -0.2067 | -0.2834 | 0.5800 | 0.0767 | -183.7554 | -197.6726 | -2.6492 | -2.6545 |
| 0.9275 | 1.69 | 5800 | 0.9332 | -0.2059 | -0.2826 | 0.5760 | 0.0767 | -183.7476 | -197.6642 | -2.6471 | -2.6524 |
| 0.9164 | 1.72 | 5900 | 0.9321 | -0.2079 | -0.2867 | 0.5809 | 0.0787 | -183.7881 | -197.6847 | -2.6387 | -2.6442 |
| 0.9218 | 1.75 | 6000 | 0.9322 | -0.2095 | -0.2872 | 0.5787 | 0.0777 | -183.7935 | -197.7004 | -2.6377 | -2.6432 |
| 0.944 | 1.78 | 6100 | 0.9319 | -0.2106 | -0.2895 | 0.5823 | 0.0789 | -183.8163 | -197.7118 | -2.6555 | -2.6607 |
| 0.9037 | 1.81 | 6200 | 0.9323 | -0.2105 | -0.2892 | 0.5780 | 0.0787 | -183.8135 | -197.7102 | -2.6459 | -2.6513 |
| 0.929 | 1.84 | 6300 | 0.9321 | -0.2114 | -0.2905 | 0.5773 | 0.0791 | -183.8265 | -197.7195 | -2.6446 | -2.6500 |
| 0.9091 | 1.87 | 6400 | 0.9324 | -0.2111 | -0.2904 | 0.5760 | 0.0793 | -183.8252 | -197.7167 | -2.6534 | -2.6586 |
| 0.9094 | 1.9 | 6500 | 0.9321 | -0.2123 | -0.2903 | 0.5770 | 0.0780 | -183.8242 | -197.7287 | -2.6477 | -2.6530 |
| 0.9449 | 1.93 | 6600 | 0.9320 | -0.2120 | -0.2903 | 0.5795 | 0.0784 | -183.8246 | -197.7251 | -2.6302 | -2.6358 |
| 0.9404 | 1.95 | 6700 | 0.9319 | -0.2115 | -0.2909 | 0.5802 | 0.0794 | -183.8302 | -197.7204 | -2.6443 | -2.6497 |
| 0.9155 | 1.98 | 6800 | 0.9314 | -0.2124 | -0.2919 | 0.5826 | 0.0795 | -183.8406 | -197.7291 | -2.6306 | -2.6362 |
| 0.9328 | 2.01 | 6900 | 0.9313 | -0.2127 | -0.2924 | 0.5884 | 0.0798 | -183.8456 | -197.7321 | -2.6296 | -2.6352 |
| 0.9012 | 2.04 | 7000 | 0.9321 | -0.2146 | -0.2932 | 0.5766 | 0.0785 | -183.8530 | -197.7515 | -2.6361 | -2.6416 |
| 0.9296 | 2.07 | 7100 | 0.9314 | -0.2127 | -0.2929 | 0.5780 | 0.0802 | -183.8507 | -197.7323 | -2.6460 | -2.6513 |
| 0.9076 | 2.1 | 7200 | 0.9315 | -0.2145 | -0.2945 | 0.5797 | 0.0799 | -183.8660 | -197.7507 | -2.6501 | -2.6554 |
| 0.922 | 2.13 | 7300 | 0.9315 | -0.2147 | -0.2935 | 0.5792 | 0.0788 | -183.8565 | -197.7523 | -2.6510 | -2.6562 |
| 0.9136 | 2.16 | 7400 | 0.9313 | -0.2146 | -0.2941 | 0.5819 | 0.0795 | -183.8625 | -197.7515 | -2.6410 | -2.6464 |
| 0.9401 | 2.19 | 7500 | 0.9314 | -0.2140 | -0.2937 | 0.5799 | 0.0797 | -183.8583 | -197.7451 | -2.6490 | -2.6543 |
| 0.9295 | 2.22 | 7600 | 0.9313 | -0.2153 | -0.2953 | 0.5812 | 0.0800 | -183.8747 | -197.7585 | -2.6569 | -2.6620 |
| 0.9128 | 2.25 | 7700 | 0.9309 | -0.2154 | -0.2960 | 0.5817 | 0.0806 | -183.8814 | -197.7590 | -2.6500 | -2.6553 |
| 0.9074 | 2.28 | 7800 | 0.9312 | -0.2159 | -0.2964 | 0.5836 | 0.0804 | -183.8851 | -197.7648 | -2.6505 | -2.6557 |
| 0.9114 | 2.3 | 7900 | 0.9310 | -0.2149 | -0.2949 | 0.5836 | 0.0800 | -183.8703 | -197.7544 | -2.6425 | -2.6479 |
| 0.9181 | 2.33 | 8000 | 0.9318 | -0.2145 | -0.2937 | 0.5772 | 0.0792 | -183.8585 | -197.7501 | -2.6611 | -2.6661 |
| 0.9009 | 2.36 | 8100 | 0.9311 | -0.2149 | -0.2952 | 0.5799 | 0.0803 | -183.8736 | -197.7543 | -2.6581 | -2.6632 |
| 0.9091 | 2.39 | 8200 | 0.9311 | -0.2165 | -0.2960 | 0.5829 | 0.0795 | -183.8816 | -197.7702 | -2.6378 | -2.6433 |
| 0.9091 | 2.42 | 8300 | 0.9312 | -0.2146 | -0.2950 | 0.5833 | 0.0805 | -183.8717 | -197.7510 | -2.6475 | -2.6528 |
| 0.9419 | 2.45 | 8400 | 0.9307 | -0.2138 | -0.2946 | 0.5777 | 0.0808 | -183.8678 | -197.7433 | -2.6364 | -2.6419 |
| 0.9203 | 2.48 | 8500 | 0.9313 | -0.2148 | -0.2948 | 0.5834 | 0.0800 | -183.8688 | -197.7529 | -2.6474 | -2.6527 |
| 0.9102 | 2.51 | 8600 | 0.9315 | -0.2158 | -0.2958 | 0.5821 | 0.0800 | -183.8791 | -197.7635 | -2.6436 | -2.6489 |
| 0.9327 | 2.54 | 8700 | 0.9316 | -0.2146 | -0.2946 | 0.5824 | 0.0800 | -183.8669 | -197.7511 | -2.6505 | -2.6558 |
| 0.9221 | 2.57 | 8800 | 0.9305 | -0.2149 | -0.2953 | 0.5828 | 0.0804 | -183.8742 | -197.7540 | -2.6659 | -2.6709 |
| 0.8851 | 2.6 | 8900 | 0.9315 | -0.2146 | -0.2949 | 0.5816 | 0.0803 | -183.8702 | -197.7508 | -2.6571 | -2.6622 |
| 0.924 | 2.63 | 9000 | 0.9304 | -0.2144 | -0.2951 | 0.5804 | 0.0807 | -183.8718 | -197.7492 | -2.6449 | -2.6503 |
| 0.9025 | 2.65 | 9100 | 0.9315 | -0.2150 | -0.2950 | 0.5790 | 0.0800 | -183.8715 | -197.7551 | -2.6410 | -2.6464 |
| 0.9348 | 2.68 | 9200 | 0.9308 | -0.2144 | -0.2946 | 0.5802 | 0.0802 | -183.8669 | -197.7491 | -2.6349 | -2.6405 |
| 0.9067 | 2.71 | 9300 | 0.9312 | -0.2155 | -0.2959 | 0.5857 | 0.0805 | -183.8805 | -197.7599 | -2.6410 | -2.6465 |
| 0.9263 | 2.74 | 9400 | 0.9307 | -0.2148 | -0.2957 | 0.5829 | 0.0809 | -183.8785 | -197.7536 | -2.6432 | -2.6486 |
| 0.912 | 2.77 | 9500 | 0.9306 | -0.2153 | -0.2957 | 0.5823 | 0.0805 | -183.8788 | -197.7581 | -2.6441 | -2.6495 |
| 0.9157 | 2.8 | 9600 | 0.9314 | -0.2169 | -0.2965 | 0.5785 | 0.0795 | -183.8859 | -197.7745 | -2.6439 | -2.6493 |
| 0.9094 | 2.83 | 9700 | 0.9309 | -0.2157 | -0.2961 | 0.5831 | 0.0804 | -183.8826 | -197.7625 | -2.6441 | -2.6494 |
| 0.9256 | 2.86 | 9800 | 0.9304 | -0.2160 | -0.2965 | 0.5838 | 0.0805 | -183.8867 | -197.7653 | -2.6439 | -2.6493 |
| 0.9287 | 2.89 | 9900 | 0.9305 | -0.2149 | -0.2955 | 0.5833 | 0.0806 | -183.8762 | -197.7545 | -2.6440 | -2.6494 |
| 0.9296 | 2.92 | 10000 | 0.9310 | -0.2157 | -0.2953 | 0.5795 | 0.0796 | -183.8741 | -197.7621 | -2.6439 | -2.6493 |
| 0.9335 | 2.95 | 10100 | 0.9311 | -0.2153 | -0.2953 | 0.5812 | 0.0800 | -183.8739 | -197.7578 | -2.6439 | -2.6493 |
| 0.9321 | 2.98 | 10200 | 0.9305 | -0.2149 | -0.2955 | 0.5824 | 0.0805 | -183.8759 | -197.7545 | -2.6439 | -2.6493 |

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.0