---
tags:
- trl
- dpo
- generated_from_trainer
model-index:
- name: LlamaCorn-1.1B-Chat
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# LlamaCorn-1.1B-Chat
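The auto-generated card describes this model as trained from scratch on the `None` dataset, which means the Trainer had no base model or dataset name recorded. The `trl` and `dpo` tags and the reward metrics below indicate a DPO preference-tuning run.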
It achieves the following results on the evaluation set:
- Loss: 0.9305
- Rewards/chosen: -0.2148
- Rewards/rejected: -0.2954
- Rewards/accuracies: 0.5824
- Rewards/margins: 0.0806
- Logps/rejected: -183.8757
- Logps/chosen: -197.7534
- Logits/rejected: -2.6439
- Logits/chosen: -2.6493
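
The reward margin listed above is the gap between the chosen and rejected rewards. A minimal sketch that checks this relation against the reported numbers follows; the function name `reward_margin` is illustrative and not part of TRL.

```python
def reward_margin(chosen: float, rejected: float) -> float:
    """Return the reward of the chosen completion minus that of the rejected one."""
    return chosen - rejected


# Values copied from the evaluation summary above.
rewards_chosen = -0.2148
rewards_rejected = -0.2954

margin = reward_margin(rewards_chosen, rewards_rejected)
print(round(margin, 4))  # 0.0806, matching Rewards/margins
```

A positive margin means the model rates chosen completions above rejected ones on average, while Rewards/accuracies (0.5824) is the fraction of evaluation pairs for which that holds.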
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
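
The total batch sizes follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and also illustrates a cosine schedule with linear warmup, assuming the common half-cosine decay to zero. The helper names and `TOTAL_STEPS` are hypothetical; the card does not state the exact number of optimizer steps.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Samples contributing to one optimizer step (or one evaluation step)."""
    return per_device * num_devices * grad_accum


def warmup_steps(total_steps: int, warmup_ratio: float = 0.1) -> int:
    """Number of linear warmup steps for a given warmup ratio."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_lr(step: int, total_steps: int, base_lr: float = 5e-07,
              warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then half-cosine decay to zero (assumed shape)."""
    warmup = warmup_steps(total_steps, warmup_ratio)
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


train_total = effective_batch_size(2, 2, 16)  # 64, as listed for total_train_batch_size
eval_total = effective_batch_size(4, 2)       # 8, as listed for total_eval_batch_size

TOTAL_STEPS = 10_000  # hypothetical value, for illustration only
for step in (0, warmup_steps(TOTAL_STEPS), TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 5e-07 at the end of warmup and decays toward zero by the final step.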
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.9958 | 0.03 | 100 | 1.0003 | -0.0002 | -0.0002 | 0.4930 | -0.0001 | -180.9232 | -195.6078 | -2.6876 | -2.6924 |
| 0.9984 | 0.06 | 200 | 0.9995 | -0.0007 | -0.0013 | 0.4988 | 0.0006 | -180.9347 | -195.6127 | -2.6787 | -2.6838 |
| 0.9982 | 0.09 | 300 | 0.9997 | -0.0008 | -0.0015 | 0.4983 | 0.0007 | -180.9361 | -195.6136 | -2.6848 | -2.6897 |
| 0.9966 | 0.12 | 400 | 0.9999 | -0.0024 | -0.0027 | 0.4995 | 0.0003 | -180.9485 | -195.6291 | -2.6865 | -2.6914 |
| 0.9992 | 0.15 | 500 | 0.9984 | -0.0039 | -0.0054 | 0.5122 | 0.0015 | -180.9753 | -195.6440 | -2.6641 | -2.6694 |
| 0.9983 | 0.18 | 600 | 0.9981 | -0.0054 | -0.0073 | 0.5127 | 0.0020 | -180.9945 | -195.6589 | -2.6862 | -2.6911 |
| 0.9968 | 0.2 | 700 | 0.9972 | -0.0093 | -0.0127 | 0.5241 | 0.0034 | -181.0485 | -195.6985 | -2.6753 | -2.6803 |
| 0.9893 | 0.23 | 800 | 0.9951 | -0.0114 | -0.0164 | 0.5248 | 0.0051 | -181.0858 | -195.7188 | -2.6676 | -2.6728 |
| 0.988 | 0.26 | 900 | 0.9924 | -0.0169 | -0.0245 | 0.5421 | 0.0076 | -181.1663 | -195.7744 | -2.6763 | -2.6814 |
| 0.9879 | 0.29 | 1000 | 0.9907 | -0.0220 | -0.0318 | 0.5476 | 0.0098 | -181.2388 | -195.8248 | -2.6746 | -2.6796 |
| 0.9882 | 0.32 | 1100 | 0.9869 | -0.0261 | -0.0399 | 0.5598 | 0.0138 | -181.3200 | -195.8661 | -2.6647 | -2.6699 |
| 0.979 | 0.35 | 1200 | 0.9851 | -0.0364 | -0.0521 | 0.5563 | 0.0157 | -181.4419 | -195.9693 | -2.6684 | -2.6735 |
| 0.985 | 0.38 | 1300 | 0.9818 | -0.0385 | -0.0576 | 0.5608 | 0.0192 | -181.4978 | -195.9900 | -2.6874 | -2.6921 |
| 0.9821 | 0.41 | 1400 | 0.9805 | -0.0462 | -0.0668 | 0.5590 | 0.0206 | -181.5891 | -196.0672 | -2.6761 | -2.6810 |
| 0.9822 | 0.44 | 1500 | 0.9779 | -0.0550 | -0.0777 | 0.5632 | 0.0227 | -181.6983 | -196.1554 | -2.6764 | -2.6813 |
| 0.9755 | 0.47 | 1600 | 0.9756 | -0.0600 | -0.0855 | 0.5656 | 0.0255 | -181.7764 | -196.2058 | -2.6502 | -2.6557 |
| 0.9697 | 0.5 | 1700 | 0.9731 | -0.0652 | -0.0931 | 0.5651 | 0.0280 | -181.8526 | -196.2569 | -2.6752 | -2.6801 |
| 0.969 | 0.53 | 1800 | 0.9698 | -0.0701 | -0.1017 | 0.5687 | 0.0315 | -181.9380 | -196.3067 | -2.6635 | -2.6687 |
| 0.9643 | 0.55 | 1900 | 0.9685 | -0.0762 | -0.1092 | 0.5676 | 0.0331 | -182.0137 | -196.3669 | -2.6590 | -2.6642 |
| 0.9655 | 0.58 | 2000 | 0.9663 | -0.0821 | -0.1180 | 0.5756 | 0.0359 | -182.1012 | -196.4265 | -2.6802 | -2.6850 |
| 0.9719 | 0.61 | 2100 | 0.9645 | -0.0908 | -0.1281 | 0.5676 | 0.0373 | -182.2023 | -196.5133 | -2.6677 | -2.6727 |
| 0.9576 | 0.64 | 2200 | 0.9625 | -0.0953 | -0.1350 | 0.5729 | 0.0396 | -182.2709 | -196.5585 | -2.6679 | -2.6730 |
| 0.9619 | 0.67 | 2300 | 0.9603 | -0.1012 | -0.1436 | 0.5783 | 0.0424 | -182.3572 | -196.6170 | -2.6527 | -2.6580 |
| 0.9511 | 0.7 | 2400 | 0.9601 | -0.1105 | -0.1540 | 0.5722 | 0.0434 | -182.4612 | -196.7107 | -2.6565 | -2.6617 |
| 0.9516 | 0.73 | 2500 | 0.9570 | -0.1158 | -0.1618 | 0.5715 | 0.0460 | -182.5389 | -196.7630 | -2.6613 | -2.6664 |
| 0.9577 | 0.76 | 2600 | 0.9554 | -0.1236 | -0.1717 | 0.5719 | 0.0481 | -182.6387 | -196.8413 | -2.6595 | -2.6646 |
| 0.9471 | 0.79 | 2700 | 0.9541 | -0.1268 | -0.1763 | 0.5736 | 0.0495 | -182.6840 | -196.8731 | -2.6621 | -2.6672 |
| 0.9519 | 0.82 | 2800 | 0.9524 | -0.1336 | -0.1849 | 0.5738 | 0.0513 | -182.7705 | -196.9414 | -2.6762 | -2.6810 |
| 0.9522 | 0.85 | 2900 | 0.9515 | -0.1364 | -0.1896 | 0.5724 | 0.0531 | -182.8170 | -196.9696 | -2.6604 | -2.6655 |
| 0.9414 | 0.88 | 3000 | 0.9491 | -0.1395 | -0.1949 | 0.5744 | 0.0555 | -182.8706 | -197.0000 | -2.6706 | -2.6755 |
| 0.9509 | 0.9 | 3100 | 0.9483 | -0.1450 | -0.2020 | 0.5799 | 0.0570 | -182.9411 | -197.0551 | -2.6574 | -2.6625 |
| 0.9453 | 0.93 | 3200 | 0.9472 | -0.1472 | -0.2061 | 0.5834 | 0.0589 | -182.9822 | -197.0772 | -2.6424 | -2.6478 |
| 0.9577 | 0.96 | 3300 | 0.9461 | -0.1490 | -0.2081 | 0.5794 | 0.0590 | -183.0018 | -197.0956 | -2.6570 | -2.6622 |
| 0.9374 | 0.99 | 3400 | 0.9452 | -0.1532 | -0.2145 | 0.5770 | 0.0613 | -183.0663 | -197.1376 | -2.6499 | -2.6552 |
| 0.9299 | 1.02 | 3500 | 0.9439 | -0.1570 | -0.2195 | 0.5770 | 0.0625 | -183.1160 | -197.1755 | -2.6612 | -2.6663 |
| 0.936 | 1.05 | 3600 | 0.9438 | -0.1628 | -0.2265 | 0.5789 | 0.0637 | -183.1864 | -197.2330 | -2.6532 | -2.6584 |
| 0.9435 | 1.08 | 3700 | 0.9420 | -0.1655 | -0.2305 | 0.5807 | 0.0650 | -183.2263 | -197.2607 | -2.6673 | -2.6723 |
| 0.9341 | 1.11 | 3800 | 0.9422 | -0.1698 | -0.2351 | 0.5812 | 0.0653 | -183.2721 | -197.3029 | -2.6585 | -2.6636 |
| 0.9296 | 1.14 | 3900 | 0.9405 | -0.1736 | -0.2401 | 0.5714 | 0.0665 | -183.3225 | -197.3411 | -2.6382 | -2.6437 |
| 0.9338 | 1.17 | 4000 | 0.9402 | -0.1747 | -0.2426 | 0.5772 | 0.0680 | -183.3476 | -197.3519 | -2.6428 | -2.6483 |
| 0.9257 | 1.2 | 4100 | 0.9395 | -0.1780 | -0.2462 | 0.5766 | 0.0682 | -183.3829 | -197.3849 | -2.6411 | -2.6465 |
| 0.9368 | 1.23 | 4200 | 0.9386 | -0.1786 | -0.2485 | 0.5833 | 0.0699 | -183.4063 | -197.3914 | -2.6495 | -2.6548 |
| 0.916 | 1.25 | 4300 | 0.9385 | -0.1812 | -0.2513 | 0.5763 | 0.0702 | -183.4345 | -197.4169 | -2.6390 | -2.6445 |
| 0.9093 | 1.28 | 4400 | 0.9375 | -0.1864 | -0.2576 | 0.5831 | 0.0712 | -183.4972 | -197.4688 | -2.6448 | -2.6502 |
| 0.9408 | 1.31 | 4500 | 0.9368 | -0.1896 | -0.2615 | 0.5797 | 0.0719 | -183.5364 | -197.5016 | -2.6422 | -2.6476 |
| 0.9245 | 1.34 | 4600 | 0.9363 | -0.1926 | -0.2660 | 0.5787 | 0.0734 | -183.5815 | -197.5314 | -2.6563 | -2.6614 |
| 0.9469 | 1.37 | 4700 | 0.9364 | -0.1944 | -0.2666 | 0.5775 | 0.0722 | -183.5875 | -197.5493 | -2.6581 | -2.6632 |
| 0.9421 | 1.4 | 4800 | 0.9358 | -0.1946 | -0.2683 | 0.5819 | 0.0736 | -183.6040 | -197.5517 | -2.6640 | -2.6691 |
| 0.9076 | 1.43 | 4900 | 0.9356 | -0.1963 | -0.2704 | 0.5799 | 0.0741 | -183.6253 | -197.5680 | -2.6626 | -2.6676 |
| 0.94 | 1.46 | 5000 | 0.9353 | -0.1996 | -0.2738 | 0.5800 | 0.0742 | -183.6591 | -197.6010 | -2.6438 | -2.6492 |
| 0.9288 | 1.49 | 5100 | 0.9351 | -0.1999 | -0.2741 | 0.5809 | 0.0742 | -183.6625 | -197.6045 | -2.6433 | -2.6487 |
| 0.927 | 1.52 | 5200 | 0.9343 | -0.2009 | -0.2767 | 0.5821 | 0.0758 | -183.6883 | -197.6144 | -2.6499 | -2.6552 |
| 0.9171 | 1.55 | 5300 | 0.9339 | -0.2024 | -0.2784 | 0.5823 | 0.0760 | -183.7055 | -197.6292 | -2.6421 | -2.6476 |
| 0.9337 | 1.58 | 5400 | 0.9344 | -0.2040 | -0.2799 | 0.5787 | 0.0760 | -183.7208 | -197.6453 | -2.6459 | -2.6513 |
| 0.919 | 1.6 | 5500 | 0.9334 | -0.2058 | -0.2825 | 0.5811 | 0.0767 | -183.7465 | -197.6637 | -2.6390 | -2.6445 |
| 0.9297 | 1.63 | 5600 | 0.9341 | -0.2053 | -0.2822 | 0.5794 | 0.0770 | -183.7437 | -197.6582 | -2.6418 | -2.6472 |
| 0.9174 | 1.66 | 5700 | 0.9333 | -0.2067 | -0.2834 | 0.5800 | 0.0767 | -183.7554 | -197.6726 | -2.6492 | -2.6545 |
| 0.9275 | 1.69 | 5800 | 0.9332 | -0.2059 | -0.2826 | 0.5760 | 0.0767 | -183.7476 | -197.6642 | -2.6471 | -2.6524 |
| 0.9164 | 1.72 | 5900 | 0.9321 | -0.2079 | -0.2867 | 0.5809 | 0.0787 | -183.7881 | -197.6847 | -2.6387 | -2.6442 |
| 0.9218 | 1.75 | 6000 | 0.9322 | -0.2095 | -0.2872 | 0.5787 | 0.0777 | -183.7935 | -197.7004 | -2.6377 | -2.6432 |
| 0.944 | 1.78 | 6100 | 0.9319 | -0.2106 | -0.2895 | 0.5823 | 0.0789 | -183.8163 | -197.7118 | -2.6555 | -2.6607 |
| 0.9037 | 1.81 | 6200 | 0.9323 | -0.2105 | -0.2892 | 0.5780 | 0.0787 | -183.8135 | -197.7102 | -2.6459 | -2.6513 |
| 0.929 | 1.84 | 6300 | 0.9321 | -0.2114 | -0.2905 | 0.5773 | 0.0791 | -183.8265 | -197.7195 | -2.6446 | -2.6500 |
| 0.9091 | 1.87 | 6400 | 0.9324 | -0.2111 | -0.2904 | 0.5760 | 0.0793 | -183.8252 | -197.7167 | -2.6534 | -2.6586 |
| 0.9094 | 1.9 | 6500 | 0.9321 | -0.2123 | -0.2903 | 0.5770 | 0.0780 | -183.8242 | -197.7287 | -2.6477 | -2.6530 |
| 0.9449 | 1.93 | 6600 | 0.9320 | -0.2120 | -0.2903 | 0.5795 | 0.0784 | -183.8246 | -197.7251 | -2.6302 | -2.6358 |
| 0.9404 | 1.95 | 6700 | 0.9319 | -0.2115 | -0.2909 | 0.5802 | 0.0794 | -183.8302 | -197.7204 | -2.6443 | -2.6497 |
| 0.9155 | 1.98 | 6800 | 0.9314 | -0.2124 | -0.2919 | 0.5826 | 0.0795 | -183.8406 | -197.7291 | -2.6306 | -2.6362 |
| 0.9328 | 2.01 | 6900 | 0.9313 | -0.2127 | -0.2924 | 0.5884 | 0.0798 | -183.8456 | -197.7321 | -2.6296 | -2.6352 |
| 0.9012 | 2.04 | 7000 | 0.9321 | -0.2146 | -0.2932 | 0.5766 | 0.0785 | -183.8530 | -197.7515 | -2.6361 | -2.6416 |
| 0.9296 | 2.07 | 7100 | 0.9314 | -0.2127 | -0.2929 | 0.5780 | 0.0802 | -183.8507 | -197.7323 | -2.6460 | -2.6513 |
| 0.9076 | 2.1 | 7200 | 0.9315 | -0.2145 | -0.2945 | 0.5797 | 0.0799 | -183.8660 | -197.7507 | -2.6501 | -2.6554 |
| 0.922 | 2.13 | 7300 | 0.9315 | -0.2147 | -0.2935 | 0.5792 | 0.0788 | -183.8565 | -197.7523 | -2.6510 | -2.6562 |
| 0.9136 | 2.16 | 7400 | 0.9313 | -0.2146 | -0.2941 | 0.5819 | 0.0795 | -183.8625 | -197.7515 | -2.6410 | -2.6464 |
| 0.9401 | 2.19 | 7500 | 0.9314 | -0.2140 | -0.2937 | 0.5799 | 0.0797 | -183.8583 | -197.7451 | -2.6490 | -2.6543 |
| 0.9295 | 2.22 | 7600 | 0.9313 | -0.2153 | -0.2953 | 0.5812 | 0.0800 | -183.8747 | -197.7585 | -2.6569 | -2.6620 |
| 0.9128 | 2.25 | 7700 | 0.9309 | -0.2154 | -0.2960 | 0.5817 | 0.0806 | -183.8814 | -197.7590 | -2.6500 | -2.6553 |
| 0.9074 | 2.28 | 7800 | 0.9312 | -0.2159 | -0.2964 | 0.5836 | 0.0804 | -183.8851 | -197.7648 | -2.6505 | -2.6557 |
| 0.9114 | 2.3 | 7900 | 0.9310 | -0.2149 | -0.2949 | 0.5836 | 0.0800 | -183.8703 | -197.7544 | -2.6425 | -2.6479 |
| 0.9181 | 2.33 | 8000 | 0.9318 | -0.2145 | -0.2937 | 0.5772 | 0.0792 | -183.8585 | -197.7501 | -2.6611 | -2.6661 |
| 0.9009 | 2.36 | 8100 | 0.9311 | -0.2149 | -0.2952 | 0.5799 | 0.0803 | -183.8736 | -197.7543 | -2.6581 | -2.6632 |
| 0.9091 | 2.39 | 8200 | 0.9311 | -0.2165 | -0.2960 | 0.5829 | 0.0795 | -183.8816 | -197.7702 | -2.6378 | -2.6433 |
| 0.9091 | 2.42 | 8300 | 0.9312 | -0.2146 | -0.2950 | 0.5833 | 0.0805 | -183.8717 | -197.7510 | -2.6475 | -2.6528 |
| 0.9419 | 2.45 | 8400 | 0.9307 | -0.2138 | -0.2946 | 0.5777 | 0.0808 | -183.8678 | -197.7433 | -2.6364 | -2.6419 |
| 0.9203 | 2.48 | 8500 | 0.9313 | -0.2148 | -0.2948 | 0.5834 | 0.0800 | -183.8688 | -197.7529 | -2.6474 | -2.6527 |
| 0.9102 | 2.51 | 8600 | 0.9315 | -0.2158 | -0.2958 | 0.5821 | 0.0800 | -183.8791 | -197.7635 | -2.6436 | -2.6489 |
| 0.9327 | 2.54 | 8700 | 0.9316 | -0.2146 | -0.2946 | 0.5824 | 0.0800 | -183.8669 | -197.7511 | -2.6505 | -2.6558 |
| 0.9221 | 2.57 | 8800 | 0.9305 | -0.2149 | -0.2953 | 0.5828 | 0.0804 | -183.8742 | -197.7540 | -2.6659 | -2.6709 |
| 0.8851 | 2.6 | 8900 | 0.9315 | -0.2146 | -0.2949 | 0.5816 | 0.0803 | -183.8702 | -197.7508 | -2.6571 | -2.6622 |
| 0.924 | 2.63 | 9000 | 0.9304 | -0.2144 | -0.2951 | 0.5804 | 0.0807 | -183.8718 | -197.7492 | -2.6449 | -2.6503 |
| 0.9025 | 2.65 | 9100 | 0.9315 | -0.2150 | -0.2950 | 0.5790 | 0.0800 | -183.8715 | -197.7551 | -2.6410 | -2.6464 |
| 0.9348 | 2.68 | 9200 | 0.9308 | -0.2144 | -0.2946 | 0.5802 | 0.0802 | -183.8669 | -197.7491 | -2.6349 | -2.6405 |
| 0.9067 | 2.71 | 9300 | 0.9312 | -0.2155 | -0.2959 | 0.5857 | 0.0805 | -183.8805 | -197.7599 | -2.6410 | -2.6465 |
| 0.9263 | 2.74 | 9400 | 0.9307 | -0.2148 | -0.2957 | 0.5829 | 0.0809 | -183.8785 | -197.7536 | -2.6432 | -2.6486 |
| 0.912 | 2.77 | 9500 | 0.9306 | -0.2153 | -0.2957 | 0.5823 | 0.0805 | -183.8788 | -197.7581 | -2.6441 | -2.6495 |
| 0.9157 | 2.8 | 9600 | 0.9314 | -0.2169 | -0.2965 | 0.5785 | 0.0795 | -183.8859 | -197.7745 | -2.6439 | -2.6493 |
| 0.9094 | 2.83 | 9700 | 0.9309 | -0.2157 | -0.2961 | 0.5831 | 0.0804 | -183.8826 | -197.7625 | -2.6441 | -2.6494 |
| 0.9256 | 2.86 | 9800 | 0.9304 | -0.2160 | -0.2965 | 0.5838 | 0.0805 | -183.8867 | -197.7653 | -2.6439 | -2.6493 |
| 0.9287 | 2.89 | 9900 | 0.9305 | -0.2149 | -0.2955 | 0.5833 | 0.0806 | -183.8762 | -197.7545 | -2.6440 | -2.6494 |
| 0.9296 | 2.92 | 10000 | 0.9310 | -0.2157 | -0.2953 | 0.5795 | 0.0796 | -183.8741 | -197.7621 | -2.6439 | -2.6493 |
| 0.9335 | 2.95 | 10100 | 0.9311 | -0.2153 | -0.2953 | 0.5812 | 0.0800 | -183.8739 | -197.7578 | -2.6439 | -2.6493 |
| 0.9321 | 2.98 | 10200 | 0.9305 | -0.2149 | -0.2955 | 0.5824 | 0.0805 | -183.8759 | -197.7545 | -2.6439 | -2.6493 |
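
To pick a checkpoint from a log like the one above, the rows can be parsed and ranked by validation loss. The following is a minimal sketch, not part of the training code: `parse_rows` is a hypothetical helper, and `SAMPLE` holds only three rows copied from the table, so the result applies to that sample rather than to the full log.

```python
def parse_rows(markdown_table: str) -> list[dict]:
    """Parse numeric rows of the results table; header and alignment rows are skipped."""
    rows = []
    for line in markdown_table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        try:
            values = [float(c) for c in cells]
        except ValueError:
            continue  # not a numeric data row
        rows.append({
            "step": int(values[2]),
            "val_loss": values[3],
            "accuracy": values[6],
        })
    return rows


# Three rows copied from the table above (steps 100, 9000, and 10200).
SAMPLE = """
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.9958 | 0.03 | 100 | 1.0003 | -0.0002 | -0.0002 | 0.4930 | -0.0001 | -180.9232 | -195.6078 | -2.6876 | -2.6924 |
| 0.924 | 2.63 | 9000 | 0.9304 | -0.2144 | -0.2951 | 0.5804 | 0.0807 | -183.8718 | -197.7492 | -2.6449 | -2.6503 |
| 0.9321 | 2.98 | 10200 | 0.9305 | -0.2149 | -0.2955 | 0.5824 | 0.0805 | -183.8759 | -197.7545 | -2.6439 | -2.6493 |
"""

rows = parse_rows(SAMPLE)
best = min(rows, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"])  # 9000 0.9304 for this three-row sample
```

The validation loss is nearly flat from roughly epoch 2 onward, so differences between late checkpoints in the table are small.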
### Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.0
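
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch under the assumption that the PyPI distribution names are `transformers`, `torch`, `datasets`, and `tokenizers`; the helper names are illustrative.

```python
from importlib import metadata

# Versions listed above; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.36.2",
    "torch": "2.1.2+cu121",
    "datasets": "2.14.6",
    "tokenizers": "0.15.0",
}


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, installed):
    """Return {name: (listed, found)} for every package whose version differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


report = mismatches(EXPECTED, installed_versions(EXPECTED))
for name, (want, have) in report.items():
    print(f"{name}: card lists {want}, found {have}")
```

A mismatch does not necessarily mean the model will fail to load; the check only flags where the environment differs from the one recorded at training time.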