Model save

Files changed (7)

README.md +113 -0
all_results.json +9 -0
generation_config.json +7 -0
model.safetensors +1 -1
runs/Jun08_12-33-54_poseidon/events.out.tfevents.1717850367.poseidon.3992514.0 +2 -2
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,113 @@

+---
+license: apache-2.0
+base_model: martimfasantos/tinyllama-1.1b-sum-sft-full
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: tinyllama-1.1b-sum-dpo-full_LR2e-7_3epochs
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tinyllama-1.1b-sum-dpo-full_LR2e-7_3epochs
+This model is a fine-tuned version of [martimfasantos/tinyllama-1.1b-sum-sft-full](https://huggingface.co/martimfasantos/tinyllama-1.1b-sum-sft-full) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6412
+- Rewards/chosen: -1.5933
+- Rewards/rejected: -1.9040
+- Rewards/accuracies: 0.6296
+- Rewards/margins: 0.3107
+- Logps/rejected: -253.1478
+- Logps/chosen: -218.3473
+- Logits/rejected: -2.1506
+- Logits/chosen: -2.1702
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-07
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6924        | 0.0689 | 400   | 0.6930          | 0.0011         | 0.0007           | 0.5390             | 0.0003          | -62.6755       | -58.9094     | -2.9687         | -2.9723       |
+| 0.6891        | 0.1378 | 800   | 0.6909          | -0.0061        | -0.0108          | 0.5748             | 0.0047          | -63.8305       | -59.6239     | -2.9588         | -2.9622       |
+| 0.6874        | 0.2068 | 1200  | 0.6876          | -0.0302        | -0.0427          | 0.5871             | 0.0124          | -67.0173       | -62.0385     | -2.9361         | -2.9395       |
+| 0.676         | 0.2757 | 1600  | 0.6820          | -0.1057        | -0.1316          | 0.5850             | 0.0259          | -75.9065       | -69.5813     | -2.8942         | -2.8976       |
+| 0.6751        | 0.3446 | 2000  | 0.6770          | -0.1715        | -0.2098          | 0.5890             | 0.0384          | -83.7308       | -76.1611     | -2.8434         | -2.8468       |
+| 0.6518        | 0.4135 | 2400  | 0.6676          | -0.3727        | -0.4381          | 0.6069             | 0.0654          | -106.5637      | -96.2904     | -2.7893         | -2.7926       |
+| 0.6695        | 0.4824 | 2800  | 0.6631          | -0.4734        | -0.5560          | 0.6141             | 0.0826          | -118.3500      | -106.3523    | -2.7415         | -2.7450       |
+| 0.6467        | 0.5513 | 3200  | 0.6583          | -0.6700        | -0.7814          | 0.625              | 0.1113          | -140.8851      | -126.0199    | -2.6864         | -2.6902       |
+| 0.6264        | 0.6203 | 3600  | 0.6586          | -0.6359        | -0.7384          | 0.6106             | 0.1024          | -136.5857      | -122.6100    | -2.6176         | -2.6225       |
+| 0.6203        | 0.6892 | 4000  | 0.6523          | -0.7851        | -0.9183          | 0.6166             | 0.1332          | -154.5775      | -137.5248    | -2.5583         | -2.5642       |
+| 0.6341        | 0.7581 | 4400  | 0.6487          | -0.8786        | -1.0259          | 0.6129             | 0.1473          | -165.3377      | -146.8752    | -2.4643         | -2.4723       |
+| 0.6184        | 0.8270 | 4800  | 0.6454          | -1.0766        | -1.2481          | 0.6129             | 0.1716          | -187.5630      | -166.6730    | -2.4141         | -2.4242       |
+| 0.609         | 0.8959 | 5200  | 0.6414          | -0.9919        | -1.1678          | 0.6164             | 0.1759          | -179.5278      | -158.2066    | -2.3970         | -2.4080       |
+| 0.5977        | 0.9649 | 5600  | 0.6432          | -0.9166        | -1.0804          | 0.6273             | 0.1638          | -170.7888      | -150.6710    | -2.3933         | -2.4042       |
+| 0.5845        | 1.0338 | 6000  | 0.6438          | -1.3686        | -1.6032          | 0.6245             | 0.2346          | -223.0724      | -195.8758    | -2.2640         | -2.2816       |
+| 0.5789        | 1.1027 | 6400  | 0.6455          | -1.3882        | -1.6212          | 0.6164             | 0.2331          | -224.8725      | -197.8306    | -2.2428         | -2.2595       |
+| 0.5681        | 1.1716 | 6800  | 0.6434          | -1.3348        | -1.5500          | 0.6129             | 0.2153          | -217.7540      | -192.4917    | -2.2435         | -2.2593       |
+| 0.5602        | 1.2405 | 7200  | 0.6448          | -1.3673        | -1.5959          | 0.6234             | 0.2286          | -222.3391      | -195.7428    | -2.2210         | -2.2378       |
+| 0.6357        | 1.3094 | 7600  | 0.6413          | -1.3975        | -1.6344          | 0.6125             | 0.2368          | -226.1876      | -198.7702    | -2.2034         | -2.2208       |
+| 0.5491        | 1.3784 | 8000  | 0.6438          | -1.4655        | -1.7121          | 0.6055             | 0.2466          | -233.9599      | -205.5657    | -2.1906         | -2.2085       |
+| 0.5537        | 1.4473 | 8400  | 0.6445          | -1.4375        | -1.6793          | 0.6259             | 0.2418          | -230.6812      | -202.7634    | -2.1797         | -2.1984       |
+| 0.61          | 1.5162 | 8800  | 0.6405          | -1.0941        | -1.2946          | 0.6164             | 0.2005          | -192.2120      | -168.4266    | -2.2428         | -2.2579       |
+| 0.523         | 1.5851 | 9200  | 0.6431          | -1.4596        | -1.7029          | 0.6289             | 0.2433          | -233.0398      | -204.9723    | -2.1570         | -2.1756       |
+| 0.5412        | 1.6540 | 9600  | 0.6393          | -1.4228        | -1.6896          | 0.6315             | 0.2668          | -231.7097      | -201.2986    | -2.1513         | -2.1708       |
+| 0.5368        | 1.7229 | 10000 | 0.6408          | -1.3358        | -1.5858          | 0.6236             | 0.2500          | -221.3330      | -192.5947    | -2.1730         | -2.1915       |
+| 0.5064        | 1.7919 | 10400 | 0.6423          | -1.0625        | -1.2620          | 0.6215             | 0.1995          | -188.9488      | -165.2631    | -2.2150         | -2.2307       |
+| 0.5268        | 1.8608 | 10800 | 0.6406          | -1.4254        | -1.6829          | 0.6341             | 0.2575          | -231.0404      | -201.5558    | -2.1644         | -2.1831       |
+| 0.5384        | 1.9297 | 11200 | 0.6418          | -1.6486        | -1.9439          | 0.6364             | 0.2954          | -257.1440      | -223.8720    | -2.1299         | -2.1503       |
+| 0.5734        | 1.9986 | 11600 | 0.6378          | -1.4356        | -1.7101          | 0.6362             | 0.2744          | -233.7563      | -202.5782    | -2.1624         | -2.1813       |
+| 0.5302        | 2.0675 | 12000 | 0.6413          | -1.7064        | -2.0285          | 0.6292             | 0.3221          | -265.5970      | -229.6515    | -2.1257         | -2.1466       |
+| 0.4961        | 2.1365 | 12400 | 0.6474          | -2.0075        | -2.3712          | 0.6387             | 0.3637          | -299.8690      | -259.7696    | -2.0958         | -2.1178       |
+| 0.55          | 2.2054 | 12800 | 0.6415          | -1.5035        | -1.7868          | 0.6315             | 0.2833          | -241.4328      | -209.3660    | -2.1574         | -2.1761       |
+| 0.5546        | 2.2743 | 13200 | 0.6425          | -1.6715        | -1.9874          | 0.6303             | 0.3159          | -261.4859      | -226.1615    | -2.1413         | -2.1612       |
+| 0.5639        | 2.3432 | 13600 | 0.6409          | -1.5908        | -1.8980          | 0.6289             | 0.3072          | -252.5519      | -218.1001    | -2.1481         | -2.1675       |
+| 0.5055        | 2.4121 | 14000 | 0.6384          | -1.4618        | -1.7629          | 0.6257             | 0.3010          | -239.0347      | -205.1979    | -2.1665         | -2.1857       |
+| 0.5404        | 2.4810 | 14400 | 0.6405          | -1.6514        | -1.9790          | 0.6285             | 0.3276          | -260.6489      | -224.1589    | -2.1411         | -2.1613       |
+| 0.5348        | 2.5500 | 14800 | 0.6418          | -1.6812        | -2.0090          | 0.6276             | 0.3278          | -263.6481      | -227.1385    | -2.1375         | -2.1578       |
+| 0.5114        | 2.6189 | 15200 | 0.6408          | -1.5587        | -1.8632          | 0.6310             | 0.3046          | -249.0734      | -214.8810    | -2.1538         | -2.1732       |
+| 0.5356        | 2.6878 | 15600 | 0.6405          | -1.5493        | -1.8534          | 0.6266             | 0.3041          | -248.0918      | -213.9473    | -2.1550         | -2.1743       |
+| 0.4885        | 2.7567 | 16000 | 0.6406          | -1.5822        | -1.8916          | 0.6269             | 0.3094          | -251.9056      | -217.2328    | -2.1512         | -2.1707       |
+| 0.5057        | 2.8256 | 16400 | 0.6410          | -1.5799        | -1.8883          | 0.6306             | 0.3084          | -251.5751      | -217.0051    | -2.1527         | -2.1720       |
+| 0.5731        | 2.8946 | 16800 | 0.6412          | -1.5917        | -1.9021          | 0.6271             | 0.3104          | -252.9564      | -218.1854    | -2.1507         | -2.1702       |
+| 0.4958        | 2.9635 | 17200 | 0.6412          | -1.5933        | -1.9040          | 0.6296             | 0.3107          | -253.1478      | -218.3473    | -2.1506         | -2.1702       |
+### Framework versions
+- Transformers 4.41.2
+- Pytorch 2.1.2
+- Datasets 2.19.2
+- Tokenizers 0.19.1
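
Reviewer note on the metrics in the model card above: to my understanding, TRL's DPO trainer logs `rewards/margins` as the chosen reward minus the rejected reward, so the table can be sanity-checked row by row. The sketch below is only an illustration under that assumption. The `rows` list is a hand-copied sample of the table, and `reward_margin` and the tolerance are hypothetical choices, not part of TRL.

```python
# Sample rows copied by hand from the training results table:
# (step, rewards/chosen, rewards/rejected, rewards/margins)
rows = [
    (400, 0.0011, 0.0007, 0.0003),
    (800, -0.0061, -0.0108, 0.0047),
    (11600, -1.4356, -1.7101, 0.2744),
    (17200, -1.5933, -1.9040, 0.3107),
]

# Each reported value is rounded to 4 decimals, so allow a small slack.
TOLERANCE = 2e-4  # assumption: covers rounding of all three columns


def reward_margin(chosen: float, rejected: float) -> float:
    """Hypothetical helper: margin as chosen reward minus rejected reward."""
    return chosen - rejected


def margins_consistent(table) -> bool:
    """Return True if every reported margin matches chosen - rejected."""
    return all(
        abs(reward_margin(chosen, rejected) - reported) < TOLERANCE
        for _, chosen, rejected, reported in table
    )


consistent = margins_consistent(rows)
final_margin = round(reward_margin(-1.5933, -1.9040), 4)
print(consistent, final_margin)
```

Both rewards are negative at the end of training, so the positive margin comes from the rejected reward falling faster than the chosen one, not from the chosen reward rising.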

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.5850592939272546,
+    "train_runtime": 88547.231,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.146,
+    "train_steps_per_second": 0.197
+}
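
Reviewer note: the throughput figures in this file can be cross-checked against the hyperparameters in README.md. The sketch below assumes an effective batch size of 16 (the `total_train_batch_size` from the model card) and that the last partial batch of each epoch is kept. The variable names are illustrative only.

```python
import math

# Values copied from all_results.json and the README hyperparameters.
train_samples = 92858
train_runtime_s = 88547.231
num_epochs = 3
total_train_batch_size = 16  # 8 per step x 2 gradient accumulation steps

# Assumption: the final, smaller batch of each epoch still counts as a step.
steps_per_epoch = math.ceil(train_samples / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs

samples_per_second = round(train_samples * num_epochs / train_runtime_s, 3)
steps_per_second = round(total_steps / train_runtime_s, 3)
runtime_hours = train_runtime_s / 3600

# Epoch fraction at the first and last logged evaluation steps.
epoch_at_400 = round(400 / steps_per_epoch, 4)
epoch_at_17200 = round(17200 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, samples_per_second, steps_per_second)
print(f"{runtime_hours:.1f} h", epoch_at_400, epoch_at_17200)
```

Under these assumptions the derived rates and epoch fractions match the reported `train_samples_per_second`, `train_steps_per_second`, and the Epoch column of the results table, and the run lasted a little over 24 hours.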

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "max_length": 2048,
+  "pad_token_id": 0,
+  "transformers_version": "4.41.2"
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:83f80042944ee8905c3f2f42b6b2c372cc53051eff576f8d9424776f6916d681
+oid sha256:3e17fbfda607dadf38de477615d5c88744c9f714c3b8a7aa77d423acdbc639ea
 size 4400216536
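
Reviewer note: only the object hash changes here, while the size stays the same, which is what one would expect when weights are updated without changing the architecture or dtype. As a rough check, the pointer size can be turned into a parameter estimate. The sketch below assumes the weights are stored as float32 (4 bytes each) and ignores the small safetensors header. `parse_lfs_pointer` is a hypothetical helper written for this note.

```python
# New Git LFS pointer for model.safetensors, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3e17fbfda607dadf38de477615d5c88744c9f714c3b8a7aa77d423acdbc639ea
size 4400216536
"""


def parse_lfs_pointer(text: str) -> dict:
    """Hypothetical helper: parse 'key value' lines of an LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)

BYTES_PER_PARAM = 4  # assumption: float32 storage
approx_params = pointer["size"] / BYTES_PER_PARAM
approx_billions = round(approx_params / 1e9, 2)

print(pointer["oid"][:14], pointer["size"], approx_billions)
```

The estimate of roughly 1.1 billion parameters is in line with the "1.1b" in the model name, which suggests (but does not prove) that the checkpoint is saved in float32.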

runs/Jun08_12-33-54_poseidon/events.out.tfevents.1717850367.poseidon.3992514.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:24a968e0862f22e41543c2bacb52e7c44917361c916d5c34aab0c6c70a732f6d
-size 1221875
+oid sha256:92965809c1e0cd43971dea7ae0bd17e78cb6082d550a34abd0299d3fcafc56d5
+size 1236935

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 3.0,
+    "total_flos": 0.0,
+    "train_loss": 0.5850592939272546,
+    "train_runtime": 88547.231,
+    "train_samples": 92858,
+    "train_samples_per_second": 3.146,
+    "train_steps_per_second": 0.197
+}

trainer_state.json ADDED

The diff for this file is too large to render.