Luca-Engel committed on
Commit 7f5fa4f
1 Parent(s): e0565ec

do test run on colab with base gpt model

README.md CHANGED
@@ -1,71 +1,70 @@
- ---
- license: mit
- base_model: gpt2
- tags:
- - trl
- - dpo
- - generated_from_trainer
- model-index:
- - name: distilgpt2-dpo_test_run
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # distilgpt2-dpo_test_run
-
- This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6931
- - Rewards/chosen: 0.0
- - Rewards/rejected: 0.0
- - Rewards/accuracies: 0.0
- - Rewards/margins: 0.0
- - Logps/rejected: -606.5995
- - Logps/chosen: -1121.9315
- - Logits/rejected: -132.4945
- - Logits/chosen: -148.9527
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 3
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
- |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | No log | 0.67 | 1 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | -606.5995 | -1121.9315 | -132.4945 | -148.9527 |
- | No log | 2.0 | 3 | 4.0741 | 3.4192 | 4.4772 | 0.6000 | -1.0580 | -561.8280 | -1087.7399 | -119.7184 | -136.7517 |
-
-
- ### Framework versions
-
- - Transformers 4.38.1
- - Pytorch 2.3.0+cpu
- - Datasets 2.3.2
- - Tokenizers 0.15.2

+ ---
+ license: mit
+ base_model: gpt2
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ model-index:
+ - name: distilgpt2-dpo_test_run
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # distilgpt2-dpo_test_run
+
+ This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.0786
+ - Rewards/chosen: -0.1353
+ - Rewards/rejected: -0.5974
+ - Rewards/accuracies: 0.5959
+ - Rewards/margins: 0.4621
+ - Logps/rejected: -493.6547
+ - Logps/chosen: -559.9373
+ - Logits/rejected: -82.4215
+ - Logits/chosen: -80.3884
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 3
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | No log | 1.0 | 289 | 1.0786 | -0.1353 | -0.5974 | 0.5959 | 0.4621 | -493.6547 | -559.9373 | -82.4215 | -80.3884 |
+ | 0.7672 | 2.0 | 578 | 1.1977 | 1.1873 | 0.5208 | 0.5993 | 0.6665 | -482.4724 | -546.7113 | -89.5540 | -87.9300 |
+ | 0.7672 | 3.0 | 867 | 1.4420 | 0.6108 | -0.0653 | 0.5788 | 0.6761 | -488.3335 | -552.4765 | -97.7897 | -96.8133 |
+
+
+ ### Framework versions
+
+ - Transformers 4.40.2
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
 
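For reference, the hyperparameters in the updated card line up with a TRL `DPOTrainer` run. Below is a minimal sketch of such a setup; the toy preference dataset and output path are placeholders (the card does not name the real dataset), and the exact `DPOTrainer` keyword names may differ slightly across TRL versions.

```python
# Minimal DPO fine-tuning sketch matching the card's hyperparameters.
# Assumes a TRL version contemporary with Transformers 4.40 (~TRL 0.8),
# where DPOTrainer still takes tokenizer= (later renamed processing_class).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

# Placeholder preference pairs; DPO expects prompt/chosen/rejected columns.
pairs = Dataset.from_dict({
    "prompt": ["The capital of France is"],
    "chosen": [" Paris."],
    "rejected": [" Berlin."],
})

args = TrainingArguments(
    output_dir="distilgpt2-dpo_test_run",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    report_to="tensorboard",  # consistent with the runs/ event files below
)

trainer = DPOTrainer(
    model=model,          # reference model is cloned internally when omitted
    args=args,
    train_dataset=pairs,
    eval_dataset=pairs,
    tokenizer=tokenizer,
)
trainer.train()
```

Adam with betas=(0.9,0.999) and epsilon=1e-08 is the `TrainingArguments` default, so it needs no explicit flags.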
config.json CHANGED
@@ -1,39 +1,39 @@
- {
-   "_name_or_path": "gpt2",
-   "activation_function": "gelu_new",
-   "architectures": [
-     "GPT2LMHeadModel"
-   ],
-   "attn_pdrop": 0.1,
-   "bos_token_id": 50256,
-   "embd_pdrop": 0.1,
-   "eos_token_id": 50256,
-   "initializer_range": 0.02,
-   "layer_norm_epsilon": 1e-05,
-   "model_type": "gpt2",
-   "n_ctx": 1024,
-   "n_embd": 768,
-   "n_head": 12,
-   "n_inner": null,
-   "n_layer": 12,
-   "n_positions": 1024,
-   "reorder_and_upcast_attn": false,
-   "resid_pdrop": 0.1,
-   "scale_attn_by_inverse_layer_idx": false,
-   "scale_attn_weights": true,
-   "summary_activation": null,
-   "summary_first_dropout": 0.1,
-   "summary_proj_to_labels": true,
-   "summary_type": "cls_index",
-   "summary_use_proj": true,
-   "task_specific_params": {
-     "text-generation": {
-       "do_sample": true,
-       "max_length": 50
-     }
-   },
-   "torch_dtype": "float32",
-   "transformers_version": "4.38.1",
-   "use_cache": true,
-   "vocab_size": 50257
- }

+ {
+   "_name_or_path": "gpt2",
+   "activation_function": "gelu_new",
+   "architectures": [
+     "GPT2LMHeadModel"
+   ],
+   "attn_pdrop": 0.1,
+   "bos_token_id": 50256,
+   "embd_pdrop": 0.1,
+   "eos_token_id": 50256,
+   "initializer_range": 0.02,
+   "layer_norm_epsilon": 1e-05,
+   "model_type": "gpt2",
+   "n_ctx": 1024,
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": null,
+   "n_layer": 12,
+   "n_positions": 1024,
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "task_specific_params": {
+     "text-generation": {
+       "do_sample": true,
+       "max_length": 50
+     }
+   },
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.2",
+   "use_cache": true,
+   "vocab_size": 50257
+ }
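
The `task_specific_params` block above carries GPT-2's stock text-generation defaults (`do_sample: true`, `max_length: 50`). A quick sketch of loading this checkpoint and generating with those settings; the local path is a placeholder for a clone of this repo (or its Hub id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path; substitute the actual Hub id or local clone of this repo.
model_path = "./distilgpt2-dpo_test_run"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)  # GPT2LMHeadModel

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,                       # per task_specific_params
    max_length=50,                        # per task_specific_params
    pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
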
generation_config.json CHANGED
@@ -1,6 +1,6 @@
- {
-   "_from_model_config": true,
-   "bos_token_id": 50256,
-   "eos_token_id": 50256,
-   "transformers_version": "4.38.1"
- }

+ {
+   "_from_model_config": true,
+   "bos_token_id": 50256,
+   "eos_token_id": 50256,
+   "transformers_version": "4.40.2"
+ }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f898bfc8b55dfe5e4431f298888973cb311f93698e1b75dcd178dafe3d80a82a
  size 497774208

  version https://git-lfs.github.com/spec/v1
+ oid sha256:30ede28f2f06bdc930fc26a85e69c25f0f63a22bff0ec0932aaf67ab980c67dd
  size 497774208
runs/May16_10-54-37_c181a1ecec5b/events.out.tfevents.1715857070.c181a1ecec5b.247.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e9c1c451c7c98da663ef42508547b7c4a23abd9bd3913c4a356f83bc5854f9d
+ size 4917
runs/May16_10-58-52_c181a1ecec5b/events.out.tfevents.1715857249.c181a1ecec5b.247.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ae85b7f9deabe9992b2fbd6f7b46fa664864b2120a9dd8612d82cc36b3c4b7c
+ size 88
runs/May16_11-03-56_51e195417f1c/events.out.tfevents.1715857518.51e195417f1c.1843.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfe7ad14f2943e7b4226606e11127908fe52255903abfd8b010eec25731ff7e4
+ size 4917
runs/May18_13-33-40_586feedd8b82/events.out.tfevents.1716039306.586feedd8b82.2907.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af820a56159accfe4ebe8c15eca5de0ffccbb7102ac94813f31db7346c54f3c8
+ size 4955
runs/May18_13-35-44_586feedd8b82/events.out.tfevents.1716039429.586feedd8b82.2907.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:75b0b22b1509972c0cc08460746aecc58dde108dba9914820a8d25af3990d730
+ size 4954
runs/May18_13-44-27_7618a08b7f98/events.out.tfevents.1716039949.7618a08b7f98.6286.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b9aef911a07257a90b322c972ebbb3d32ccef59006dd15f0188cd31503521b8
+ size 4954
runs/May18_13-48-38_7618a08b7f98/events.out.tfevents.1716040130.7618a08b7f98.6286.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c93f661cceb40fa4a09446dd942e83393bd27dfa33e858c2a9b04669ccee8260
+ size 8280
runs/May18_13-53-10_1113de9dbcce/events.out.tfevents.1716040404.1113de9dbcce.6107.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6374c197a3c32e025028ccd03eb66c2f603778780360be1359497b332c2dddff
+ size 6382
runs/May18_14-18-11_857c112c0c60/events.out.tfevents.1716041915.857c112c0c60.307.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38ad33fdd24d4629b7b9fef52f84c5cceb888056c3e803b55e0b00423ec88723
+ size 8216
runs/May18_14-18-11_857c112c0c60/events.out.tfevents.1716043541.857c112c0c60.307.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a8988c761aa8df45edbc32ec28812df2acb32c8596b5f1e71b8b9fab172162a3
+ size 828
special_tokens_map.json CHANGED
@@ -1,6 +1,6 @@
- {
-   "bos_token": "<|endoftext|>",
-   "eos_token": "<|endoftext|>",
-   "pad_token": "<|endoftext|>",
-   "unk_token": "<|endoftext|>"
- }

+ {
+   "bos_token": "<|endoftext|>",
+   "eos_token": "<|endoftext|>",
+   "pad_token": "<|endoftext|>",
+   "unk_token": "<|endoftext|>"
+ }
tokenizer.json CHANGED
@@ -40,6 +40,7 @@
    "end_of_word_suffix": "",
    "fuse_unk": false,
    "byte_fallback": false,
    "vocab": {
      "!": 0,
      "\"": 1,

    "end_of_word_suffix": "",
    "fuse_unk": false,
    "byte_fallback": false,
+   "ignore_merges": false,
    "vocab": {
      "!": 0,
      "\"": 1,
tokenizer_config.json CHANGED
@@ -1,20 +1,20 @@
- {
-   "add_prefix_space": false,
-   "added_tokens_decoder": {
-     "50256": {
-       "content": "<|endoftext|>",
-       "lstrip": false,
-       "normalized": true,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     }
-   },
-   "bos_token": "<|endoftext|>",
-   "clean_up_tokenization_spaces": true,
-   "eos_token": "<|endoftext|>",
-   "model_max_length": 1024,
-   "pad_token": "<|endoftext|>",
-   "tokenizer_class": "GPT2Tokenizer",
-   "unk_token": "<|endoftext|>"
- }

+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "50256": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<|endoftext|>",
+   "clean_up_tokenization_spaces": true,
+   "eos_token": "<|endoftext|>",
+   "model_max_length": 1024,
+   "pad_token": "<|endoftext|>",
+   "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "<|endoftext|>"
+ }
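
As `special_tokens_map.json` and the config above show, all four special tokens map to `<|endoftext|>` (id 50256), so padding reuses the EOS token. A small sketch of what that means when batching (path again a placeholder):

```python
from transformers import AutoTokenizer

# Placeholder path; substitute the actual Hub id or local clone of this repo.
tokenizer = AutoTokenizer.from_pretrained("./distilgpt2-dpo_test_run")
assert tokenizer.pad_token == tokenizer.eos_token == "<|endoftext|>"

batch = tokenizer(
    ["short prompt", "a noticeably longer prompt that forces padding"],
    padding=True,
    return_tensors="pt",
)
# The shorter sequence is right-padded with id 50256; the attention mask
# zeroes those positions so the model ignores them.
print(batch["input_ids"][0])
print(batch["attention_mask"][0])
```
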
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6e7c002ad0f6dbec31187867d72777d2d20d87717dba4dbbdcbe71d959f1f98c
- size 4920

  version https://git-lfs.github.com/spec/v1
+ oid sha256:e609589856f247c47efa96b0eaa75fe4585e470031d55d96d4e31f8ab162ec2e
+ size 5048