joshcarp/gpt2-evy

@@ -1,15 +1,11 @@
 ---
 license: mit
 tags:
 - generated_from_trainer
-base_model: gpt2
 model-index:
 - name: gpt2-evy
   results: []
-widget:
-- text: "func fizzbuzz:[]string n:num"
-- text: "func lengthOfLongestSubstring:num s:string"
-- text: "func fibonacci:num n:num"
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 # gpt2-evy
-This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6979
 ## Model description
@@ -44,22 +40,117 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 19   | 1.6965          |
-| No log        | 2.0   | 38   | 1.6405          |
-| No log        | 3.0   | 57   | 1.6798          |
-| No log        | 4.0   | 76   | 1.6933          |
-| No log        | 5.0   | 95   | 1.6979          |
 ### Framework versions
 - Transformers 4.40.1
-- Pytorch 2.0.1+cu117
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 ---
 license: mit
+base_model: joshcarp/gpt2-evy
 tags:
 - generated_from_trainer
 model-index:
 - name: gpt2-evy
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # gpt2-evy
+This model is a fine-tuned version of [joshcarp/gpt2-evy](https://huggingface.co/joshcarp/gpt2-evy) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.3017
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 100
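
For context on `lr_scheduler_type: linear` and `num_epochs: 100`: the results table logs 19 steps per epoch, so this run covers 1900 steps in total. The following is a minimal sketch of a linear decay over that span. It assumes no warmup, and `PEAK_LR` is a placeholder, since the actual learning rate falls outside this hunk and is not shown here. The helper `linear_lr` is a hypothetical name, not part of any library.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 19  # taken from the Step column of the results table
NUM_EPOCHS = 100      # taken from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

PEAK_LR = 1.0  # placeholder: with 1.0 the function returns the decay multiplier

if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(epoch, step, linear_lr(step, TOTAL_STEPS, PEAK_LR))
```

With `PEAK_LR = 1.0` the printed value is the fraction of the peak rate still in effect at each epoch boundary.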
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 1.0   | 19   | 1.9775          |
+| No log        | 2.0   | 38   | 2.0492          |
+| No log        | 3.0   | 57   | 2.0643          |
+| No log        | 4.0   | 76   | 2.0102          |
+| No log        | 5.0   | 95   | 1.9851          |
+| 0.0382        | 6.0   | 114  | 1.9408          |
+| 0.0382        | 7.0   | 133  | 2.0678          |
+| 0.0382        | 8.0   | 152  | 2.0022          |
+| 0.0382        | 9.0   | 171  | 2.0112          |
+| 0.0382        | 10.0  | 190  | 2.0664          |
+| 0.0406        | 11.0  | 209  | 2.0646          |
+| 0.0406        | 12.0  | 228  | 2.0033          |
+| 0.0406        | 13.0  | 247  | 2.0783          |
+| 0.0406        | 14.0  | 266  | 2.0601          |
+| 0.0406        | 15.0  | 285  | 2.0172          |
+| 0.0381        | 16.0  | 304  | 2.0543          |
+| 0.0381        | 17.0  | 323  | 2.1309          |
+| 0.0381        | 18.0  | 342  | 2.0796          |
+| 0.0381        | 19.0  | 361  | 2.0456          |
+| 0.0381        | 20.0  | 380  | 2.1159          |
+| 0.0381        | 21.0  | 399  | 2.0979          |
+| 0.0324        | 22.0  | 418  | 2.0862          |
+| 0.0324        | 23.0  | 437  | 2.1196          |
+| 0.0324        | 24.0  | 456  | 2.1273          |
+| 0.0324        | 25.0  | 475  | 2.1860          |
+| 0.0324        | 26.0  | 494  | 2.0690          |
+| 0.0252        | 27.0  | 513  | 2.1172          |
+| 0.0252        | 28.0  | 532  | 2.1453          |
+| 0.0252        | 29.0  | 551  | 2.0990          |
+| 0.0252        | 30.0  | 570  | 2.1250          |
+| 0.0252        | 31.0  | 589  | 2.1265          |
+| 0.0256        | 32.0  | 608  | 2.1649          |
+| 0.0256        | 33.0  | 627  | 2.1238          |
+| 0.0256        | 34.0  | 646  | 2.1757          |
+| 0.0256        | 35.0  | 665  | 2.1402          |
+| 0.0256        | 36.0  | 684  | 2.1569          |
+| 0.0239        | 37.0  | 703  | 2.1783          |
+| 0.0239        | 38.0  | 722  | 2.1934          |
+| 0.0239        | 39.0  | 741  | 2.1883          |
+| 0.0239        | 40.0  | 760  | 2.1831          |
+| 0.0239        | 41.0  | 779  | 2.1931          |
+| 0.0239        | 42.0  | 798  | 2.1697          |
+| 0.0222        | 43.0  | 817  | 2.1629          |
+| 0.0222        | 44.0  | 836  | 2.2133          |
+| 0.0222        | 45.0  | 855  | 2.1917          |
+| 0.0222        | 46.0  | 874  | 2.1547          |
+| 0.0222        | 47.0  | 893  | 2.1826          |
+| 0.0208        | 48.0  | 912  | 2.2125          |
+| 0.0208        | 49.0  | 931  | 2.2529          |
+| 0.0208        | 50.0  | 950  | 2.2400          |
+| 0.0208        | 51.0  | 969  | 2.2202          |
+| 0.0208        | 52.0  | 988  | 2.2158          |
+| 0.017         | 53.0  | 1007 | 2.2075          |
+| 0.017         | 54.0  | 1026 | 2.2320          |
+| 0.017         | 55.0  | 1045 | 2.2138          |
+| 0.017         | 56.0  | 1064 | 2.2015          |
+| 0.017         | 57.0  | 1083 | 2.2282          |
+| 0.0169        | 58.0  | 1102 | 2.2591          |
+| 0.0169        | 59.0  | 1121 | 2.2483          |
+| 0.0169        | 60.0  | 1140 | 2.2194          |
+| 0.0169        | 61.0  | 1159 | 2.2324          |
+| 0.0169        | 62.0  | 1178 | 2.2649          |
+| 0.0169        | 63.0  | 1197 | 2.2763          |
+| 0.0145        | 64.0  | 1216 | 2.2876          |
+| 0.0145        | 65.0  | 1235 | 2.2645          |
+| 0.0145        | 66.0  | 1254 | 2.2531          |
+| 0.0145        | 67.0  | 1273 | 2.2532          |
+| 0.0145        | 68.0  | 1292 | 2.2602          |
+| 0.0155        | 69.0  | 1311 | 2.2659          |
+| 0.0155        | 70.0  | 1330 | 2.2894          |
+| 0.0155        | 71.0  | 1349 | 2.2998          |
+| 0.0155        | 72.0  | 1368 | 2.2802          |
+| 0.0155        | 73.0  | 1387 | 2.2498          |
+| 0.0148        | 74.0  | 1406 | 2.2602          |
+| 0.0148        | 75.0  | 1425 | 2.2659          |
+| 0.0148        | 76.0  | 1444 | 2.2762          |
+| 0.0148        | 77.0  | 1463 | 2.2825          |
+| 0.0148        | 78.0  | 1482 | 2.2939          |
+| 0.014         | 79.0  | 1501 | 2.3115          |
+| 0.014         | 80.0  | 1520 | 2.3008          |
+| 0.014         | 81.0  | 1539 | 2.3054          |
+| 0.014         | 82.0  | 1558 | 2.2930          |
+| 0.014         | 83.0  | 1577 | 2.2872          |
+| 0.014         | 84.0  | 1596 | 2.2896          |
+| 0.014         | 85.0  | 1615 | 2.2853          |
+| 0.014         | 86.0  | 1634 | 2.2755          |
+| 0.014         | 87.0  | 1653 | 2.2781          |
+| 0.014         | 88.0  | 1672 | 2.2842          |
+| 0.014         | 89.0  | 1691 | 2.2788          |
+| 0.0129        | 90.0  | 1710 | 2.2850          |
+| 0.0129        | 91.0  | 1729 | 2.2908          |
+| 0.0129        | 92.0  | 1748 | 2.2964          |
+| 0.0129        | 93.0  | 1767 | 2.2966          |
+| 0.0129        | 94.0  | 1786 | 2.2975          |
+| 0.0121        | 95.0  | 1805 | 2.3019          |
+| 0.0121        | 96.0  | 1824 | 2.2990          |
+| 0.0121        | 97.0  | 1843 | 2.2985          |
+| 0.0121        | 98.0  | 1862 | 2.3000          |
+| 0.0121        | 99.0  | 1881 | 2.3014          |
+| 0.0121        | 100.0 | 1900 | 2.3017          |
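
In the table above, the lowest validation loss is 1.9408 at epoch 6 (step 114), while the final epoch reports 2.3017. A small sketch of how rows in this diff format could be parsed to find that checkpoint is given below. `ROWS` is only an excerpt of the table, and `parse_row` and `Row` are hypothetical helper names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    train_loss: Optional[float]  # None where the Trainer printed "No log"
    epoch: float
    step: int
    val_loss: float


# Excerpt of the added rows, copied from the diff (leading "+" included).
ROWS = """\
+| No log        | 1.0   | 19   | 1.9775          |
+| No log        | 5.0   | 95   | 1.9851          |
+| 0.0382        | 6.0   | 114  | 1.9408          |
+| 0.0382        | 7.0   | 133  | 2.0678          |
+| 0.0121        | 100.0 | 1900 | 2.3017          |
"""


def parse_row(line: str) -> Row:
    """Parse one markdown table row, ignoring a leading diff marker."""
    cells = [c.strip() for c in line.lstrip("+- ").strip("|").split("|")]
    train = None if cells[0] == "No log" else float(cells[0])
    return Row(train, float(cells[1]), int(cells[2]), float(cells[3]))


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = min(rows, key=lambda r: r.val_loss)
last = rows[-1]

if __name__ == "__main__":
    print(f"best: epoch {best.epoch:g}, step {best.step}, val loss {best.val_loss}")
    print(f"last: epoch {last.epoch:g}, step {last.step}, val loss {last.val_loss}")
```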
 ### Framework versions
 - Transformers 4.40.1
+- Pytorch 2.2.1+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6b1987779ebf7b2fa59e96fc33fb34ffada679f156d7dfcc566d6433648e0e7f
 size 497774208

 version https://git-lfs.github.com/spec/v1
+oid sha256:bfa70cc45a92e97e22c12dd9cb5d777857cdb53fb3bace769b0514cd3e96e4d3
 size 497774208
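
The `model.safetensors` change touches only a Git LFS pointer: the `oid` differs while `size` stays at 497774208 bytes. As a rough sketch, the three `key value` lines of such a pointer can be read as shown below. `parse_lfs_pointer` is a hypothetical helper written for this note, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+- ").strip()  # tolerate diff markers and indentation
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New side of the diff above, copied verbatim.
NEW_POINTER = """\
 version https://git-lfs.github.com/spec/v1
+oid sha256:bfa70cc45a92e97e22c12dd9cb5d777857cdb53fb3bace769b0514cd3e96e4d3
 size 497774208
"""

pointer = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print(pointer["hash_algo"], pointer["digest"][:12], pointer["size"])
```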