Nbeau/GPT2-arithmetic-3digits

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints


Nbeau committed on May 16, 2024

Commit d272f9b · verified · 1 Parent(s): 884b5e7

End of training

Files changed (3)

README.md +8 -7
generation_config.json +1 -1
model.safetensors +1 -1

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0735
+- Loss: 0.1254
 ## Model description
@@ -42,18 +42,19 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant_with_warmup
-- num_epochs: 5
+- num_epochs: 6
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.2855        | 1.0   | 125  | 0.1888          |
-| 0.1262        | 2.0   | 250  | 0.1072          |
-| 0.0917        | 3.0   | 375  | 0.0873          |
-| 0.0815        | 4.0   | 500  | 0.0761          |
-| 0.0785        | 5.0   | 625  | 0.0735          |
+| 0.144         | 1.0   | 125  | 0.1199          |
+| 0.0896        | 2.0   | 250  | 0.0838          |
+| 0.0861        | 3.0   | 375  | 0.0792          |
+| 0.0781        | 4.0   | 500  | 0.0739          |
+| 0.0772        | 5.0   | 625  | 0.0753          |
+| 0.1549        | 6.0   | 750  | 0.1254          |
 ### Framework versions
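The updated table reports the final-epoch validation loss (0.1254) as the headline result, although the lowest validation loss in this run occurs at epoch 4 (0.0739). As a minimal sketch, the snippet below takes the rows copied from the new side of the diff and locates the best epoch. It also derives an upper bound on examples per epoch from 125 steps per epoch and `total_train_batch_size: 64`. The helper name `best_epoch` is hypothetical, and the bound assumes every step consumed a full batch, which the card does not state.

```python
# (epoch, step, validation_loss) rows copied from the updated README table.
rows = [
    (1, 125, 0.1199),
    (2, 250, 0.0838),
    (3, 375, 0.0792),
    (4, 500, 0.0739),
    (5, 625, 0.0753),
    (6, 750, 0.1254),
]


def best_epoch(rows):
    """Return the (epoch, step, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


best = best_epoch(rows)
final = rows[-1]

# Steps per epoch is the step count logged at the end of epoch 1.
steps_per_epoch = rows[0][1]
total_train_batch_size = 64  # from the hyperparameter list in the card

# Upper bound only: assumes no partial final batch (not stated in the card).
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

print("best row:", best)
print("final row:", final)
print("max examples per epoch:", max_examples_per_epoch)
```

Whether the epoch-4 weights were kept is not visible in this commit; the diff only shows that `model.safetensors` changed.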

generation_config.json CHANGED

@@ -2,6 +2,6 @@
   "_from_model_config": true,
   "bos_token_id": 1,
   "eos_token_id": 2,
-  "pad_token_id": 2,
+  "pad_token_id": 32000,
   "transformers_version": "4.40.2"
 }
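This change moves `pad_token_id` from 2 (the same value as `eos_token_id`) to 32000. A padding id is only usable if it falls inside the model's embedding table, and the vocabulary size is not shown in this diff. The sketch below checks the special-token ids in a generation config against a vocabulary size supplied by the caller. `check_special_tokens` is a hypothetical helper, and `ASSUMED_VOCAB_SIZE = 32001` is an illustrative assumption, not a value taken from the repository.

```python
import json

# New-side contents of generation_config.json, copied from the diff.
NEW_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 32000,
  "transformers_version": "4.40.2"
}
"""


def check_special_tokens(config, vocab_size):
    """Return a list of notes about special-token ids (empty if none)."""
    notes = []
    for key in ("bos_token_id", "eos_token_id", "pad_token_id"):
        token_id = config.get(key)
        if token_id is None:
            notes.append(f"{key} is not set")
        elif not 0 <= token_id < vocab_size:
            notes.append(f"{key}={token_id} is outside [0, {vocab_size})")
    pad, eos = config.get("pad_token_id"), config.get("eos_token_id")
    # Sharing pad and eos is common practice, so this is a note, not an error.
    if pad is not None and pad == eos:
        notes.append("pad_token_id equals eos_token_id")
    return notes


new_config = json.loads(NEW_CONFIG)
old_config = dict(new_config, pad_token_id=2)  # old side of the diff

ASSUMED_VOCAB_SIZE = 32001  # assumption for illustration only

print(check_special_tokens(old_config, ASSUMED_VOCAB_SIZE))
print(check_special_tokens(new_config, ASSUMED_VOCAB_SIZE))
```

In practice the vocabulary size would come from the model's own `config.json` or tokenizer rather than a hard-coded constant.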

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:559704e9da8d819c84387e547a1e42ee323924e58074bb0600c77ae442cec227
+oid sha256:993012c6bc1b09893ac6ca245a6af4483a906120bdfa53fdab18123ab4c529cc
 size 441691776
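The `model.safetensors` entry is a Git LFS pointer: the byte size is unchanged while the SHA-256 digest differs, which is consistent with the same tensor shapes holding different weights. As a minimal sketch, a downloaded copy could be checked against the pointer with the standard library alone. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib
import os

# New-side pointer text, copied from the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:993012c6bc1b09893ac6ca245a6af4483a906120bdfa53fdab18123ab4c529cc
size 441691776
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then report whether the file corresponds to this commit.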