e-hossam96/arabic-nano-gpt-v1

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 8.3925
 ## Model description
@@ -41,19 +41,27 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 200
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 7.1603        | 6.6667  | 100  | 6.8927          |
-| 5.0605        | 13.3333 | 200  | 6.6411          |
-| 3.1601        | 20.0    | 300  | 7.0113          |
-| 1.6772        | 26.6667 | 400  | 7.4615          |
-| 0.7315        | 33.3333 | 500  | 7.8263          |
-| 0.3081        | 40.0    | 600  | 8.1310          |
-| 0.1618        | 46.6667 | 700  | 8.3925          |
 ### Framework versions

 This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 8.5767
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 100
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 6.6922        | 6.6667  | 100  | 6.8460          |
+| 4.4109        | 13.3333 | 200  | 6.6570          |
+| 2.7038        | 20.0    | 300  | 6.9939          |
+| 1.427         | 26.6667 | 400  | 7.3622          |
+| 0.6413        | 33.3333 | 500  | 7.6590          |
+| 0.2914        | 40.0    | 600  | 7.9181          |
+| 0.1481        | 46.6667 | 700  | 8.1451          |
+| 0.0928        | 53.3333 | 800  | 8.2888          |
+| 0.069         | 60.0    | 900  | 8.3402          |
+| 0.055         | 66.6667 | 1000 | 8.4368          |
+| 0.0478        | 73.3333 | 1100 | 8.4684          |
+| 0.0399        | 80.0    | 1200 | 8.5143          |
+| 0.0363        | 86.6667 | 1300 | 8.5179          |
+| 0.0329        | 93.3333 | 1400 | 8.5729          |
+| 0.0306        | 100.0   | 1500 | 8.5767          |
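
In the added rows, validation loss reaches its minimum early and then rises while training loss keeps falling, which is the usual sign of overfitting. The sketch below copies the values from the added rows and reads off the step with the lowest validation loss. The names `rows` and `best_checkpoint` are illustrative and not part of the repository, and the perplexity line assumes the reported loss is mean cross-entropy in nats.

```python
import math

# (training_loss, epoch, step, validation_loss), copied from the added table rows
rows = [
    (6.6922, 6.6667, 100, 6.8460),
    (4.4109, 13.3333, 200, 6.6570),
    (2.7038, 20.0, 300, 6.9939),
    (1.427, 26.6667, 400, 7.3622),
    (0.6413, 33.3333, 500, 7.6590),
    (0.2914, 40.0, 600, 7.9181),
    (0.1481, 46.6667, 700, 8.1451),
    (0.0928, 53.3333, 800, 8.2888),
    (0.069, 60.0, 900, 8.3402),
    (0.055, 66.6667, 1000, 8.4368),
    (0.0478, 73.3333, 1100, 8.4684),
    (0.0399, 80.0, 1200, 8.5143),
    (0.0363, 86.6667, 1300, 8.5179),
    (0.0329, 93.3333, 1400, 8.5729),
    (0.0306, 100.0, 1500, 8.5767),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


best = best_checkpoint(rows)
final = rows[-1]

# Steps per epoch, inferred from the first logged row (step / epoch).
steps_per_epoch = round(rows[0][2] / rows[0][1])

# Assumption: loss is mean cross-entropy in nats, so perplexity = exp(loss).
print(f"best step: {best[2]}, val loss {best[3]:.4f}, perplexity {math.exp(best[3]):.1f}")
print(f"final step: {final[2]}, val loss {final[3]:.4f}, val - train gap {final[3] - final[0]:.4f}")
print(f"steps per epoch: {steps_per_epoch}")
```

On these numbers the minimum falls at step 200, well before the final row whose loss the model card reports. A run that kept the best checkpoint or stopped early would end up with a different model than the one committed here.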
 ### Framework versions

config.json CHANGED

@@ -13,7 +13,7 @@
   "model_type": "gpt2",
   "n_ctx": 1024,
   "n_embd": 384,
-  "n_head": 4,
   "n_inner": null,
   "n_layer": 4,
   "n_positions": 1024,

   "model_type": "gpt2",
   "n_ctx": 1024,
   "n_embd": 384,
+  "n_head": 6,
   "n_inner": null,
   "n_layer": 4,
   "n_positions": 1024,
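
With `n_embd` fixed at 384, moving `n_head` from 4 to 6 changes the per-head dimension but not the number of weights. In the GPT-2 layout, the fused query/key/value projection and the output projection are sized by `n_embd` alone. That is consistent with `model.safetensors` keeping the same byte size in this commit. A small sketch of the arithmetic, with illustrative function names:

```python
def head_dim(n_embd: int, n_head: int) -> int:
    """Per-head dimension; GPT-2 requires n_embd to be divisible by n_head."""
    if n_embd % n_head != 0:
        raise ValueError(f"n_embd={n_embd} is not divisible by n_head={n_head}")
    return n_embd // n_head


def attention_params_per_layer(n_embd: int) -> int:
    """Weights and biases in one GPT-2 attention block; n_head does not appear."""
    qkv = n_embd * 3 * n_embd + 3 * n_embd  # fused query/key/value projection
    proj = n_embd * n_embd + n_embd         # output projection
    return qkv + proj


for n_head in (4, 6):
    print(n_head, head_dim(384, n_head), attention_params_per_layer(384))
```

The head dimension drops from 96 to 64 while the per-layer attention parameter count stays the same for both settings.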

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9d2bced74478bd6208d96eba4f4a348246ca50793ff900e7cf879ce24aeb08b1
 size 42555416

 version https://git-lfs.github.com/spec/v1
+oid sha256:1e5b6cbff43b9e7c3e4ecc8514c85e7bf1985855621676169c39a859167c7055
 size 42555416
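
The `model.safetensors` entry shown here is a Git LFS pointer rather than the weights themselves: a spec version line, a SHA-256 `oid`, and a byte `size`. As a hedged sketch, a downloaded file can be checked against such a pointer with the standard library alone. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the path passed to `matches_pointer` is wherever the file was saved locally.

```python
import hashlib
import os

# Pointer text for the new revision, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1e5b6cbff43b9e7c3e4ecc8514c85e7bf1985855621676169c39a859167c7055
size 42555416
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and digest against a parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Only the `oid` differs between the two revisions. The size is 42555416 bytes in both, so the size check alone cannot tell the old weights from the new ones.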

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:572d2abfa72a4961f1e79a7ecba18b8a16f28bcb14b013ba341522f5fabc276c
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:3136559c3247ff5f3a551db29995826b7e6becd712db85d5f973849645b828fb
 size 5240